CN115550944B - Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles - Google Patents
Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles Download PDFInfo
- Publication number
- CN115550944B CN115550944B CN202210992657.5A CN202210992657A CN115550944B CN 115550944 B CN115550944 B CN 115550944B CN 202210992657 A CN202210992657 A CN 202210992657A CN 115550944 B CN115550944 B CN 115550944B
- Authority
- CN
- China
- Prior art keywords
- service
- network
- edge
- edge server
- vehicles
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 49
- 230000002787 reinforcement Effects 0.000 title claims abstract description 24
- 230000006870 function Effects 0.000 claims abstract description 54
- 230000009471 action Effects 0.000 claims abstract description 31
- 238000012549 training Methods 0.000 claims abstract description 11
- 238000011156 evaluation Methods 0.000 claims abstract description 10
- 238000004891 communication Methods 0.000 claims description 10
- 238000013507 mapping Methods 0.000 claims description 9
- 230000008569 process Effects 0.000 claims description 9
- 230000006855 networking Effects 0.000 claims description 6
- 230000001934 delay Effects 0.000 abstract description 2
- 230000004075 alteration Effects 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 230000004044 response Effects 0.000 description 2
- 238000006467 substitution reaction Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 238000010295 mobile communication Methods 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W16/00—Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
- H04W16/18—Network planning tools
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16Y—INFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR THE INTERNET OF THINGS [IoT]
- G16Y20/00—Information sensed or collected by the things
- G16Y20/10—Information sensed or collected by the things relating to the environment, e.g. temperature; relating to location
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16Y—INFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR THE INTERNET OF THINGS [IoT]
- G16Y20/00—Information sensed or collected by the things
- G16Y20/30—Information sensed or collected by the things relating to resources, e.g. consumed power
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16Y—INFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR THE INTERNET OF THINGS [IoT]
- G16Y30/00—IoT infrastructure
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W16/00—Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
- H04W16/22—Traffic simulation tools or models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/502—Proximity
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Environmental & Geological Engineering (AREA)
- Toxicology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles, which comprises the following steps: 1) Establishing a network and service request model, and acquiring information related to the network and service request; 2) Establishing a network and service request calculation model; 3) Constructing a state space, an action space, a strategy function and a reward function; 4) Constructing an actor network and a criticism network, and training the actor network and the criticism network; 5) Generating a service placement strategy by the actor network and inputting the strategy into the criticizing home network; 6) And (3) the criticizing home network evaluates the strategy quality of the service placement strategy, if the evaluation is not passed, updating actor network parameters, returning to the step (5), and if the evaluation is passed, outputting the service placement strategy. The present invention minimizes maximum edge resource usage and service delays while taking into account vehicle mobility, changing demands, and dynamics of different types of service requests.
Description
Technical Field
The invention relates to the field of Internet of vehicles, in particular to a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles.
Background
The internet of vehicles is an interactive network composed of information such as vehicle location, speed, and route. The rapid development of communication technology has brought many new possibilities to the current field of internet of vehicles. The internet of vehicles becomes more intelligent due to the occurrence of the fifth generation mobile communication technology, and the service coverage range is further enlarged. However, as delay-sensitive applications such as intelligent voice assistants and autopilot become the most popular applications at present in the field of internet of vehicles, the traditional cloud computing paradigm is gradually unable to meet the needs of users. The European telecom standards institute introduces mobile edge computing into the field of Internet of vehicles, expands storage resources and computing resources of cloud computing, enables the cloud computing to be closer to users, and meets requirements of the users on high reliability, low delay, safety and the like of intelligent application.
In the internet of vehicles, vehicles communicate with infrastructure to obtain services such as media download, collaboration messages, decentralized environment notification messages, etc., to achieve coordination in applications such as remote driving, parking space discovery, navigation, etc. In the edge computing paradigm, multiple services can be deployed on an edge server, making full use of computing resources and storage resources. Service placement is one of the research hotspots in the field of internet of vehicles. In particular, service placement is the mapping of services to edge servers in the Internet of vehicles to meet the requirements of the requested service while efficiently using edge resources. From the user's perspective, it is important to minimize delays in vehicle perceived services. From the service provider's point of view, it is desirable to maximize edge resource usage while maintaining as much as possible resource load balancing between servers.
Disclosure of Invention
The invention aims to provide a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles, which comprises the following steps:
1) Establishing a network and service request model, and acquiring information related to the network and service request;
the information related to the network and the service request comprises edge server information, vehicle information and service information;
the edge server information comprises an edge server set E, an edge server E and the residual resource capacity C of the edge server E e ;
The vehicle information includes a vehicle collection V.
The service information includes a service set S, a number lambda of vehicles requesting the service S s One service instance at a time (such as media file download in a car networking environment, collaboration awareness messages and environment notification services, etc.), or the number epsilon of vehicles that can provide parallel connections, the specified time t and vehicle location loc in the service request message, the amount of resources R consumed by the edge server deployment service s s Time delay requirement threshold D s 。
2) Establishing a network and service request calculation model;
the network and service request calculation model comprises a total service delay calculation model and an edge resource utilization calculation model;
the total service delay calculation model is as follows:
in the method, in the process of the invention,is the total service delay;Propagation delay and queuing delay; dist (v, s) is the euclidean distance between the vehicle v and the edge server deployed by service s; c is the propagation velocity of the signal through the communication medium;
when requesting the number lambda of vehicles of the service s s Queuing delay when epsilon is less than or equal to epsilonWhen requesting the number lambda of vehicles of the service s s At ∈ ->Satisfies the following formula:
wherein the number difference lambda 'is' s =λ s -ε;
Propagation delayThe following is shown:
wherein dist (v, s) is the Euclidean distance between the vehicle v and the edge server deployed by the service s; c is the propagation velocity of the signal through the communication medium.
The edge resource usage calculation model is as follows:
edge resource utilizationIs the ratio between the resources consumed by the service instance and the available resources of the edge server, as follows:
in the parameters ofC e The remaining resource capacity of the edge server e;The edge resource utilization rate is used; r is R s The amount of resources consumed by the service s for the edge server deployment.
3) Constructing a state space, an action space, a strategy function and a reward function;
the state space is characterized by a state space set ω, namely:
ω={[v 1 ,loc 1 ,s],[v 2 ,loc 2 ,s],...,[v n ,loc n ,s]} t (6)
wherein S is S; v 1 ,v 2 ,...,v n Aggregate for a group of vehicles; loc 1 ,loc 2 ,...,loc n For the set of vehicle locations at t for the request service s.
The action space is used for describing actions taken when placing services on the edge server;
wherein the action a taken at a given time t is as follows:
where pi is a policy function required to generate an action in the observation set of time unit t to ω;representing that the service s is deployed at the edge server e;Indicating that service s is not deployed at edge server e.
The strategy function pi is a function executed by the actor network and is used for mapping a state space to an action space, namely pi, omega and a;
the goal of the policy function pi is to minimize the maximum edge resource usage and service latency, and to control the relative importance of resource usage and service latency by using the parameter β. The policy function pi is expressed as
Wherein, beta is a weight coefficient;
the constraints of the policy function pi include mapping constraintsDelay constraint->Resource constraints
The reward function is as follows:
in the method, in the process of the invention,for instant rewards. Gamma is the prize coefficient.The service time delay at the time t;
4) Constructing an actor network and a criticism network, and training the actor network and the criticism network;
loss function in criticism network training processThe following is shown:
wherein θ is a criticizing home network parameter;is a target value for evaluating policy quality; q (Q) i (ω, a; θ) is the policy quality of the service placement policy;Is the number of available resource units in the edge server;
5) Generating a service placement strategy by the actor network and inputting the strategy into the criticizing home network;
6) And (3) the criticizing home network evaluates the strategy quality of the service placement strategy, if the evaluation is not passed, updating actor network parameters, returning to the step (5), and if the evaluation is passed, outputting the service placement strategy.
The method for evaluating the policy quality of the service placement policy by the criticizing home network comprises the following steps: judging criticism home network loss functionWhether the convergence is carried out, if so, the passing is evaluated, otherwise, the passing is evaluated.
It is worth noting that the invention proposes a three-layer internet of vehicles architecture based on edge computing, and considers the dynamic service placement problem, with the optimization objective being to minimize maximum edge resource usage (from the service provider's perspective) and service delay (from the user's perspective).
In addition, the invention provides a service placement framework based on deep reinforcement learning, which consists of a strategy function (actor network) and a cost function (criticism network). The actor network makes service placement policies, and the critics network evaluates the decision making performance by the actor network based on the delay observed by the vehicle.
The technical effect of the invention is undoubted. The invention provides a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles, which provides a dynamic service placement framework based on deep reinforcement learning in the Internet of vehicles, and aims to minimize the maximum edge resource use and service delay while considering the mobility of vehicles, the changing requirements and the dynamic nature of different types of service requests.
Drawings
FIG. 1 is a three-layer vehicle networking rack based on edge calculation;
FIG. 2 is a diagram of a smart structure;
fig. 3 is a flow chart of the present invention.
Detailed Description
The present invention is further described below with reference to examples, but it should not be construed that the scope of the above subject matter of the present invention is limited to the following examples. Various substitutions and alterations are made according to the ordinary skill and familiar means of the art without departing from the technical spirit of the invention, and all such substitutions and alterations are intended to be included in the scope of the invention.
Example 1:
referring to fig. 1 to 3, a dynamic service placement method based on edge calculation and deep reinforcement learning in the internet of vehicles includes the following steps:
1) Establishing a network and service request model, and acquiring information related to the network and service request;
the information related to the network and the service request comprises edge server information, vehicle information and service information;
the edge server information comprises an edge server set E, an edge server E and the residual resource capacity C of the edge server E e ;
The vehicle information includes a vehicle collection V.
The service information includes a service set S, a number lambda of vehicles requesting the service S s One service instance at a time (such as media file download in a car networking environment, collaboration awareness messages and environment notification services, etc.), or the number epsilon of vehicles that can provide parallel connections, the specified time t and vehicle location loc in the service request message, the amount of resources R consumed by the edge server deployment service s s Time delay requirement threshold D s 。
2) Establishing a network and service request calculation model;
the network and service request calculation model comprises a total service delay calculation model and an edge resource utilization calculation model;
the total service delay calculation model is as follows:
in the method, in the process of the invention,is the total service delay;Propagation delay and queuing delay; dist (v, s) is the euclidean distance between the vehicle v and the edge server deployed by service s; c is the propagation velocity of the signal through the communication medium;
when requesting the number lambda of vehicles of the service s s Queuing delay when epsilon is less than or equal to epsilonWhen requesting the number lambda of vehicles of the service s s At ∈ ->Satisfies the following formula:
wherein the number difference lambda 'is' s =λ s -ε;
Propagation delayThe following is shown:
wherein dist (v, s) is the Euclidean distance between the vehicle v and the edge server deployed by the service s; c is the propagation velocity of the signal through the communication medium.
The edge resource usage calculation model is as follows:
edge resource utilizationIs the ratio between the resources consumed by the service instance and the available resources of the edge server, as follows:
in the parameters ofC e The remaining resource capacity of the edge server e;The edge resource utilization rate is used; r is R s The amount of resources consumed by the service s for the edge server deployment.
3) Constructing a state space, an action space, a strategy function and a reward function;
the state space is characterized by a state space set ω, namely:
ω={[v 1 ,loc 1 ,s],[v 2 ,loc 2 ,s],...,[v n ,loc n ,s]} t (6)
wherein S is S; v 1 ,v 2 ,...,v n Aggregate for a group of vehicles; loc 1 ,loc 2 ,...,loc n For the set of vehicle locations at t for the request service s.
The action space is used for describing actions taken when placing services on the edge server;
wherein the action a taken at a given time t is as follows:
where pi is a policy function required to generate an action in the observation set of time unit t to ω;representing that the service s is deployed at the edge server e;Indicating that service s is not deployed at edge server e.
The strategy function pi is a function executed by the actor network and is used for mapping a state space to an action space, namely pi, omega and a;
the goal of the policy function pi is to minimize the maximum edge resource usage and service latency, and to control the relative importance of resource usage and service latency by using the parameter β. The policy function pi is expressed as
Wherein, beta is a weight coefficient;
the principle of the policy function pi is: and iterating the service set and the edge server set through the subscript s, searching the maximum edge resource use and service delay, and minimizing the service delay to obtain a corresponding strategy function pi.
Constraint package of the policy function piBracketing map constraintsDelay constraint->Resource constraints
The reward function is as follows:
in the method, in the process of the invention,for instant rewards. Gamma is the prize coefficient.The service time delay at the time t;
4) Constructing an actor network and a criticism network, and training the actor network and the criticism network;
loss function in criticism network training processThe following is shown:
wherein θ is a criticizing home network parameter;is a target value for evaluating policy quality; q (Q) i (ω, a; θ) is the policy quality of the service placement policy;Is the number of available resource units in the edge server;
5) Generating a service placement strategy by the actor network and inputting the strategy into the criticizing home network;
6) And (3) the criticizing home network evaluates the strategy quality of the service placement strategy, if the evaluation is not passed, updating actor network parameters, returning to the step (5), and if the evaluation is passed, outputting the service placement strategy.
The method for evaluating the policy quality of the service placement policy by the criticizing home network comprises the following steps: judging criticism home network loss functionWhether the convergence is carried out, if so, the passing is evaluated, otherwise, the passing is evaluated.
Example 2:
a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles comprises the following steps:
1) And establishing a network and service request model, and acquiring the information of the edge server, the vehicle information and the service information.
The server information, the vehicle information and the service information comprise an edge server set E, an edge server E and the residual resource capacity C of the edge server E e Vehicle set V and service set S, vehicle number λ requesting service S s One service instance (such as media file download in a car networking environment, collaboration awareness information, environment notification service, etc.) can be processed at a time, or the number epsilon of vehicles connected in parallel can be provided, the time t and the vehicle position loc are specified in the service request information, and the resource quantity R consumed by the edge server deployment service s is calculated s Time delay requirement threshold D s 。
2) And (5) establishing a calculation model.
2.1 Total service delay modeling. The entire edge internet of vehicles system is modeled as an M/D/1 queue. Wherein when requesting service s from edge server e, the total service delay of the vehicleRefers to the total time from the vehicle sending a service request to the edge server receiving the corresponding response. Total service delay->By propagation delay->And queuing delay->The composition is as follows:
if lambda is s Epsilon and queuing delayIs 0. If lambda is s A queue is created and the average queuing delay for service s on the edge server will be as follows:
wherein lambda' s =λ s Epsilon and the average propagation delay is calculated as the ratio of distance to propagation speed on the medium as follows:
where dist (v, s) is the euclidean distance between the vehicle v and the edge server deployed by service s, and c is the propagation speed of the signal through the communication medium. Thus, the total service delay is as follows:
2.2 Edge resource usage modeling. Edge resource utilizationIs the ratio between the resources consumed by the service instance and the available resources of the edge server, as follows:
wherein,
3) The state space is designed. At a given time t, the set of state spaces describes the network environment. The agent observes the environment to form a set of state spaces ω from the service request model as follows:
ω={[v 1 ,loc 1 ,s],[v 2 ,loc 2 ,s],...,[v n ,loc n ,s]} t 。 (6)
wherein S is S, v 1 ,v 2 ,...,v n To a group of vehicles, loc 1 ,loc 2 ,...,loc n For the set of vehicle locations at t for the request service s.
4) And (3) a designed action space. The action space describes the actions taken by the policy module when placing a service on an edge server, as follows at a given time t:
where pi is the policy function required to generate the action in the observation set of time units t versus ω, a binary variableA matrix indicating the location of the service s on the edge server e is given, +.>Representing that service s is deployed at edge server e. On the contrary, the method comprises the steps of,indicating that service s is not deployed at edge server e.
5) And designing a strategy function. The policy function pi is a function performed by the actor network to map the state space to the action space pi omega-a. The goal of the policy function pi is to minimize the maximum edge resource usage and service latency, and to control the relative importance of resource usage and service latency by using the parameter β. The policy function pi is expressed as
The policy function is also subject to mapping constraints,delay constraint (S)>The resource constraints are set by the user and,
6) The bonus function is designed. At each time unit t, the system receives an instant prize from the environment in response to actions taken by the actor network of the agentThe following is shown:
7) The criticizing home network is constructed in charge of evaluating the quality Q (ω, a) of decisions made by the actor network. Inputting the state and movementTraining criticizing network for making and rewarding, and the criticizing network updates its parameter θ to minimize the loss functionThe following is shown:
wherein y is t Is the target value. The replay memory M is further used for storing the experience of the training criticizing home network. The criticizing home network obtains experience after a random time exists in replay and optimizes network parameters to obtain better performance.
8) After the training of the actor network and the criticizing home network is converged through the steps, the actor network can find the optimal placement strategy of the service while considering the mobility and the dynamics of the vehicle in different types of service requests. The criticizing home network may evaluate the policy quality of the actor network by a value function.
Example 3:
a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles comprises the following steps:
1) And establishing the network and service request model, and acquiring information related to the network and service request.
2) Establishing a network and service request calculation model;
3) Constructing a state space, an action space, a strategy function and a reward function;
4) Constructing an actor network and a criticism network, and training the actor network and the criticism network;
5) Generating a service placement strategy by the actor network and inputting the strategy into the criticizing home network;
6) And (3) the criticizing home network evaluates the strategy quality of the service placement strategy, if the evaluation is not passed, updating actor network parameters, returning to the step (5), and if the evaluation is passed, outputting the service placement strategy.
Example 4:
the main content of the dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles is shown in the embodiment 3, wherein the information related to the network and the service request comprises edge server information, vehicle information and service information;
the edge server information comprises an edge server set E, an edge server E and the residual resource capacity C of the edge server E e ;
The vehicle information includes a vehicle collection V.
The service information includes a service set S, a number lambda of vehicles requesting the service S s The number of vehicles epsilon that can handle one service instance at a time or can provide parallel connections, the specified time t and vehicle location loc in the service request message, the amount of resources R consumed by the edge server deployment service s s Time delay requirement threshold D s The method comprises the steps of carrying out a first treatment on the surface of the The service examples include media file downloads in a car networking environment, collaboration awareness messages, and environment notification services.
Example 5:
the main content of the dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles is shown in the embodiment 3, wherein the network and service request calculation model comprises a total service time delay calculation model and an edge resource utilization rate calculation model;
the total service delay calculation model is as follows:
in the method, in the process of the invention,is the total service delay;Propagation delay and queuing delay; dist (v, s) is the euclidean distance between the vehicle v and the edge server deployed by service s; c is the propagation velocity of the signal through the communication medium;
when requesting the number lambda of vehicles of the service s s Queuing delay when epsilon is less than or equal to epsilonWhen requesting the number lambda of vehicles of the service s s At ∈ ->Satisfies the following formula:
wherein the number difference lambda 'is' s =λ s -ε;
Propagation delayThe following is shown:
wherein dist (v, s) is the Euclidean distance between the vehicle v and the edge server deployed by the service s; c is the propagation velocity of the signal through the communication medium.
The edge resource usage calculation model is as follows:
edge resource utilizationIs the ratio between the resources consumed by the service instance and the available resources of the edge server, as follows:
in the parameters ofC e Is an edge servere remaining resource capacity;The edge resource utilization rate is used; r is R s The amount of resources consumed by the service s for the edge server deployment.
Example 6:
the main content of the dynamic service placement method based on edge calculation and deep reinforcement learning in the internet of vehicles is shown in embodiment 3, wherein the state space is characterized by a state space set ω, namely:
ω={[v 1 ,loc 1 ,s],[v 2 ,loc 2 ,s],...,[v n ,loc n ,s]} t (6)
wherein S is S; v 1 ,v 2 ,...,v n Aggregate for a group of vehicles; loc 1 ,loc 2 ,...,loc n For the set of vehicle locations at t for the request service s.
Example 7:
the main content of the dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles is shown in the embodiment 3, wherein the action space is used for describing actions taken when placing services on an edge server;
wherein the action a taken at a given time t is as follows:
where pi is a policy function required to generate an action in the observation set of time unit t to ω;representing that the service s is deployed at the edge server e;Indicating that service s is not deployed at edge server e.
Example 8:
the main content of the dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles is shown in the embodiment 3, wherein the strategy function pi is a function executed by an actor network and is used for mapping a state space to an action space, namely pi:omega-a;
the goal of the policy function pi is to minimize the maximum edge resource usage and service delay, and to control the relative importance of resource usage and service delay by using parameter β;
the policy function pi is expressed as follows:
wherein, beta is a weight coefficient.
The constraints of the policy function pi include mapping constraintsDelay constraint->Resource constraints
Example 9:
the main content of the dynamic service placement method based on edge calculation and deep reinforcement learning in the internet of vehicles is shown in the embodiment 3, wherein the reward function is as follows:
in the method, in the process of the invention,for instant rewards. Gamma is the prize coefficient.The service time delay at the time t;
example 10:
a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles is disclosed in embodiment 3, wherein the main content is the loss function in the criticizing home network training processThe following is shown:
wherein θ is a criticizing home network parameter;is a target value for evaluating policy quality; q (Q) i (ω, a; θ) is the policy quality of the service placement policy;Is the number of available resource units in the edge server;
example 11:
the main content of the dynamic service placement method based on edge calculation and deep reinforcement learning in the internet of vehicles is as shown in embodiment 3, wherein the method for evaluating the policy quality of the service placement policy by the criticizing home network comprises the following steps: judging criticism home network loss functionWhether the convergence is carried out, if so, the passing is evaluated, otherwise, the passing is evaluated. />
Claims (5)
1. A dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles is characterized by comprising the following steps:
1) Establishing a network and service request model, and acquiring information related to the network and service request;
2) Establishing a network and service request calculation model;
3) Constructing a state space, an action space, a strategy function and a reward function;
4) Constructing an actor network and a criticism network, and training the actor network and the criticism network;
5) Generating a service placement strategy by the actor network and inputting the strategy into the criticizing home network;
6) The criticizing home network evaluates the strategy quality of the service placement strategy, if the evaluation is not passed, the actor network parameters are updated, and the step 5) is returned, if the evaluation is passed, the service placement strategy is output;
the network and service request calculation model comprises a total service delay calculation model and an edge resource utilization calculation model;
the total service delay calculation model is as follows:
in the method, in the process of the invention,is the total service delay;Propagation delay and queuing delay; dist (v, s) is the euclidean distance between the vehicle v and the edge server deployed by service s; c is the propagation velocity of the signal through the communication medium;
when requesting the number lambda of vehicles of the service s s Queuing delay when epsilon is less than or equal to epsilonWhen requesting the number lambda of vehicles of the service s s At ∈ ->Satisfies the following formula:
wherein the number difference lambda 'is' s =λ s -ε;
Propagation delayThe following is shown:
wherein dist (v, s) is the Euclidean distance between the vehicle v and the edge server deployed by the service s; c is the propagation velocity of the signal through the communication medium;
the edge resource usage calculation model is as follows:
edge resource utilizationIs the ratio between the resources consumed by the service instance and the available resources of the edge server, as follows:
in the parameters ofC e The remaining resource capacity of the edge server e;The edge resource utilization rate is used; r is R s The amount of resources consumed by the deployment of the service s for the edge server;
the state space is characterized by a state space set ω, namely:
ω={[v 1 ,loc 1 ,s],[v 2 ,loc 2 ,s],…,[v n ,loc n ,s]} t (6)
wherein S is S; v 1 ,v 2 ,...,v n Aggregate for a group of vehicles; loc 1 ,loc 2 ,...,loc n A set of vehicle locations for the request service s at t;
the action space is used for describing actions taken when placing services on the edge server;
wherein the action a taken at a given time t is as follows:
where pi is a policy function required to generate an action in the observation set of time unit t to ω;representing that the service s is deployed at the edge server e;Indicating that service s is not deployed at edge server e;
the strategy function pi is a function executed by the actor network and is used for mapping a state space to an action space, namely pi, omega and a;
the goal of the policy function pi is to minimize the maximum edge resource usage and service delay, and to control the relative importance of resource usage and service delay by using parameter β;
the policy function pi is expressed as follows:
wherein, beta is a weight coefficient;
the constraints of the policy function pi include mapping constraintsDelay constraint->Resource constraints
2. The dynamic service placement method based on edge calculation and deep reinforcement learning in the internet of vehicles according to claim 1, wherein the network and service request related information includes edge server information, vehicle information, service information;
the edge server information comprises an edge server set E, an edge server E and the residual resource capacity C of the edge server E e ;
The vehicle information includes a vehicle set V;
the service information includes a service set S, a number lambda of vehicles requesting the service S s The number of vehicles epsilon that can handle one service instance at a time or can provide parallel connections, the specified time t and vehicle location loc in the service request message, the amount of resources R consumed by the edge server deployment service s s Time delay requirement threshold D s The method comprises the steps of carrying out a first treatment on the surface of the The service examples include media file downloads in a car networking environment, collaboration awareness messages, and environment notification services.
3. The method for dynamic service placement based on edge computation and deep reinforcement learning in the internet of vehicles according to claim 1, wherein the reward function is as follows:
in the method, in the process of the invention,is an instant rewards; gamma is the reward coefficient;And the service delay at the time t.
4. The dynamic service placement method based on edge calculation and deep reinforcement learning in the internet of vehicles according to claim 1, wherein the criticizing home network training process has a loss functionThe following is shown:
wherein θ is a criticizing home network parameter;is a target value for evaluating policy quality; q (Q) i (ω, a; θ) is the policy quality of the service placement policy;Is the number of available resource units in the edge server.
5. The method for dynamic service placement based on edge computation and deep reinforcement learning in internet of vehicles according to claim 4, wherein the method for criticizing the network to evaluate the policy quality of the service placement policy comprises: judging criticism home network loss functionWhether the convergence is carried out, if so, the passing is evaluated, otherwise, the passing is evaluated.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210992657.5A CN115550944B (en) | 2022-08-18 | 2022-08-18 | Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210992657.5A CN115550944B (en) | 2022-08-18 | 2022-08-18 | Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115550944A CN115550944A (en) | 2022-12-30 |
CN115550944B true CN115550944B (en) | 2024-02-27 |
Family
ID=84725291
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210992657.5A Active CN115550944B (en) | 2022-08-18 | 2022-08-18 | Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115550944B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118502967A (en) * | 2024-07-17 | 2024-08-16 | 北京师范大学珠海校区 | Delay-aware container scheduling method, system and terminal for cluster upgrading |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110213796A (en) * | 2019-05-28 | 2019-09-06 | 大连理工大学 | A kind of intelligent resource allocation methods in car networking |
CN113382383A (en) * | 2021-06-11 | 2021-09-10 | 浙江工业大学 | Method for unloading calculation tasks of public transport vehicle based on strategy gradient |
WO2021237996A1 (en) * | 2020-05-26 | 2021-12-02 | 多伦科技股份有限公司 | Fuzzy c-means-based adaptive energy consumption optimization vehicle clustering method |
CN114528042A (en) * | 2022-01-30 | 2022-05-24 | 南京信息工程大学 | Energy-saving automatic interconnected vehicle service unloading method based on deep reinforcement learning |
CN114625504A (en) * | 2022-03-09 | 2022-06-14 | 天津理工大学 | Internet of vehicles edge computing service migration method based on deep reinforcement learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US12045738B2 (en) * | 2020-12-23 | 2024-07-23 | Intel Corporation | Transportation operator collaboration system |
-
2022
- 2022-08-18 CN CN202210992657.5A patent/CN115550944B/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110213796A (en) * | 2019-05-28 | 2019-09-06 | 大连理工大学 | A kind of intelligent resource allocation methods in car networking |
WO2021237996A1 (en) * | 2020-05-26 | 2021-12-02 | 多伦科技股份有限公司 | Fuzzy c-means-based adaptive energy consumption optimization vehicle clustering method |
CN113382383A (en) * | 2021-06-11 | 2021-09-10 | 浙江工业大学 | Method for unloading calculation tasks of public transport vehicle based on strategy gradient |
CN114528042A (en) * | 2022-01-30 | 2022-05-24 | 南京信息工程大学 | Energy-saving automatic interconnected vehicle service unloading method based on deep reinforcement learning |
CN114625504A (en) * | 2022-03-09 | 2022-06-14 | 天津理工大学 | Internet of vehicles edge computing service migration method based on deep reinforcement learning |
Non-Patent Citations (4)
Title |
---|
Offloading Strategy for Vehicles in the Architecture of Vehicle-MEC-Cloud;Dasong Zhuang;《2022 IEEE/CIC International Conference on Communications in China (ICCC Workshops)》;20220811;全文 * |
Task Offloading for End-Edge-Cloud Orchestrated Computing in Mobile Networks;Xiuhua Li;《2020 IEEE Wireless Communications and Networking Conference (WCNC)》;20200525;全文 * |
一种车载服务的快速深度Q学习网络边云迁移策略;彭军;王成龙;蒋富;顾欣;牟玥玥,;刘伟荣;;电子与信息学报;20200115(01);全文 * |
车联网中一种基于软件定义网络与移动边缘计算的卸载策略;张海波;荆昆仑;刘开健;贺晓帆;;电子与信息学报;20200315(03);全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN115550944A (en) | 2022-12-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109756378B (en) | Intelligent computing unloading method under vehicle-mounted network | |
Zhang et al. | Mobile-edge computing for vehicular networks: A promising network paradigm with predictive off-loading | |
CN110312231A (en) | Content caching decision and resource allocation joint optimization method based on mobile edge calculations in a kind of car networking | |
CN111262940B (en) | Vehicle-mounted edge computing application caching method, device and system | |
CN113543074B (en) | Joint computing migration and resource allocation method based on vehicle-road cloud cooperation | |
CN112995950B (en) | Resource joint allocation method based on deep reinforcement learning in Internet of vehicles | |
CN114143346B (en) | Joint optimization method and system for task unloading and service caching of Internet of vehicles | |
CN103763334B (en) | Multimedia cooperative sharing method based on P2P-BT in VANET | |
CN111400001A (en) | Online computing task unloading scheduling method facing edge computing environment | |
CN112395090B (en) | Intelligent hybrid optimization method for service placement in mobile edge calculation | |
CN115550944B (en) | Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles | |
CN114979145B (en) | Content distribution method integrating sensing, communication and caching in Internet of vehicles | |
CN113507503B (en) | Internet of vehicles resource allocation method with load balancing function | |
Xu et al. | Socially driven joint optimization of communication, caching, and computing resources in vehicular networks | |
CN115941790A (en) | Edge collaborative content caching method, device, equipment and storage medium | |
CN108332767A (en) | A kind of electricity sharing method and relevant device | |
CN115052262A (en) | Potential game-based vehicle networking computing unloading and power optimization method | |
CN113873534A (en) | Block chain assisted federal learning active content caching method in fog calculation | |
Wei et al. | OCVC: An overlapping-enabled cooperative vehicular fog computing protocol | |
CN113411826A (en) | Edge network equipment caching method based on attention mechanism reinforcement learning | |
CN116566838A (en) | Internet of vehicles task unloading and content caching method with cooperative blockchain and edge calculation | |
CN116489668A (en) | Edge computing task unloading method based on high-altitude communication platform assistance | |
CN113573365B (en) | Internet of vehicles edge caching method based on Markov transition probability | |
CN114928826A (en) | Two-stage optimization method, controller and decision method for software-defined vehicle-mounted task unloading and resource allocation | |
Lee et al. | A study of mobile edge computing system architecture for connected car media services on highway |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |