CN115550944B - Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles - Google Patents

Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles Download PDF

Info

Publication number
CN115550944B
CN115550944B CN202210992657.5A CN202210992657A CN115550944B CN 115550944 B CN115550944 B CN 115550944B CN 202210992657 A CN202210992657 A CN 202210992657A CN 115550944 B CN115550944 B CN 115550944B
Authority
CN
China
Prior art keywords
service
network
edge
edge server
vehicles
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210992657.5A
Other languages
Chinese (zh)
Other versions
CN115550944A (en
Inventor
李秀华
李辉
孙川
徐峥辉
郝金隆
蔡春茂
范琪琳
杨正益
文俊浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN202210992657.5A priority Critical patent/CN115550944B/en
Publication of CN115550944A publication Critical patent/CN115550944A/en
Application granted granted Critical
Publication of CN115550944B publication Critical patent/CN115550944B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W16/00Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
    • H04W16/18Network planning tools
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5072Grid computing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16YINFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR THE INTERNET OF THINGS [IoT]
    • G16Y20/00Information sensed or collected by the things
    • G16Y20/10Information sensed or collected by the things relating to the environment, e.g. temperature; relating to location
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16YINFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR THE INTERNET OF THINGS [IoT]
    • G16Y20/00Information sensed or collected by the things
    • G16Y20/30Information sensed or collected by the things relating to resources, e.g. consumed power
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16YINFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR THE INTERNET OF THINGS [IoT]
    • G16Y30/00IoT infrastructure
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W16/00Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
    • H04W16/22Traffic simulation tools or models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/502Proximity

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Environmental & Geological Engineering (AREA)
  • Toxicology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles, which comprises the following steps: 1) Establishing a network and service request model, and acquiring information related to the network and service request; 2) Establishing a network and service request calculation model; 3) Constructing a state space, an action space, a strategy function and a reward function; 4) Constructing an actor network and a criticism network, and training the actor network and the criticism network; 5) Generating a service placement strategy by the actor network and inputting the strategy into the criticizing home network; 6) And (3) the criticizing home network evaluates the strategy quality of the service placement strategy, if the evaluation is not passed, updating actor network parameters, returning to the step (5), and if the evaluation is passed, outputting the service placement strategy. The present invention minimizes maximum edge resource usage and service delays while taking into account vehicle mobility, changing demands, and dynamics of different types of service requests.

Description

Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles
Technical Field
The invention relates to the field of Internet of vehicles, in particular to a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles.
Background
The internet of vehicles is an interactive network composed of information such as vehicle location, speed, and route. The rapid development of communication technology has brought many new possibilities to the current field of internet of vehicles. The internet of vehicles becomes more intelligent due to the occurrence of the fifth generation mobile communication technology, and the service coverage range is further enlarged. However, as delay-sensitive applications such as intelligent voice assistants and autopilot become the most popular applications at present in the field of internet of vehicles, the traditional cloud computing paradigm is gradually unable to meet the needs of users. The European telecom standards institute introduces mobile edge computing into the field of Internet of vehicles, expands storage resources and computing resources of cloud computing, enables the cloud computing to be closer to users, and meets requirements of the users on high reliability, low delay, safety and the like of intelligent application.
In the internet of vehicles, vehicles communicate with infrastructure to obtain services such as media download, collaboration messages, decentralized environment notification messages, etc., to achieve coordination in applications such as remote driving, parking space discovery, navigation, etc. In the edge computing paradigm, multiple services can be deployed on an edge server, making full use of computing resources and storage resources. Service placement is one of the research hotspots in the field of internet of vehicles. In particular, service placement is the mapping of services to edge servers in the Internet of vehicles to meet the requirements of the requested service while efficiently using edge resources. From the user's perspective, it is important to minimize delays in vehicle perceived services. From the service provider's point of view, it is desirable to maximize edge resource usage while maintaining as much as possible resource load balancing between servers.
Disclosure of Invention
The invention aims to provide a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles, which comprises the following steps:
1) Establishing a network and service request model, and acquiring information related to the network and service request;
the information related to the network and the service request comprises edge server information, vehicle information and service information;
the edge server information comprises an edge server set E, an edge server E and the residual resource capacity C of the edge server E e
The vehicle information includes a vehicle collection V.
The service information includes a service set S, a number lambda of vehicles requesting the service S s One service instance at a time (such as media file download in a car networking environment, collaboration awareness messages and environment notification services, etc.), or the number epsilon of vehicles that can provide parallel connections, the specified time t and vehicle location loc in the service request message, the amount of resources R consumed by the edge server deployment service s s Time delay requirement threshold D s
2) Establishing a network and service request calculation model;
the network and service request calculation model comprises a total service delay calculation model and an edge resource utilization calculation model;
the total service delay calculation model is as follows:
in the method, in the process of the invention,is the total service delay;Propagation delay and queuing delay; dist (v, s) is the euclidean distance between the vehicle v and the edge server deployed by service s; c is the propagation velocity of the signal through the communication medium;
when requesting the number lambda of vehicles of the service s s Queuing delay when epsilon is less than or equal to epsilonWhen requesting the number lambda of vehicles of the service s s At ∈ ->Satisfies the following formula:
wherein the number difference lambda 'is' s =λ s -ε;
Propagation delayThe following is shown:
wherein dist (v, s) is the Euclidean distance between the vehicle v and the edge server deployed by the service s; c is the propagation velocity of the signal through the communication medium.
The edge resource usage calculation model is as follows:
edge resource utilizationIs the ratio between the resources consumed by the service instance and the available resources of the edge server, as follows:
in the parameters ofC e The remaining resource capacity of the edge server e;The edge resource utilization rate is used; r is R s The amount of resources consumed by the service s for the edge server deployment.
3) Constructing a state space, an action space, a strategy function and a reward function;
the state space is characterized by a state space set ω, namely:
ω={[v 1 ,loc 1 ,s],[v 2 ,loc 2 ,s],...,[v n ,loc n ,s]} t (6)
wherein S is S; v 1 ,v 2 ,...,v n Aggregate for a group of vehicles; loc 1 ,loc 2 ,...,loc n For the set of vehicle locations at t for the request service s.
The action space is used for describing actions taken when placing services on the edge server;
wherein the action a taken at a given time t is as follows:
where pi is a policy function required to generate an action in the observation set of time unit t to ω;representing that the service s is deployed at the edge server e;Indicating that service s is not deployed at edge server e.
The strategy function pi is a function executed by the actor network and is used for mapping a state space to an action space, namely pi, omega and a;
the goal of the policy function pi is to minimize the maximum edge resource usage and service latency, and to control the relative importance of resource usage and service latency by using the parameter β. The policy function pi is expressed as
Wherein, beta is a weight coefficient;
the constraints of the policy function pi include mapping constraintsDelay constraint->Resource constraints
The reward function is as follows:
in the method, in the process of the invention,for instant rewards. Gamma is the prize coefficient.The service time delay at the time t;
4) Constructing an actor network and a criticism network, and training the actor network and the criticism network;
loss function in criticism network training processThe following is shown:
wherein θ is a criticizing home network parameter;is a target value for evaluating policy quality; q (Q) i (ω, a; θ) is the policy quality of the service placement policy;Is the number of available resource units in the edge server;
5) Generating a service placement strategy by the actor network and inputting the strategy into the criticizing home network;
6) And (3) the criticizing home network evaluates the strategy quality of the service placement strategy, if the evaluation is not passed, updating actor network parameters, returning to the step (5), and if the evaluation is passed, outputting the service placement strategy.
The method for evaluating the policy quality of the service placement policy by the criticizing home network comprises the following steps: judging criticism home network loss functionWhether the convergence is carried out, if so, the passing is evaluated, otherwise, the passing is evaluated.
It is worth noting that the invention proposes a three-layer internet of vehicles architecture based on edge computing, and considers the dynamic service placement problem, with the optimization objective being to minimize maximum edge resource usage (from the service provider's perspective) and service delay (from the user's perspective).
In addition, the invention provides a service placement framework based on deep reinforcement learning, which consists of a strategy function (actor network) and a cost function (criticism network). The actor network makes service placement policies, and the critics network evaluates the decision making performance by the actor network based on the delay observed by the vehicle.
The technical effect of the invention is undoubted. The invention provides a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles, which provides a dynamic service placement framework based on deep reinforcement learning in the Internet of vehicles, and aims to minimize the maximum edge resource use and service delay while considering the mobility of vehicles, the changing requirements and the dynamic nature of different types of service requests.
Drawings
FIG. 1 is a three-layer vehicle networking rack based on edge calculation;
FIG. 2 is a diagram of a smart structure;
fig. 3 is a flow chart of the present invention.
Detailed Description
The present invention is further described below with reference to examples, but it should not be construed that the scope of the above subject matter of the present invention is limited to the following examples. Various substitutions and alterations are made according to the ordinary skill and familiar means of the art without departing from the technical spirit of the invention, and all such substitutions and alterations are intended to be included in the scope of the invention.
Example 1:
referring to fig. 1 to 3, a dynamic service placement method based on edge calculation and deep reinforcement learning in the internet of vehicles includes the following steps:
1) Establishing a network and service request model, and acquiring information related to the network and service request;
the information related to the network and the service request comprises edge server information, vehicle information and service information;
the edge server information comprises an edge server set E, an edge server E and the residual resource capacity C of the edge server E e
The vehicle information includes a vehicle collection V.
The service information includes a service set S, a number lambda of vehicles requesting the service S s One service instance at a time (such as media file download in a car networking environment, collaboration awareness messages and environment notification services, etc.), or the number epsilon of vehicles that can provide parallel connections, the specified time t and vehicle location loc in the service request message, the amount of resources R consumed by the edge server deployment service s s Time delay requirement threshold D s
2) Establishing a network and service request calculation model;
the network and service request calculation model comprises a total service delay calculation model and an edge resource utilization calculation model;
the total service delay calculation model is as follows:
in the method, in the process of the invention,is the total service delay;Propagation delay and queuing delay; dist (v, s) is the euclidean distance between the vehicle v and the edge server deployed by service s; c is the propagation velocity of the signal through the communication medium;
when requesting the number lambda of vehicles of the service s s Queuing delay when epsilon is less than or equal to epsilonWhen requesting the number lambda of vehicles of the service s s At ∈ ->Satisfies the following formula:
wherein the number difference lambda 'is' s =λ s -ε;
Propagation delayThe following is shown:
wherein dist (v, s) is the Euclidean distance between the vehicle v and the edge server deployed by the service s; c is the propagation velocity of the signal through the communication medium.
The edge resource usage calculation model is as follows:
edge resource utilizationIs the ratio between the resources consumed by the service instance and the available resources of the edge server, as follows:
in the parameters ofC e The remaining resource capacity of the edge server e;The edge resource utilization rate is used; r is R s The amount of resources consumed by the service s for the edge server deployment.
3) Constructing a state space, an action space, a strategy function and a reward function;
the state space is characterized by a state space set ω, namely:
ω={[v 1 ,loc 1 ,s],[v 2 ,loc 2 ,s],...,[v n ,loc n ,s]} t (6)
wherein S is S; v 1 ,v 2 ,...,v n Aggregate for a group of vehicles; loc 1 ,loc 2 ,...,loc n For the set of vehicle locations at t for the request service s.
The action space is used for describing actions taken when placing services on the edge server;
wherein the action a taken at a given time t is as follows:
where pi is a policy function required to generate an action in the observation set of time unit t to ω;representing that the service s is deployed at the edge server e;Indicating that service s is not deployed at edge server e.
The strategy function pi is a function executed by the actor network and is used for mapping a state space to an action space, namely pi, omega and a;
the goal of the policy function pi is to minimize the maximum edge resource usage and service latency, and to control the relative importance of resource usage and service latency by using the parameter β. The policy function pi is expressed as
Wherein, beta is a weight coefficient;
the principle of the policy function pi is: and iterating the service set and the edge server set through the subscript s, searching the maximum edge resource use and service delay, and minimizing the service delay to obtain a corresponding strategy function pi.
Constraint package of the policy function piBracketing map constraintsDelay constraint->Resource constraints
The reward function is as follows:
in the method, in the process of the invention,for instant rewards. Gamma is the prize coefficient.The service time delay at the time t;
4) Constructing an actor network and a criticism network, and training the actor network and the criticism network;
loss function in criticism network training processThe following is shown:
wherein θ is a criticizing home network parameter;is a target value for evaluating policy quality; q (Q) i (ω, a; θ) is the policy quality of the service placement policy;Is the number of available resource units in the edge server;
5) Generating a service placement strategy by the actor network and inputting the strategy into the criticizing home network;
6) And (3) the criticizing home network evaluates the strategy quality of the service placement strategy, if the evaluation is not passed, updating actor network parameters, returning to the step (5), and if the evaluation is passed, outputting the service placement strategy.
The method for evaluating the policy quality of the service placement policy by the criticizing home network comprises the following steps: judging criticism home network loss functionWhether the convergence is carried out, if so, the passing is evaluated, otherwise, the passing is evaluated.
Example 2:
a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles comprises the following steps:
1) And establishing a network and service request model, and acquiring the information of the edge server, the vehicle information and the service information.
The server information, the vehicle information and the service information comprise an edge server set E, an edge server E and the residual resource capacity C of the edge server E e Vehicle set V and service set S, vehicle number λ requesting service S s One service instance (such as media file download in a car networking environment, collaboration awareness information, environment notification service, etc.) can be processed at a time, or the number epsilon of vehicles connected in parallel can be provided, the time t and the vehicle position loc are specified in the service request information, and the resource quantity R consumed by the edge server deployment service s is calculated s Time delay requirement threshold D s
2) And (5) establishing a calculation model.
2.1 Total service delay modeling. The entire edge internet of vehicles system is modeled as an M/D/1 queue. Wherein when requesting service s from edge server e, the total service delay of the vehicleRefers to the total time from the vehicle sending a service request to the edge server receiving the corresponding response. Total service delay->By propagation delay->And queuing delay->The composition is as follows:
if lambda is s Epsilon and queuing delayIs 0. If lambda is s A queue is created and the average queuing delay for service s on the edge server will be as follows:
wherein lambda' s =λ s Epsilon and the average propagation delay is calculated as the ratio of distance to propagation speed on the medium as follows:
where dist (v, s) is the euclidean distance between the vehicle v and the edge server deployed by service s, and c is the propagation speed of the signal through the communication medium. Thus, the total service delay is as follows:
2.2 Edge resource usage modeling. Edge resource utilizationIs the ratio between the resources consumed by the service instance and the available resources of the edge server, as follows:
wherein,
3) The state space is designed. At a given time t, the set of state spaces describes the network environment. The agent observes the environment to form a set of state spaces ω from the service request model as follows:
ω={[v 1 ,loc 1 ,s],[v 2 ,loc 2 ,s],...,[v n ,loc n ,s]} t 。 (6)
wherein S is S, v 1 ,v 2 ,...,v n To a group of vehicles, loc 1 ,loc 2 ,...,loc n For the set of vehicle locations at t for the request service s.
4) And (3) a designed action space. The action space describes the actions taken by the policy module when placing a service on an edge server, as follows at a given time t:
where pi is the policy function required to generate the action in the observation set of time units t versus ω, a binary variableA matrix indicating the location of the service s on the edge server e is given, +.>Representing that service s is deployed at edge server e. On the contrary, the method comprises the steps of,indicating that service s is not deployed at edge server e.
5) And designing a strategy function. The policy function pi is a function performed by the actor network to map the state space to the action space pi omega-a. The goal of the policy function pi is to minimize the maximum edge resource usage and service latency, and to control the relative importance of resource usage and service latency by using the parameter β. The policy function pi is expressed as
The policy function is also subject to mapping constraints,delay constraint (S)>The resource constraints are set by the user and,
6) The bonus function is designed. At each time unit t, the system receives an instant prize from the environment in response to actions taken by the actor network of the agentThe following is shown:
7) The criticizing home network is constructed in charge of evaluating the quality Q (ω, a) of decisions made by the actor network. Inputting the state and movementTraining criticizing network for making and rewarding, and the criticizing network updates its parameter θ to minimize the loss functionThe following is shown:
wherein y is t Is the target value. The replay memory M is further used for storing the experience of the training criticizing home network. The criticizing home network obtains experience after a random time exists in replay and optimizes network parameters to obtain better performance.
8) After the training of the actor network and the criticizing home network is converged through the steps, the actor network can find the optimal placement strategy of the service while considering the mobility and the dynamics of the vehicle in different types of service requests. The criticizing home network may evaluate the policy quality of the actor network by a value function.
Example 3:
a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles comprises the following steps:
1) And establishing the network and service request model, and acquiring information related to the network and service request.
2) Establishing a network and service request calculation model;
3) Constructing a state space, an action space, a strategy function and a reward function;
4) Constructing an actor network and a criticism network, and training the actor network and the criticism network;
5) Generating a service placement strategy by the actor network and inputting the strategy into the criticizing home network;
6) And (3) the criticizing home network evaluates the strategy quality of the service placement strategy, if the evaluation is not passed, updating actor network parameters, returning to the step (5), and if the evaluation is passed, outputting the service placement strategy.
Example 4:
the main content of the dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles is shown in the embodiment 3, wherein the information related to the network and the service request comprises edge server information, vehicle information and service information;
the edge server information comprises an edge server set E, an edge server E and the residual resource capacity C of the edge server E e
The vehicle information includes a vehicle collection V.
The service information includes a service set S, a number lambda of vehicles requesting the service S s The number of vehicles epsilon that can handle one service instance at a time or can provide parallel connections, the specified time t and vehicle location loc in the service request message, the amount of resources R consumed by the edge server deployment service s s Time delay requirement threshold D s The method comprises the steps of carrying out a first treatment on the surface of the The service examples include media file downloads in a car networking environment, collaboration awareness messages, and environment notification services.
Example 5:
the main content of the dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles is shown in the embodiment 3, wherein the network and service request calculation model comprises a total service time delay calculation model and an edge resource utilization rate calculation model;
the total service delay calculation model is as follows:
in the method, in the process of the invention,is the total service delay;Propagation delay and queuing delay; dist (v, s) is the euclidean distance between the vehicle v and the edge server deployed by service s; c is the propagation velocity of the signal through the communication medium;
when requesting the number lambda of vehicles of the service s s Queuing delay when epsilon is less than or equal to epsilonWhen requesting the number lambda of vehicles of the service s s At ∈ ->Satisfies the following formula:
wherein the number difference lambda 'is' s =λ s -ε;
Propagation delayThe following is shown:
wherein dist (v, s) is the Euclidean distance between the vehicle v and the edge server deployed by the service s; c is the propagation velocity of the signal through the communication medium.
The edge resource usage calculation model is as follows:
edge resource utilizationIs the ratio between the resources consumed by the service instance and the available resources of the edge server, as follows:
in the parameters ofC e Is an edge servere remaining resource capacity;The edge resource utilization rate is used; r is R s The amount of resources consumed by the service s for the edge server deployment.
Example 6:
the main content of the dynamic service placement method based on edge calculation and deep reinforcement learning in the internet of vehicles is shown in embodiment 3, wherein the state space is characterized by a state space set ω, namely:
ω={[v 1 ,loc 1 ,s],[v 2 ,loc 2 ,s],...,[v n ,loc n ,s]} t (6)
wherein S is S; v 1 ,v 2 ,...,v n Aggregate for a group of vehicles; loc 1 ,loc 2 ,...,loc n For the set of vehicle locations at t for the request service s.
Example 7:
the main content of the dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles is shown in the embodiment 3, wherein the action space is used for describing actions taken when placing services on an edge server;
wherein the action a taken at a given time t is as follows:
where pi is a policy function required to generate an action in the observation set of time unit t to ω;representing that the service s is deployed at the edge server e;Indicating that service s is not deployed at edge server e.
Example 8:
the main content of the dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles is shown in the embodiment 3, wherein the strategy function pi is a function executed by an actor network and is used for mapping a state space to an action space, namely pi:omega-a;
the goal of the policy function pi is to minimize the maximum edge resource usage and service delay, and to control the relative importance of resource usage and service delay by using parameter β;
the policy function pi is expressed as follows:
wherein, beta is a weight coefficient.
The constraints of the policy function pi include mapping constraintsDelay constraint->Resource constraints
Example 9:
the main content of the dynamic service placement method based on edge calculation and deep reinforcement learning in the internet of vehicles is shown in the embodiment 3, wherein the reward function is as follows:
in the method, in the process of the invention,for instant rewards. Gamma is the prize coefficient.The service time delay at the time t;
example 10:
a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles is disclosed in embodiment 3, wherein the main content is the loss function in the criticizing home network training processThe following is shown:
wherein θ is a criticizing home network parameter;is a target value for evaluating policy quality; q (Q) i (ω, a; θ) is the policy quality of the service placement policy;Is the number of available resource units in the edge server;
example 11:
the main content of the dynamic service placement method based on edge calculation and deep reinforcement learning in the internet of vehicles is as shown in embodiment 3, wherein the method for evaluating the policy quality of the service placement policy by the criticizing home network comprises the following steps: judging criticism home network loss functionWhether the convergence is carried out, if so, the passing is evaluated, otherwise, the passing is evaluated. />

Claims (5)

1. A dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles is characterized by comprising the following steps:
1) Establishing a network and service request model, and acquiring information related to the network and service request;
2) Establishing a network and service request calculation model;
3) Constructing a state space, an action space, a strategy function and a reward function;
4) Constructing an actor network and a criticism network, and training the actor network and the criticism network;
5) Generating a service placement strategy by the actor network and inputting the strategy into the criticizing home network;
6) The criticizing home network evaluates the strategy quality of the service placement strategy, if the evaluation is not passed, the actor network parameters are updated, and the step 5) is returned, if the evaluation is passed, the service placement strategy is output;
the network and service request calculation model comprises a total service delay calculation model and an edge resource utilization calculation model;
the total service delay calculation model is as follows:
in the method, in the process of the invention,is the total service delay;Propagation delay and queuing delay; dist (v, s) is the euclidean distance between the vehicle v and the edge server deployed by service s; c is the propagation velocity of the signal through the communication medium;
when requesting the number lambda of vehicles of the service s s Queuing delay when epsilon is less than or equal to epsilonWhen requesting the number lambda of vehicles of the service s s At ∈ ->Satisfies the following formula:
wherein the number difference lambda 'is' s =λ s -ε;
Propagation delayThe following is shown:
wherein dist (v, s) is the Euclidean distance between the vehicle v and the edge server deployed by the service s; c is the propagation velocity of the signal through the communication medium;
the edge resource usage calculation model is as follows:
edge resource utilizationIs the ratio between the resources consumed by the service instance and the available resources of the edge server, as follows:
in the parameters ofC e The remaining resource capacity of the edge server e;The edge resource utilization rate is used; r is R s The amount of resources consumed by the deployment of the service s for the edge server;
the state space is characterized by a state space set ω, namely:
ω={[v 1 ,loc 1 ,s],[v 2 ,loc 2 ,s],…,[v n ,loc n ,s]} t (6)
wherein S is S; v 1 ,v 2 ,...,v n Aggregate for a group of vehicles; loc 1 ,loc 2 ,...,loc n A set of vehicle locations for the request service s at t;
the action space is used for describing actions taken when placing services on the edge server;
wherein the action a taken at a given time t is as follows:
where pi is a policy function required to generate an action in the observation set of time unit t to ω;representing that the service s is deployed at the edge server e;Indicating that service s is not deployed at edge server e;
the strategy function pi is a function executed by the actor network and is used for mapping a state space to an action space, namely pi, omega and a;
the goal of the policy function pi is to minimize the maximum edge resource usage and service delay, and to control the relative importance of resource usage and service delay by using parameter β;
the policy function pi is expressed as follows:
wherein, beta is a weight coefficient;
the constraints of the policy function pi include mapping constraintsDelay constraint->Resource constraints
2. The dynamic service placement method based on edge calculation and deep reinforcement learning in the internet of vehicles according to claim 1, wherein the network and service request related information includes edge server information, vehicle information, service information;
the edge server information comprises an edge server set E, an edge server E and the residual resource capacity C of the edge server E e
The vehicle information includes a vehicle set V;
the service information includes a service set S, a number lambda of vehicles requesting the service S s The number of vehicles epsilon that can handle one service instance at a time or can provide parallel connections, the specified time t and vehicle location loc in the service request message, the amount of resources R consumed by the edge server deployment service s s Time delay requirement threshold D s The method comprises the steps of carrying out a first treatment on the surface of the The service examples include media file downloads in a car networking environment, collaboration awareness messages, and environment notification services.
3. The method for dynamic service placement based on edge computation and deep reinforcement learning in the internet of vehicles according to claim 1, wherein the reward function is as follows:
in the method, in the process of the invention,is an instant rewards; gamma is the reward coefficient;And the service delay at the time t.
4. The dynamic service placement method based on edge calculation and deep reinforcement learning in the internet of vehicles according to claim 1, wherein the criticizing home network training process has a loss functionThe following is shown:
wherein θ is a criticizing home network parameter;is a target value for evaluating policy quality; q (Q) i (ω, a; θ) is the policy quality of the service placement policy;Is the number of available resource units in the edge server.
5. The method for dynamic service placement based on edge computation and deep reinforcement learning in internet of vehicles according to claim 4, wherein the method for criticizing the network to evaluate the policy quality of the service placement policy comprises: judging criticism home network loss functionWhether the convergence is carried out, if so, the passing is evaluated, otherwise, the passing is evaluated.
CN202210992657.5A 2022-08-18 2022-08-18 Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles Active CN115550944B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210992657.5A CN115550944B (en) 2022-08-18 2022-08-18 Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210992657.5A CN115550944B (en) 2022-08-18 2022-08-18 Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles

Publications (2)

Publication Number Publication Date
CN115550944A CN115550944A (en) 2022-12-30
CN115550944B true CN115550944B (en) 2024-02-27

Family

ID=84725291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210992657.5A Active CN115550944B (en) 2022-08-18 2022-08-18 Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles

Country Status (1)

Country Link
CN (1) CN115550944B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118502967A (en) * 2024-07-17 2024-08-16 北京师范大学珠海校区 Delay-aware container scheduling method, system and terminal for cluster upgrading

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110213796A (en) * 2019-05-28 2019-09-06 大连理工大学 A kind of intelligent resource allocation methods in car networking
CN113382383A (en) * 2021-06-11 2021-09-10 浙江工业大学 Method for unloading calculation tasks of public transport vehicle based on strategy gradient
WO2021237996A1 (en) * 2020-05-26 2021-12-02 多伦科技股份有限公司 Fuzzy c-means-based adaptive energy consumption optimization vehicle clustering method
CN114528042A (en) * 2022-01-30 2022-05-24 南京信息工程大学 Energy-saving automatic interconnected vehicle service unloading method based on deep reinforcement learning
CN114625504A (en) * 2022-03-09 2022-06-14 天津理工大学 Internet of vehicles edge computing service migration method based on deep reinforcement learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12045738B2 (en) * 2020-12-23 2024-07-23 Intel Corporation Transportation operator collaboration system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110213796A (en) * 2019-05-28 2019-09-06 大连理工大学 A kind of intelligent resource allocation methods in car networking
WO2021237996A1 (en) * 2020-05-26 2021-12-02 多伦科技股份有限公司 Fuzzy c-means-based adaptive energy consumption optimization vehicle clustering method
CN113382383A (en) * 2021-06-11 2021-09-10 浙江工业大学 Method for unloading calculation tasks of public transport vehicle based on strategy gradient
CN114528042A (en) * 2022-01-30 2022-05-24 南京信息工程大学 Energy-saving automatic interconnected vehicle service unloading method based on deep reinforcement learning
CN114625504A (en) * 2022-03-09 2022-06-14 天津理工大学 Internet of vehicles edge computing service migration method based on deep reinforcement learning

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Offloading Strategy for Vehicles in the Architecture of Vehicle-MEC-Cloud;Dasong Zhuang;《2022 IEEE/CIC International Conference on Communications in China (ICCC Workshops)》;20220811;全文 *
Task Offloading for End-Edge-Cloud Orchestrated Computing in Mobile Networks;Xiuhua Li;《2020 IEEE Wireless Communications and Networking Conference (WCNC)》;20200525;全文 *
一种车载服务的快速深度Q学习网络边云迁移策略;彭军;王成龙;蒋富;顾欣;牟玥玥,;刘伟荣;;电子与信息学报;20200115(01);全文 *
车联网中一种基于软件定义网络与移动边缘计算的卸载策略;张海波;荆昆仑;刘开健;贺晓帆;;电子与信息学报;20200315(03);全文 *

Also Published As

Publication number Publication date
CN115550944A (en) 2022-12-30

Similar Documents

Publication Publication Date Title
CN109756378B (en) Intelligent computing unloading method under vehicle-mounted network
Zhang et al. Mobile-edge computing for vehicular networks: A promising network paradigm with predictive off-loading
CN110312231A (en) Content caching decision and resource allocation joint optimization method based on mobile edge calculations in a kind of car networking
CN111262940B (en) Vehicle-mounted edge computing application caching method, device and system
CN113543074B (en) Joint computing migration and resource allocation method based on vehicle-road cloud cooperation
CN112995950B (en) Resource joint allocation method based on deep reinforcement learning in Internet of vehicles
CN114143346B (en) Joint optimization method and system for task unloading and service caching of Internet of vehicles
CN103763334B (en) Multimedia cooperative sharing method based on P2P-BT in VANET
CN111400001A (en) Online computing task unloading scheduling method facing edge computing environment
CN112395090B (en) Intelligent hybrid optimization method for service placement in mobile edge calculation
CN115550944B (en) Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles
CN114979145B (en) Content distribution method integrating sensing, communication and caching in Internet of vehicles
CN113507503B (en) Internet of vehicles resource allocation method with load balancing function
Xu et al. Socially driven joint optimization of communication, caching, and computing resources in vehicular networks
CN115941790A (en) Edge collaborative content caching method, device, equipment and storage medium
CN108332767A (en) A kind of electricity sharing method and relevant device
CN115052262A (en) Potential game-based vehicle networking computing unloading and power optimization method
CN113873534A (en) Block chain assisted federal learning active content caching method in fog calculation
Wei et al. OCVC: An overlapping-enabled cooperative vehicular fog computing protocol
CN113411826A (en) Edge network equipment caching method based on attention mechanism reinforcement learning
CN116566838A (en) Internet of vehicles task unloading and content caching method with cooperative blockchain and edge calculation
CN116489668A (en) Edge computing task unloading method based on high-altitude communication platform assistance
CN113573365B (en) Internet of vehicles edge caching method based on Markov transition probability
CN114928826A (en) Two-stage optimization method, controller and decision method for software-defined vehicle-mounted task unloading and resource allocation
Lee et al. A study of mobile edge computing system architecture for connected car media services on highway

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant