CN114328291A - Industrial Internet edge service cache decision method and system

Industrial Internet edge service cache decision method and system

Info

Publication number
CN114328291A
CN114328291A
Authority
CN
China
Prior art keywords
service
decision
cache
edge
reinforcement learning
Prior art date
Legal status
Pending
Application number
CN202111556974.4A
Other languages
Chinese (zh)
Inventor
Ye Kejiang
Tang Lujie
Xu Chengzhong
Current Assignee
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN202111556974.4A
Publication of CN114328291A

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The invention relates to the technical field of the industrial Internet, and in particular to a method and a system for industrial Internet edge service cache decisions. The industrial Internet edge service cache decision method in the embodiments of the invention computes an optimal solution for the edge caching strategy mathematical model through an algorithm constructed with a distributed deep reinforcement learning method, and can solve the optimization problem of the system's mathematical model. Starting from the establishment of a network mathematical model and the determination of an optimization target, and combining reinforcement learning with deep learning, a machine learns and predicts user preferences and the changing trend of content popularity in the network from a large volume of historical user data, and the service caching strategy is adjusted according to the learning results. An optimal service caching decision can thus be given effectively. The corresponding system has the same technical effect.

Description

Industrial Internet edge service cache decision method and system
Technical Field
The invention relates to the technical field of the industrial Internet, and in particular to a method and a system for industrial Internet edge service cache decisions.
Background
With the increasing number of industrial devices accessing the Internet, relying solely on the traditional cloud computing model makes it difficult to meet the latency and cost requirements of industrial applications. Edge computing, as a new computing paradigm, can relieve the physical resource bottleneck of intelligent devices. In an edge computing system, traffic load and quality of service can be improved through service caching. However, flexibly configuring edge service caches within the limited edge storage capacity to improve system performance is extremely challenging.
The prior art is mainly directed at the caching problem in mobile edge computing. Most work focuses on improving caching strategies from traditional networks based on the new characteristics of mobile edge computing networks. A portion of the work explores new caching schemes, such as caching strategies based on user preferences, on learning, or on multi-edge-node collaboration. However, content popularity, user preferences and the like change continuously over time and are hard to predict. Moreover, the service caching problem is an integer linear programming problem that cannot be solved in polynomial time, so traditional optimization methods can hardly produce an effective service caching result. The prior art therefore has shortcomings.
Disclosure of Invention
In order to solve at least one technical problem, embodiments of the present invention provide a method and a system for deciding an edge service cache of an industrial internet, which solve an optimal edge service cache policy through a distributed deep reinforcement learning algorithm, so as to achieve the purpose of minimizing service access delay and energy consumption.
According to an embodiment of the present invention, a method for deciding an industrial internet edge service cache is provided, which includes the following steps:
s1, performing mathematical modeling on an industrial Internet system based on the fact that a task corresponding to a service can be executed only when corresponding service data are cached in a server; the cloud server of the system model caches data required by all services;
s2, establishing a mathematical model for service access time delay in the opposite side cloud coordination system;
s3, performing mathematical modeling on the energy consumption of the industrial Internet system according to the power of data transmission between the edge servers and the cloud server and the computing power of the edge servers and the cloud server;
s4, establishing an optimization target for achieving the minimized service access time delay and the minimized energy consumption based on the system model, the time delay model and the energy consumption model;
and S5, constructing an algorithm capable of realizing the optimization target based on a distributed deep reinforcement learning method.
The invention also provides an industrial Internet edge service cache decision system adopting the method, which comprises: a mathematical modeling module and a service cache decision module;
the mathematical modeling module performs mathematical modeling on the industrial Internet system based on the premise that a task corresponding to a service can be executed only when the corresponding service data is cached in a server; the cloud server of the system model caches the data required by all services;
establishing a mathematical model for service access delay in the edge cloud cooperative system;
performing mathematical modeling on the energy consumption of the industrial Internet system according to the power of data transmission between the edge servers and the cloud servers and the computing power of the edge servers and the cloud servers;
establishing an optimization target for achieving the minimized service access delay and the minimized energy consumption based on the system model, the delay model and the energy consumption model;
and the service cache decision module constructs an algorithm capable of realizing the optimization goal based on a distributed deep reinforcement learning method.
According to the industrial Internet edge service cache decision method and system, an optimal solution can be computed for the edge caching strategy mathematical model through the algorithm constructed with the distributed deep reinforcement learning method, and the system's optimization problem can be solved. By determining the mathematical model of the industrial Internet system and the optimization target, and by combining reinforcement learning with deep learning, a machine learns and predicts user preferences and the changing trend of content popularity in the network from a large volume of historical user data, and the service caching strategy is adjusted according to the learning results. An optimal service caching decision can thus be given effectively.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart of an industrial Internet edge service cache decision method according to the present invention;
FIG. 2 is a schematic diagram of an industrial Internet edge service cache decision method according to the present invention;
fig. 3 is a schematic diagram of the edge-cloud collaborative service structure of the industrial Internet edge service caching decision system according to the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
As shown in FIG. 3, to facilitate modeling the industrial Internet system (building a mathematical model), the invention discretizes time into evenly distributed time slices $\mathcal{T} = \{1, 2, \dots, T\}$, where each time slice has length $\Delta t$. Consider an industrial Internet system consisting of $N$ edge servers, denoted $\mathcal{N} = \{1, 2, \dots, N\}$. Each edge server may provide data analysis and processing services for sensor devices and industrial devices. The computing and storage resources of an edge server are limited compared with the cloud server; the computing and storage capacities of edge server $n$ are denoted $F_n$ and $M_n$, respectively. Let $F_{cloud}$ represent the computing power of the cloud server.
Referring to fig. 1 to 3, according to an embodiment of the present invention, a method for deciding an industrial internet edge service cache is provided, which includes the following steps:
s1, performing mathematical modeling on an industrial Internet system based on the fact that a task corresponding to a service can be executed only when corresponding service data are cached in a server; the cloud server of the system model caches data required by all services.
When mathematical modeling is carried out, before a certain type of task can be executed on an edge server, the corresponding service must first be placed there. A service is an abstraction of an application; to run a particular service, an edge server must cache the relevant data, including the software and databases needed by the application. A task corresponding to a service can be executed only if the corresponding service data is cached on the server. In the modeling process of the invention, the cloud server is assumed to cache the data required by all services.
In the industrial Internet system, the set of generated services is represented as $\mathcal{L} = \{1, 2, \dots, L\}$. The invention assumes that different services have different data volumes and require different computing and storage resources to process, denoted $f_l$ and $m_l$ respectively, where $l \in \mathcal{L}$. Each edge server may cache one or more services. The invention defines a binary variable $x_{l,n}(t) \in \{0, 1\}$ indicating whether service $l$ is cached on edge server $n$; the service caching policy is $\Upsilon_t = \{x_{l,n}(t) \mid l \in \mathcal{L}, n \in \mathcal{N}\}$. If service $l$ is cached on edge server $n$ at time $t$, then $x_{l,n}(t) = 1$; otherwise $x_{l,n}(t) = 0$. Let $p_{l,n}$ represent the computing power allocated to service $l$ by edge server $n$. Since the service cache is limited by the edge server's storage space and computing capacity:

$$\sum_{l \in \mathcal{L}} x_{l,n}(t)\, m_l \le M_n, \quad \forall n \in \mathcal{N}$$

$$\sum_{l \in \mathcal{L}} x_{l,n}(t)\, p_{l,n} \le F_n, \quad \forall n \in \mathcal{N}$$
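As a hedged illustration of these two constraints, the check below verifies that a candidate caching decision respects each edge server's storage and compute limits; the function name and array layout are assumptions made for this sketch:

```python
import numpy as np

def is_feasible(x, m, p, M, F):
    """Check the storage and compute constraints of a caching decision.

    x: (L, N) binary matrix with x[l, n] = x_{l,n}(t)
    m: (L,)   storage requirement m_l of each service
    p: (L, N) computing power p_{l,n} allocated to service l by server n
    M: (N,)   storage capacity M_n of each edge server
    F: (N,)   computing capacity F_n of each edge server
    """
    storage_ok = np.all(x.T @ m <= M)               # sum_l x_{l,n} m_l <= M_n
    compute_ok = np.all((x * p).sum(axis=0) <= F)   # sum_l x_{l,n} p_{l,n} <= F_n
    return bool(storage_ok and compute_ok)
```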
In order to analyze the access delay of a service, assume that at time $t$ the number of requests from industrial devices for service $l$ arriving at edge server $n$ is $\lambda_{l,n}(t)$. Owing to the diversity of the data collected by devices and the dynamic nature of the requested services, $\lambda_{l,n}(t)$ varies dynamically. It should be noted that when a requested service is not cached on the edge server, the workload must be scheduled, i.e., the arriving service request is dispatched to an edge server that caches the service, or to the cloud server, for execution. Let $\mu_{l,n}(t) \in [0, 1]$ represent the proportion of the load of service $l$ executed on edge server $n$, and let $\mu_{l,c}(t)$ represent the proportion of service $l$ executed on the cloud server. The values of $\mu_{l,n}$ and $\mu_{l,c}$ can be set using many different strategies, but the following condition is generally required:
$$\sum_{n \in \mathcal{N}} \mu_{l,n}(t) + \mu_{l,c}(t) = 1, \quad \forall l \in \mathcal{L}$$

At time $t$, the total number of requests for service $l$ in the industrial Internet system is represented as:

$$\lambda_l(t) = \sum_{n \in \mathcal{N}} \lambda_{l,n}(t)$$

Thus, the total workload and data size for edge server $n$ to run service $l$ are:

$$F_{l,n}(t) = \mu_{l,n}(t)\, f_l\, \lambda_l(t), \qquad M_{l,n}(t) = \mu_{l,n}(t)\, m_l\, \lambda_l(t);$$
The final mathematical model embodies a heterogeneous edge-cloud collaborative offloading framework, as shown in fig. 3, which includes a large number of industrial and sensor devices, a plurality of edge servers, and a cloud server; the industrial and sensor devices communicate with the edge servers through wireless channels, and the edge servers connect to the remote cloud through wired links. Device tasks may be offloaded to an edge server or the cloud server for execution. When an edge server has not cached the requested service or cannot provide enough computing power, the corresponding task can be offloaded to a nearby edge server that has cached the service, or to the cloud server, for execution. Cooperation between edge nodes makes full use of the resource capacity of heterogeneous edge servers and alleviates the mismatch between the resource capacity of a single edge node and its workload.
And S2, establishing a mathematical model for service access delay in the edge-cloud collaborative system.
In particular implementation, the computation latency for executing service l on edge server n is represented as:
$$D^{comp}_{l,n}(t) = \frac{F_{l,n}(t)}{p_{l,n}}$$

The computation delay for executing service $l$ on the cloud server is represented as:

$$D^{comp}_{l,c}(t) = \frac{\mu_{l,c}(t)\, f_l\, \lambda_l(t)}{F_{cloud}}$$

Because the industrial devices are close to the edge servers, the transmission delay between a device and its edge server is ignored, and only the workload transmission delay between adjacent edge servers is considered. Let the data transmission rate between edge servers be $r_e$; the data transmission delay between edge servers is then expressed as:

$$D^{trans}_{l,n}(t) = \frac{M_{l,n}(t)}{r_e}$$

If the task is offloaded to the remote cloud server for execution, let the data transmission rate of the core network be $r_c$; the transmission delay of a task offloaded to the cloud for execution is represented as:

$$D^{trans}_{l,c}(t) = \frac{\mu_{l,c}(t)\, m_l\, \lambda_l(t)}{r_c}$$

The access delay for running service $l$ includes the transmission delay and the computation delay of the service:

$$D_l(t) = \sum_{n \in \mathcal{N}} \left( D^{trans}_{l,n}(t) + D^{comp}_{l,n}(t) \right) + D^{trans}_{l,c}(t) + D^{comp}_{l,c}(t)$$
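Assuming the delay decomposition reconstructed above, the access delay of one service can be sketched in Python as follows; all parameter names are chosen for this example:

```python
def access_delay(mu, mu_c, f_l, m_l, lam_l, p_ln, r_e, r_c, f_cloud):
    """Access delay D_l(t) of service l under the model above.

    mu:    list of mu_{l,n}(t), load share of service l per edge server
    mu_c:  mu_{l,c}(t), load share executed on the cloud
    f_l, m_l: compute and data requirements of service l
    lam_l: total number of requests lambda_l(t)
    p_ln:  list of computing power p_{l,n} allocated on each edge server
    r_e, r_c: edge-to-edge and core-network transmission rates
    f_cloud: computing power of the cloud server
    """
    d = 0.0
    for mu_n, p in zip(mu, p_ln):
        work = mu_n * f_l * lam_l    # F_{l,n}(t)
        data = mu_n * m_l * lam_l    # M_{l,n}(t)
        if work > 0:
            d += data / r_e + work / p   # D^trans_{l,n} + D^comp_{l,n}
    # core-network transmission plus cloud computation
    d += (mu_c * m_l * lam_l) / r_c + (mu_c * f_l * lam_l) / f_cloud
    return d
```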
and S3, performing mathematical modeling on the energy consumption of the industrial Internet system according to the power of data transmission between the edge servers and the cloud server and the computing power of the edge servers and the cloud server.
In particular, if
Let $P_e^{trans}$ denote the power of data transmission between edge servers and $P_e^{comp}$ the computation power of an edge server. The energy consumed to run service $l$ on edge server $n$ at time $t$ is:

$$E_{l,n}(t) = P_e^{trans}\, D^{trans}_{l,n}(t) + P_e^{comp}\, D^{comp}_{l,n}(t)$$

Let $P_c^{trans}$ denote the power of core-network data transmission between the edge servers and the cloud server, and $P_c^{comp}$ the computation power of the cloud server. The energy consumed to run service $l$ on the cloud server at time $t$ is:

$$E_{l,c}(t) = P_c^{trans}\, D^{trans}_{l,c}(t) + P_c^{comp}\, D^{comp}_{l,c}(t)$$

Therefore, at time $t$, the total energy consumption of running service $l$ is represented as:

$$E_l(t) = \sum_{n \in \mathcal{N}} E_{l,n}(t) + E_{l,c}(t)$$
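Mirroring the delay sketch, the energy model can be written as one function; the power parameters follow the reconstructed formulas above, and all names are assumptions:

```python
def total_energy(mu, mu_c, f_l, m_l, lam_l, p_ln, r_e, r_c, f_cloud,
                 p_trans_e, p_comp_e, p_trans_c, p_comp_c):
    """Total energy E_l(t) of running service l.

    p_trans_e, p_comp_e: edge transmission and computation power
    p_trans_c, p_comp_c: core-network transmission and cloud computation power
    (other arguments as in the access_delay sketch above)
    """
    e = 0.0
    for mu_n, p in zip(mu, p_ln):
        d_trans = (mu_n * m_l * lam_l) / r_e          # D^trans_{l,n}(t)
        d_comp = (mu_n * f_l * lam_l) / p if p > 0 else 0.0
        e += p_trans_e * d_trans + p_comp_e * d_comp  # E_{l,n}(t)
    d_trans_c = (mu_c * m_l * lam_l) / r_c            # D^trans_{l,c}(t)
    d_comp_c = (mu_c * f_l * lam_l) / f_cloud         # D^comp_{l,c}(t)
    e += p_trans_c * d_trans_c + p_comp_c * d_comp_c  # E_{l,c}(t)
    return e
```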
s4, establishing an optimization target for achieving the minimized service access time delay and the minimized energy consumption based on the system model, the time delay model and the energy consumption model;
In a specific implementation, the optimization target of the invention is to solve for the optimal service caching strategy so as to minimize service access delay and energy consumption, where $\beta$ denotes the weight given to energy consumption. The optimization objective is as follows:
$$\min_{\Upsilon_t,\, \mu} \; \sum_{l \in \mathcal{L}} \big( D_l(t) + \beta\, E_l(t) \big)$$

$$\text{s.t.}\quad x_{l,n}(t) \in \{0, 1\}, \quad \forall l \in \mathcal{L},\ n \in \mathcal{N}$$

$$\sum_{l \in \mathcal{L}} x_{l,n}(t)\, m_l \le M_n, \quad \forall n \in \mathcal{N}$$

$$\sum_{l \in \mathcal{L}} x_{l,n}(t)\, p_{l,n} \le F_n, \quad \forall n \in \mathcal{N}$$

$$\mu_{l,n}(t) \in [0, 1], \quad \forall l \in \mathcal{L},\ n \in \mathcal{N}$$

$$\sum_{n \in \mathcal{N}} \mu_{l,n}(t) + \mu_{l,c}(t) = 1, \quad \forall l \in \mathcal{L}$$
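To see why this problem resists traditional optimization, note that exhaustive search over the binary caching variables alone grows as 2^(L*N). The hedged brute-force baseline below, for a toy instance, makes the combinatorial blow-up concrete (cost_of is an assumed callable that evaluates the objective of a caching matrix):

```python
from itertools import product

def best_cache_by_enumeration(L, N, cost_of):
    """Enumerate all 2^(L*N) binary caching decisions of a toy instance.

    cost_of: assumed callable mapping an L x N caching matrix to
    sum_l D_l(t) + beta * E_l(t), returning float('inf') if infeasible.
    Tractable only for tiny L and N, which is why the invention turns
    to distributed deep reinforcement learning instead.
    """
    best_x, best_cost = None, float("inf")
    for bits in product((0, 1), repeat=L * N):
        x = [list(bits[l * N:(l + 1) * N]) for l in range(L)]
        c = cost_of(x)
        if c < best_cost:
            best_x, best_cost = x, c
    return best_x, best_cost
```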
and S5, constructing an algorithm capable of realizing an optimization target based on the distributed deep reinforcement learning method.
Further, step S5 includes the following steps:
and S51, combining the multiple parallel deep neural networks DNN with a reinforcement learning algorithm Q-learning to construct a parallel deep reinforcement learning algorithm for service cache decision.
Specifically, in order to minimize service access delay and energy consumption, a parallel deep reinforcement learning algorithm is designed that combines multiple parallel deep neural networks DNN with the reinforcement learning algorithm Q-learning to make service caching decisions.
Deep reinforcement learning generally formulates a problem as a Markov decision process, focusing on how an intelligent agent interacts with its environment and takes different actions so as to maximize the accumulated reward. Its main components are the agent, the environment, states, actions and rewards. Accordingly, the invention describes the service cache optimization problem as a Markov decision process composed of three parts, a state space S, an action space A and a reward function R, defined as follows:
State space: $s_{n,t} \in S$ represents the state of edge server $n$ at time slot $t$:
$$s_{n,t} = \big( M_n,\ F_n,\ \{\lambda_{l,n}(t)\}_{l \in \mathcal{L}},\ \Upsilon_{t-1} \big)$$

whose components respectively represent the storage and computing capacities of edge server $n$, the arriving service requests, and the edge service caching policy.
Action space: in each time slot $t$, the edge server needs to make a service caching decision based on the current state: $a_t = \Upsilon_t$.
The reward function: the optimization goal of edge service caching is to minimize service access delay and energy consumption. Thus, the designed reward function is:

$$R_t = -\sum_{l \in \mathcal{L}} \big( D_l(t) + \beta\, E_l(t) \big)$$
The action value function is defined as $Q(s_t, a_t)$, and the Q value is updated as:
$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \Big[ R_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \Big]$$

where $\alpha \in (0, 1)$ represents the learning rate and $\gamma \in [0, 1]$ is the reward decay factor.
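This update rule is standard Q-learning; in code, a tabular sketch (with small enumerated state and action spaces, which the invention replaces by DNN function approximation) reads:

```python
import numpy as np

n_states, n_actions = 16, 4          # toy sizes for illustration
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9              # learning rate, reward decay factor

def q_update(s, a, reward, s_next):
    # Q(s,a) <- Q(s,a) + alpha * [R + gamma * max_a' Q(s',a') - Q(s,a)]
    Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])
```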
m parallel neural network units are provided to act in conjunction with Q-learning. The neural networks execute their actions in parallel, and each unit comprises two DNNs with the same structure but different parameters: one is a main neural network used to predict the Q estimate, holding the latest network parameters $\theta$; the other is a target neural network used to predict the Q target value, whose parameters $\theta^*$ are a copy of the main network's parameters from some time earlier and remain unchanged for a period. After the main neural network has learned a certain number of times, the parameters of the target neural network are updated. Each neural network unit gives its selected action according to the Q value computed with a greedy algorithm. The loss function is defined as:

$$Loss(\theta) = \Big( R_t + \gamma \max_{a'} Q(s_{t+1}, a'; \theta^*) - Q(s_t, a_t; \theta) \Big)^2$$
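The main/target-network arrangement and the loss above follow the usual DQN recipe; a minimal PyTorch sketch, with network sizes and names assumed for this example, is:

```python
import copy
import torch
import torch.nn as nn

state_dim, n_actions = 8, 4                       # toy sizes
main_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                         nn.Linear(64, n_actions))
target_net = copy.deepcopy(main_net)              # theta* starts as a copy of theta
for param in target_net.parameters():
    param.requires_grad_(False)                   # target network is held fixed

def dqn_loss(s, a, r, s_next, gamma=0.9):
    # Q(s_t, a_t; theta) from the main network
    q_sa = main_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    # R_t + gamma * max_a' Q(s_{t+1}, a'; theta*) from the target network
    with torch.no_grad():
        target = r + gamma * target_net(s_next).max(dim=1).values
    return ((target - q_sa) ** 2).mean()

# After every C learning steps, copy the main parameters into the target:
# target_net.load_state_dict(main_net.state_dict())
```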
s52, selecting an action to execute according to the current state by a greedy strategy in a training stage to obtain a reward and a next state, and converting and storing the obtained states into an experience pool; when the experience pool D stores a large enough capacity, a certain number of state transitions are extracted from the experience pool to train the network parameters.
In the training phase, according to the current state stSelecting an action a with an epsilon-greedy strategytExecuting the action receives a reward RtAnd the next state st+1Converting the obtained state [ s ]t,at,Rt,st+1]And storing the state transition into an experience pool D, and randomly extracting a certain number of state transitions from the experience pool to train the network parameter theta when the storage capacity of the experience pool D is large enough.
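The experience pool D can be realized as a bounded FIFO buffer with uniform random sampling; a minimal sketch:

```python
import random
from collections import deque

class ExperiencePool:
    """Experience pool D storing transitions [s_t, a_t, R_t, s_{t+1}]."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)   # oldest transitions drop out first

    def store(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, n):
        # uniform random minibatch of n state transitions
        return random.sample(self.buffer, n)

    def __len__(self):
        return len(self.buffer)
```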
The algorithm flow of the training phase of the distributed deep reinforcement learning based edge service caching algorithm is as follows.

Algorithm input: environment information $\psi$, reward decay factor $\gamma$, learning rate $\alpha$, exploration-exploitation balance parameter $\epsilon$, experience pool D, number of iteration rounds M, and target-update step count C.

Algorithm output: m DNN network parameters. The specific training steps are:

Step 1: initialize the m parallel DNNs and the experience pool D.

Step 2: for epoch from 1 to M:

Step 3: initialize the state $s_t$ for the m parallel DNNs.

Step 4: for time slot t from 1 to T, iterate:

Step 5: select m actions $a^i_t$, $i = 1, 2, 3, \dots, m$, with an $\epsilon$-greedy strategy.

Step 6: execute each $a^i_t$ in the m DNNs, observe the obtained rewards $R^i_t$, and obtain the new state $s_{t+1}$.

Step 7: store $[s_t, a^i_t, R^i_t, s_{t+1}]$ in the experience pool D.

Step 8: randomly draw n samples $[s_j, a_j, R_j(s_j, a_j), s_{j+1}]$, $j = 1, 2, \dots, n$, from the experience pool D.

Step 9: compute the loss function Loss and update the m main neural network parameters $\theta$ by back-propagating gradients through the neural networks.

Step 10: if t mod C == 0, copy the m main neural network parameters $\theta$ to the target networks $\theta^*$.

Step 11: if t <= T, enter the next time slot and return to step 5.

Step 12: if epoch == M, end the iteration and output the m DNN network parameters.
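Tying the steps together, a compact training-loop sketch follows; the environment object env and the helper dqn_loss_from_batch are assumptions (the latter would wrap the loss sketch given earlier), so this is an outline of the flow rather than the patented implementation:

```python
import random

def train(dnns, env, pool, epochs, T, C, eps=0.1, batch=32):
    """Training phase of the distributed edge-caching algorithm (sketch).

    dnns: list of m (main_net, target_net, optimizer) triples
    env:  assumed environment with reset(), step(a) -> (s_next, reward),
          random_action() and greedy_action(net, s)
    """
    for _ in range(epochs):                        # Step 2
        s = env.reset()                            # Step 3
        for t in range(1, T + 1):                  # Step 4
            for main, target, opt in dnns:
                # Step 5: epsilon-greedy action selection
                if random.random() < eps:
                    a = env.random_action()
                else:
                    a = env.greedy_action(main, s)
                s_next, reward = env.step(a)       # Step 6
                pool.store(s, a, reward, s_next)   # Step 7
                if len(pool) >= batch:             # Steps 8-9
                    # dqn_loss_from_batch is an assumed helper wrapping
                    # the DQN loss sketch shown earlier
                    loss = dqn_loss_from_batch(main, target,
                                               pool.sample(batch))
                    opt.zero_grad()
                    loss.backward()
                    opt.step()
                if t % C == 0:                     # Step 10
                    target.load_state_dict(main.state_dict())
                s = s_next
    return [main for main, _, _ in dnns]           # Step 12: m trained DNNs
```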
And S53, generating a plurality of cache decisions through a plurality of parallel deep neural networks in a decision stage, storing the cache decisions into an action set, calculating corresponding rewards obtained by each cache decision, and taking the cache decision with the highest reward as an output action.
In the decision phase, the optimal service caching strategy needs to be obtained. First, the invention generates m cache decisions $a_i$ through the m parallel DNNs and stores them in the action set A, and calculates the reward $R(s, a_i)$ and the value $Q(s, a_i)$ obtained by executing each action $a_i$. The action $a = \arg\max_{A} Q(s_t, A)$ is then selected from the action set A and output as the edge service caching policy.
The algorithm flow of the decision phase of the distributed deep reinforcement learning based edge service caching algorithm is as follows.

Algorithm input: the m DNN network parameters $\theta$.

Algorithm output: the service caching policy a. The specific steps are:

Step 1: for i from 1 to m:

Step 2: generate action $a_i$ with the i-th DNN and add it to the action set A ($A = \{a_1, a_2, a_3, \dots, a_i\}$).

Step 3: execute $a_i$, obtain the reward $R(s, a_i)$, and calculate $Q(s, a_i)$.

Step 4: if i == m, select the action $a = \arg\max_{A} Q(s_t, A)$ from the action set A.

Step 5: output the action a as the service caching policy.
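The decision phase then reduces to one forward pass per DNN plus an argmax over the resulting action set; a sketch under the same assumptions as the training sketch:

```python
def decide(dnns, s, env):
    """Decision phase: output the caching action with the highest Q value.

    dnns: the m trained main networks; env.greedy_action and env.q_value
    are assumed helpers for proposing and scoring candidate actions.
    """
    actions = [env.greedy_action(net, s) for net in dnns]  # Steps 1-2: set A
    best = max(actions, key=lambda a: env.q_value(s, a))   # Steps 3-4: argmax
    return best                                            # Step 5: policy a
```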
Further, in step S1, the edge servers provide data analysis and processing services for the sensor devices and industrial devices; edge servers have limited computing and storage resources relative to the cloud server.
Further, in step S1, when the requested service is not cached on the nearest edge server, the service is executed on the cloud server or on another edge server that caches the service.
Further, in step S5, the service optimization caching problem is described as a markov decision process; the Markov decision process consists of three parts, namely a state space S, an action space A and a reward function R.
Further, in step S51, the actions of the deep neural networks DNN are executed in parallel, and each deep neural network unit includes two neural networks having the same structure but different parameters: one is a main neural network used to predict the Q estimate of the reinforcement learning algorithm Q-learning, holding the latest network parameters; the other is a target neural network used to predict the Q target value of the reinforcement learning algorithm Q-learning, whose parameters are a copy of the main network's parameters from some time earlier and remain unchanged for a period of time.
The invention also provides an industrial internet edge service cache decision system adopting any one of the methods, which comprises the following steps: the system comprises a mathematical modeling module and a service cache decision module;
the mathematical modeling module performs mathematical modeling on the industrial Internet system based on the task that the corresponding service can be executed only when corresponding service data is cached in the server; the cloud server of the system model caches data required by all services;
establishing a mathematical model for service access delay in the edge cloud cooperative system;
performing mathematical modeling on the energy consumption of the industrial Internet system according to the power of data transmission between the edge servers and the cloud servers and the computing power of the edge servers and the cloud servers;
establishing an optimization target for achieving the minimized service access delay and the minimized energy consumption based on the system model, the delay model and the energy consumption model;
the service cache decision module constructs an algorithm capable of realizing an optimization target based on a distributed deep reinforcement learning method.
Further, the service caching decision module further includes: the system comprises an algorithm construction unit, a training unit and a decision unit;
the algorithm construction unit combines a plurality of parallel deep neural networks DNN and a reinforcement learning algorithm Q-learning to construct a parallel deep reinforcement learning algorithm and perform service cache decision;
the training unit selects an action to execute according to the current state by a greedy strategy in a training stage to obtain a reward and a next state, and the obtained state is converted and stored into an experience pool; when the storage capacity of the experience pool D is large enough, extracting a certain number of state transitions from the experience pool to train network parameters;
the decision unit generates a plurality of cache decisions through a plurality of parallel deep neural networks in a decision stage and stores the cache decisions into an action set, calculates corresponding rewards obtained by each cache decision and takes the cache decision with the largest reward as an output action.
Further, the algorithm construction unit describes the service optimization caching problem as a Markov decision process; the Markov decision process consists of three parts, namely a state space S, an action space A and a reward function R.
Further, in the parallel deep reinforcement learning algorithm constructed by the algorithm construction unit, the action execution of the deep neural network DNN is parallel execution, and the deep neural network DNN includes two neural network structures having the same structure but different parameters; one is a main neural network used for predicting the Q estimation value of the reinforcement learning algorithm Q-learning, and has the latest network parameters; the other is a target neural network for predicting the actual value of Q of the reinforcement learning algorithm Q-learning, and the used parameters are parameters before a period of time and are kept unchanged for a period of time.
The method computes the optimal solution of the edge caching strategy mathematical model based on the combination of multiple parallel deep learning networks and a reinforcement learning algorithm, and can solve the technical problem of inaccurate caching strategy prediction in the prior art. Reinforcement learning is combined with deep learning: a machine learns and predicts user preferences and the changing trend of content popularity in the network from a large volume of historical user data, and the service caching strategy is adjusted according to the learning results.
Although much research already exists on the edge computing service caching problem, little of it applies deep reinforcement learning to edge service caching, and even less targets the industrial Internet. In addition, compared with traditional deep reinforcement learning methods, the designed distributed method, which uses multiple parallel DNNs for service caching decisions, performs better at minimizing service access delay and energy consumption.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and decorations can be made without departing from the principle of the present invention, and these modifications and decorations should also be regarded as the protection scope of the present invention.

Claims (10)

1. An industrial Internet edge service cache decision method is characterized by comprising the following steps:
s1, performing mathematical modeling on an industrial Internet system based on the fact that a task corresponding to a service can be executed only when corresponding service data are cached in a server; the cloud server of the system model caches data required by all services;
s2, establishing a mathematical model for service access time delay in the opposite side cloud coordination system;
s3, performing mathematical modeling on the energy consumption of the industrial Internet system according to the power of data transmission between the edge servers and the cloud server and the computing power of the edge servers and the cloud server;
s4, establishing an optimization target for achieving the minimized service access time delay and the minimized energy consumption based on the system model, the time delay model and the energy consumption model;
and S5, constructing an algorithm capable of realizing the optimization target based on a distributed deep reinforcement learning method.
2. The method according to claim 1, wherein the step S5 further comprises the steps of:
s51, combining a plurality of parallel deep neural networks DNN with a reinforcement learning algorithm Q-learning to construct a parallel deep reinforcement learning algorithm for service cache decision;
s52, selecting an action to execute according to the current state by a greedy strategy in a training stage to obtain a reward and a next state, and converting and storing the obtained states into an experience pool; when the storage capacity of the experience pool D is large enough, extracting a certain number of state transitions from the experience pool to train network parameters;
and S53, generating a plurality of cache decisions through a plurality of parallel deep neural networks in a decision stage, storing the cache decisions into an action set, calculating corresponding rewards obtained by each cache decision, and taking the cache decision with the highest reward as an output action.
3. The method of claim 2, wherein in the step S1, the edge server provides data analysis and processing services for the sensor device and the industrial device; the edge server has limited computing and storage resources relative to the cloud server.
4. The method according to claim 3, wherein in step S1, when the requested service is not cached on the nearest edge server, the service is executed on a cloud server or on another edge server that caches the service.
5. The method according to claim 4, wherein in step S5, the service optimization caching problem is described as a Markov decision process; the Markov decision process consists of three parts, namely a state space S, an action space A and a reward function R.
6. The method according to claim 5, characterized in that in step S51, the actions of the deep neural networks DNN are executed in parallel, and each deep neural network unit comprises two neural networks having the same structure but different parameters; one is a main neural network used to predict the Q estimate of the reinforcement learning algorithm Q-learning, holding the latest network parameters; the other is a target neural network used to predict the Q target value of the reinforcement learning algorithm Q-learning, whose parameters are a copy of the main network's parameters from some time earlier and remain unchanged for a period of time.
7. An industrial internet edge service caching decision system using the method of any one of claims 1 to 6, characterized by comprising: a mathematical modeling module and a service cache decision module;
the mathematical modeling module performs mathematical modeling on the industrial Internet system based on the task that the corresponding service can be executed only when corresponding service data is cached in the server; the cloud server of the system model caches data required by all services;
establishing a mathematical model for service access delay in the edge cloud cooperative system;
performing mathematical modeling on the energy consumption of the industrial Internet system according to the power of data transmission between the edge servers and the cloud servers and the computing power of the edge servers and the cloud servers;
establishing an optimization target for achieving the minimized service access delay and the minimized energy consumption based on the system model, the delay model and the energy consumption model;
and the service cache decision module constructs an algorithm capable of realizing the optimization goal based on a distributed deep reinforcement learning method.
8. The system of claim 7, wherein the service caching decision module further comprises: the system comprises an algorithm construction unit, a training unit and a decision unit;
the algorithm construction unit combines a plurality of parallel deep neural networks DNN and a reinforcement learning algorithm Q-learning to construct a parallel deep reinforcement learning algorithm for service cache decision;
the training unit selects an action to execute according to the current state by a greedy strategy in a training stage to obtain a reward and a next state, and the obtained state is converted and stored into an experience pool; when the storage capacity of the experience pool D is large enough, extracting a certain number of state transitions from the experience pool to train network parameters;
the decision unit generates a plurality of cache decisions through a plurality of parallel deep neural networks in a decision stage and stores the cache decisions into an action set, and calculates corresponding rewards obtained by each cache decision, and the cache decision with the largest reward is used as an output action.
9. The system of claim 8, wherein the algorithm building unit describes the service optimization caching problem as a markov decision process; the Markov decision process consists of three parts, namely a state space S, an action space A and a reward function R.
10. The system according to claim 9, wherein in the parallel deep reinforcement learning algorithm constructed by the algorithm construction unit, the actions of the deep neural networks DNN are executed in parallel, and each deep neural network unit comprises two neural networks with the same structure but different parameters; one is a main neural network used to predict the Q estimate of the reinforcement learning algorithm Q-learning, holding the latest network parameters; the other is a target neural network used to predict the Q target value of the reinforcement learning algorithm Q-learning, whose parameters are a copy of the main network's parameters from some time earlier and remain unchanged for a period of time.
CN202111556974.4A 2021-12-18 2021-12-18 Industrial Internet edge service cache decision method and system Pending CN114328291A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111556974.4A CN114328291A (en) 2021-12-18 2021-12-18 Industrial Internet edge service cache decision method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111556974.4A CN114328291A (en) 2021-12-18 2021-12-18 Industrial Internet edge service cache decision method and system

Publications (1)

Publication Number Publication Date
CN114328291A 2022-04-12

Family

ID=81053229

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111556974.4A Pending CN114328291A (en) 2021-12-18 2021-12-18 Industrial Internet edge service cache decision method and system

Country Status (1)

Country Link
CN (1) CN114328291A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114900556A (en) * 2022-05-24 2022-08-12 杭州师范大学钱江学院 Cloud interconnection method and system based on service preference learning in multi-cloud heterogeneous environment
CN114900556B (en) * 2022-05-24 2023-04-11 杭州师范大学钱江学院 Cloud interconnection method and system based on service preference learning in multi-cloud heterogeneous environment
CN115174681A (en) * 2022-06-14 2022-10-11 武汉大学 Method, equipment and storage medium for scheduling edge computing service request
CN115174681B (en) * 2022-06-14 2023-12-15 武汉大学 Method, equipment and storage medium for scheduling edge computing service request
CN115633380A (en) * 2022-11-16 2023-01-20 合肥工业大学智能制造技术研究院 Multi-edge service cache scheduling method and system considering dynamic topology
CN115633380B (en) * 2022-11-16 2023-03-17 合肥工业大学智能制造技术研究院 Multi-edge service cache scheduling method and system considering dynamic topology
CN115866678A (en) * 2023-02-20 2023-03-28 中国传媒大学 Mobile edge cache resource optimization method based on network energy consumption hotspot detection

Similar Documents

Sundararaj Optimal task assignment in mobile cloud computing by queue based ant-bee algorithm
He et al. Green resource allocation based on deep reinforcement learning in content-centric IoT
CN114328291A (en) Industrial Internet edge service cache decision method and system
Sun et al. Cooperative computation offloading for multi-access edge computing in 6G mobile networks via soft actor critic
Sun et al. Autonomous resource slicing for virtualized vehicular networks with D2D communications based on deep reinforcement learning
CN112134916A (en) Cloud edge collaborative computing migration method based on deep reinforcement learning
Peng et al. Joint optimization of service chain caching and task offloading in mobile edge computing
Zhou et al. Learning from peers: Deep transfer reinforcement learning for joint radio and cache resource allocation in 5G RAN slicing
CN113822456A (en) Service combination optimization deployment method based on deep reinforcement learning in cloud and mist mixed environment
Xie et al. Workflow scheduling in serverless edge computing for the industrial internet of things: A learning approach
Ren et al. Multi-objective optimization for task offloading based on network calculus in fog environments
KR20230007941A (en) Edge computational task offloading scheme using reinforcement learning for IIoT scenario
CN115714820A (en) Distributed micro-service scheduling optimization method
Li et al. DQN-enabled content caching and quantum ant colony-based computation offloading in MEC
Dong et al. Content caching-enhanced computation offloading in mobile edge service networks
Huang et al. Reinforcement learning for cost-effective IoT service caching at the edge
Ullah et al. Optimizing task offloading and resource allocation in edge-cloud networks: a DRL approach
Li et al. Efficient service selection approach for mobile devices in mobile cloud
CN116760722A (en) Storage auxiliary MEC task unloading system and resource scheduling method
CN113766540B (en) Low-delay network content transmission method, device, electronic equipment and medium
CN114500561B (en) Power Internet of things network resource allocation decision-making method, system, equipment and medium
Borzemski et al. Adaptive and intelligent request distribution for content delivery networks
CN116418808A (en) Combined computing unloading and resource allocation method and device for MEC
Zhang et al. Deep Reinforcement Learning Based Joint Caching and Resources Allocation for Cooperative MEC
Liu et al. Optimized min-min dynamic task scheduling algorithm in grid computing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination