CN116346837A - Internet of things edge collaborative caching method based on deep reinforcement learning - Google Patents
Internet of things edge collaborative caching method based on deep reinforcement learning
- Publication number
- CN116346837A CN116346837A CN202310296228.9A CN202310296228A CN116346837A CN 116346837 A CN116346837 A CN 116346837A CN 202310296228 A CN202310296228 A CN 202310296228A CN 116346837 A CN116346837 A CN 116346837A
- Authority
- CN
- China
- Prior art keywords
- edge server
- parameter
- global
- server
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/12—Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
Abstract
The invention provides an Internet of things edge collaborative caching method based on deep reinforcement learning, which comprises the following steps: (1) the edge server collects video cache information of terminal-layer user equipment and constructs a data set; (2) each distributed edge server trains on the data set through a model training module; (3) the central server receives the local gradient parameters of each distributed edge server and performs parameter aggregation to obtain global gradient parameters; (4) the central server feeds the fitted global gradient parameters into a parameter training module, trains the neural network and outputs updated global model parameters; (5) steps (1)-(4) are repeated to obtain a prediction model of online video requests, and video content requests are predicted to obtain a prediction list of each user's online video requests; (6) according to the obtained prediction lists of users' online video requests, the distributed edge servers cache content collaboratively until each distributed edge server reaches its upper storage limit.
Description
Technical Field
The invention belongs to the technical field of edge caching, and particularly relates to an Internet of things edge collaborative caching method based on deep reinforcement learning.
Background
With the rapid growth of network users, video quality has been upgraded from traditional 1080P high-definition content to 4K and 8K levels, and complex human-machine interaction modes such as VR and AR have emerged, placing severe pressure on the backbone network. Edge caching can effectively reduce latency and backhaul load by placing the most popular content on servers closer to the requesting users. However, edge caching is challenging because the storage space of edge servers is limited and content popularity varies over time and space.
Reinforcement learning can adapt to environmental changes without any prior knowledge of the environment's dynamics. However, conventional reinforcement learning algorithms are limited to dynamic environments with fully observable, low-dimensional state spaces. The state space of a real edge-caching environment is typically high-dimensional, and manually extracting all useful features from the environment is very difficult. Deep reinforcement learning can automatically determine an optimal strategy from the raw high-dimensional environment state space, effectively overcoming the curse of dimensionality and providing an effective solution for real edge-caching environments.
Most current edge caching systems based on deep reinforcement learning use centralized content caching, in which a centralized controller collects the local parameters of all servers and then generates content caching decisions for them. The computational complexity of this centralized approach grows exponentially with the number of servers; accordingly, several studies have confirmed the effectiveness of distributed content caching schemes.
Disclosure of Invention
The invention aims to: the invention provides an Internet of things edge collaborative caching method based on deep reinforcement learning, which applies deep reinforcement learning to the content-popularity perception and caching decisions of edge caching. Fully utilizing the adaptive capability of deep reinforcement learning, the method can sensitively perceive the network state, user requests and content popularity and respond in time. The caching strategy is constantly optimized through learning, which improves the cache hit rate of the edge cache, efficiently utilizes the storage and computing resources of the edge servers, and reduces latency and backhaul load.
The technical scheme is as follows: in order to solve the technical problems, the invention provides an Internet of things edge collaborative caching method based on deep reinforcement learning, which comprises the following steps:
step one: the edge server collects video cache information of terminal-layer user equipment in its area to construct a dynamic log file data set, whose elements comprise video user ID, timestamp and video content ID;
step two: each distributed edge server trains on its data set through a model training module; the input of the model training module's neural network is the dynamic log file data set labeled with video user ID, timestamp and video content ID; after training, the model training module outputs a local gradient parameter g_m^τ containing the cache information of the edge server and forwards it synchronously to the central server and the adjacent edge servers; the objective of the neural network in the distributed edge server is to obtain the local gradient parameter g_m^τ, which reflects the cache state of the edge server;
step three: after receiving the local gradient parameters g_m^τ sent by each distributed edge server, the central server aggregates them through a parameter aggregation module to obtain the global gradient parameter G_τ; the distributed edge servers share each other's local gradient parameters g_m^τ through a collaborative sharing module, realizing cache information interaction among the edge servers;
step four: the central server feeds the fitted global gradient parameter G_τ into a parameter training module; after training of the neural network, the updated global model parameter ω_τ is output; the objective of the central server's neural network is to obtain the global model parameter ω_τ, which further optimizes the neural network in each edge server; the central server sends the global model parameter ω_τ to each distributed edge server to start a new round of updates of the local gradient parameters g_m^τ, the global gradient parameter G_τ and the global model parameter ω_τ;
step five: steps one to four are repeated until the prediction model converges, yielding a prediction model of video users' online video requests; video content requests are then predicted to obtain a prediction list of each user's online video requests; the trained prediction model is updated automatically;
step six: according to the prediction lists of users' online video requests obtained from the prediction model, the distributed edge servers cache content collaboratively until each distributed edge server reaches its upper storage limit.
Further, the specific method of step one is as follows: each distributed edge server m collects the video cache information i of the terminal-layer user equipment d within its coverage area and establishes a dynamic log file data set X_m from this information; the data in the data set are label-classified into three fields: video user ID, timestamp and video content ID.
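As a minimal sketch of step one, each edge server can be pictured as turning raw request logs into the three-field labeled data set X_m. The class and function names below are illustrative assumptions, not from the patent.

```python
# Hypothetical sketch of step one: an edge server classifies raw request
# records into the three labeled fields of the dynamic log file data set.
from dataclasses import dataclass


@dataclass
class CacheLogEntry:
    user_id: str      # video user ID
    timestamp: float  # request timestamp
    content_id: str   # video content ID


def build_dataset(raw_requests):
    """Label-classify raw (user, timestamp, content) tuples into X_m."""
    return [CacheLogEntry(u, ts, c) for (u, ts, c) in raw_requests]


dataset = build_dataset([("u1", 1700000000.0, "v42"),
                         ("u2", 1700000005.0, "v7")])
```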
Further, the specific method of step two is as follows:
step 2.1, data set division: the label-classified dynamic log file data set X_m is divided into mini-batches of minimum batch size β, where β denotes the batch size into which the training data set is divided and M denotes the number of edge servers;
step 2.2, generating the output matrix: for the DNN neural network, the distributed edge server generates the output matrix
x_m^l = α_m(W_l x_m^{l-1} + v_l),
where x_m^{l-1} is the input matrix of the l-th layer of the neural network in the distributed edge server, and α_m is the rectified linear unit activation function in the edge server, which converts the input of each neural network layer into a nonlinear form; the global model parameters ω covering all DNN layers are defined as ω = (W, v), with W = [W_1, …, W_l, …, W_L] and v = [v_1, …, v_l, …, v_L], where W_l is a global weight matrix, v_l is a global bias vector, and L denotes the number of layers of the neural network;
step 2.3, calculating the predictive loss function: the output generated at the output layer is used to find the predictive loss p_m(ω_τ) of edge server m at minimum-batch iteration τ, where τ denotes one iteration of training on a mini-batch of samples, t denotes the time at which iteration τ completes, ω_τ denotes the global model parameters at iteration τ, and the inputs and targets are elements of the distributed edge server's input and output matrices, respectively;
step 2.4, calculating the local gradient parameter, by calculation:
g_m^τ = ∇_ω p_m(ω_τ).
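The layer recursion of step 2.2 can be sketched in plain Python. This is an illustrative stand-in only: α_m is taken as ReLU per the text, and the list-based matrix representation is an assumption for readability.

```python
# Sketch of x_m^l = relu(W_l @ x_{l-1} + v_l), the per-layer forward pass
# described in step 2.2, in dependency-free Python.
def relu(z):
    return [max(0.0, v) for v in z]


def layer(W, v, x):
    """One DNN layer: ReLU(W @ x + v), with W a list of rows, v a bias vector."""
    return relu([sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b
                 for row, b in zip(W, v)])


def forward(weights, biases, x):
    """Apply all L layers in sequence, as in the recursion of step 2.2."""
    for W, v in zip(weights, biases):
        x = layer(W, v, x)
    return x


# Single-layer example: W_1 = [[1, -1]], v_1 = [0.5], input x = [2, 1].
out = forward([[[1.0, -1.0]]], [[0.5]], [2.0, 1.0])
```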
Further, the global gradient parameter G_τ in step three is calculated as follows:
G_τ = (1/M) Σ_{m=1}^{M} g_m^τ.
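The aggregation of step three can be sketched as an element-wise mean over the M local gradients. Taking the plain average is an assumption consistent with the text ("aggregating" the local gradient parameters); the patent's own formula image is not preserved in this extraction.

```python
# Sketch of the parameter-aggregation module: G_tau as the element-wise
# mean of the local gradients g_m^tau from the M edge servers (assumed form).
def aggregate(local_grads):
    """local_grads: list of M gradient vectors of equal length."""
    M = len(local_grads)
    dim = len(local_grads[0])
    return [sum(g[i] for g in local_grads) / M for i in range(dim)]


# Two edge servers reporting 2-dimensional local gradients.
G = aggregate([[1.0, 2.0], [3.0, 4.0]])
```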
further, the specific steps of the fourth step are as follows:
step 4.1, calculating a learning step length lambda, a neural network deployed by a parameter training module based on a central server, and a global gradient parameter, and obtaining eta τ And delta τ Respectively regarded as G τ Andfor estimating the mean to predict the variance, η at the current sample iteration τ τ+1 And delta τ+1 The updated formula of (c) is as follows:
wherein,,and->Respectively represent eta τ And delta τ Exponential decay step at τ for updating global model parameter ω τ Adding a learning step size lambda to determine to update the global model omega at each iteration process tau τ The update formula of the learning step size lambda is as follows:
step 4.2, calculating the global model parameter ω of the next iteration process τ+1 τ+1 :
Wherein omega τ+1 The dataset for the edge server to learn the next τ+1, ε representing a constant;
step 4.3, predicting the global model parameter ω τ+1 Sending to each distributed edge server to perform a new round of local gradient parametersGlobal gradient parameter G τ And global model parameters omega τ Is updated according to the update of the update program.
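The moment estimates and bias-corrected step size of step four read like an Adam-style update; the sketch below implements that reading. The β_1/β_2 defaults and the exact update form are assumptions, since the patent's formula images were lost in extraction.

```python
# Hedged sketch of step four as an Adam-style global update:
# eta tracks the mean of G_tau, delta tracks the mean of G_tau^2,
# lambda is the bias-corrected learning step.
import math


def global_update(w, G, eta, delta, tau, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One iteration of the central server's parameter training (assumed form)."""
    eta = [b1 * e + (1 - b1) * g for e, g in zip(eta, G)]
    delta = [b2 * d + (1 - b2) * g * g for d, g in zip(delta, G)]
    step = lr * math.sqrt(1 - b2 ** tau) / (1 - b1 ** tau)  # learning step lambda
    w = [wi - step * e / (math.sqrt(d) + eps)
         for wi, e, d in zip(w, eta, delta)]
    return w, eta, delta


w, eta, delta = global_update([0.0], [1.0], [0.0], [0.0], tau=1)
```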
Further, the specific steps of step six are as follows:
step 6.1, each distributed edge server exchanges the prediction lists of online video requests with the terminal-layer users within its coverage area;
step 6.2, the number of requests for each video is counted according to the frequency with which different online videos appear in the prediction lists of different users;
step 6.3, the distributed edge servers cache content collaboratively according to the video request counts: if the current server has already cached a video, the adjacent servers do not cache it again; this continues until each edge server reaches its upper storage limit.
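Steps 6.1 through 6.3 can be sketched as follows: count predicted requests across users' lists, then fill servers greedily while skipping any video a neighbor already caches. The greedy placement order is an illustrative assumption.

```python
# Sketch of step six: rank videos by predicted request count, then place
# each video on at most one edge server (no duplicate caching among neighbors).
from collections import Counter


def collaborative_cache(prediction_lists, num_servers, capacity):
    counts = Counter(v for lst in prediction_lists for v in lst)
    caches = [[] for _ in range(num_servers)]
    already_cached = set()
    for video, _ in counts.most_common():
        if video in already_cached:
            continue  # a cooperating server already holds it -- do not duplicate
        for cache in caches:
            if len(cache) < capacity:   # stop when storage upper limit reached
                cache.append(video)
                already_cached.add(video)
                break
    return caches


caches = collaborative_cache([["a", "b"], ["a", "c"]], num_servers=2, capacity=1)
```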
The beneficial effects are as follows: compared with the prior art, the technical scheme of the invention has the following beneficial technical effects:
1. Training models are deployed at the edge servers and the central server respectively; the cached data undergo centralized model training and distributed cache operation based on a deep reinforcement learning algorithm, which reduces the traffic on the backhaul link and optimizes the caching strategy in real time through edge-cloud cooperation.
2. The invention realizes collaborative edge caching, allowing the distributed edge servers to cooperate with each other, which further improves the cache hit rate of the system. By transmitting parameters rather than raw data, user privacy is protected and communication overhead is reduced while information is shared.
3. Self-learning adjustment of the caching strategy is realized: the adaptive capability of deep reinforcement learning enables real-time analysis of data requests and design of corresponding caching strategies, and the learning capability of the deep reinforcement learning model improves as cached data accumulate, further raising the cache hit rate.
Drawings
FIG. 1 is a general framework diagram of an edge collaborative caching system and an operation method of the Internet of things based on deep reinforcement learning;
FIG. 2 is a diagram of a central server architecture and workflow in accordance with the present invention;
FIG. 3 is a diagram of an edge server architecture and workflow in accordance with the present invention.
The specific embodiment is as follows:
Embodiments of the present invention are described in detail below, with examples illustrated in the accompanying drawings. The embodiments described below with reference to the drawings are exemplary, intended only to explain the present invention, and are not to be construed as limiting it.
Fig. 1 is a general framework diagram of the deep reinforcement learning-based Internet of things edge collaborative caching system and its operation method, which shows the hierarchical relationship among the central server, the edge servers and the terminals in the invention. On this basis, the invention provides an Internet of things edge collaborative caching method based on deep reinforcement learning, which comprises the following steps:
step one: the edge server collects video cache information of terminal-layer user equipment in its area to construct a dynamic log file data set, whose elements comprise video user ID, timestamp and video content ID;
step two: each distributed edge server trains on its data set through a model training module; the input of the model training module's neural network is the dynamic log file data set labeled with video user ID, timestamp and video content ID; after training, the model training module outputs a local gradient parameter g_m^τ containing the cache information of the edge server and forwards it synchronously to the central server and the adjacent edge servers; the objective of the neural network in the distributed edge server is to obtain the local gradient parameter g_m^τ, which reflects the cache state of the edge server;
step three: after receiving the local gradient parameters g_m^τ sent by each distributed edge server, the central server aggregates them through a parameter aggregation module to obtain the global gradient parameter G_τ; the distributed edge servers share each other's local gradient parameters g_m^τ through a collaborative sharing module, realizing cache information interaction among the edge servers;
step four: the central server feeds the fitted global gradient parameter G_τ into a parameter training module; after training of the neural network, the updated global model parameter ω_τ is output; the objective of the central server's neural network is to obtain the global model parameter ω_τ, which further optimizes the neural network in each edge server; the central server sends the global model parameter ω_τ to each distributed edge server to start a new round of updates of the local gradient parameters g_m^τ, the global gradient parameter G_τ and the global model parameter ω_τ;
step five: steps one to four are repeated until the prediction model converges, yielding a prediction model of video users' online video requests; video content requests are then predicted to obtain a prediction list of each user's online video requests; the trained prediction model is updated automatically;
step six: according to the prediction lists of users' online video requests obtained from the prediction model, the distributed edge servers cache content collaboratively until each distributed edge server reaches its upper storage limit.
Further, the specific method of step one is as follows: each distributed edge server m collects the video cache information i of the terminal-layer user equipment d within its coverage area; each distributed edge server establishes a dynamic log file data set X_m from the video cache information, and the data in the data set are label-classified into three fields: video user ID, timestamp and video content ID.
Further, the specific method of step two is as follows:
step 2.1, data set division: the label-classified dynamic log file data set X_m is divided into mini-batches of minimum batch size β, where β denotes the batch size into which the training data set is divided and M denotes the number of edge servers;
step 2.2, generating the output matrix: for the DNN neural network, the distributed edge server generates the output matrix
x_m^l = α_m(W_l x_m^{l-1} + v_l),
where x_m^{l-1} is the input matrix of the l-th layer of the neural network in the distributed edge server, and α_m is the rectified linear unit activation function in the edge server, which converts the input of each neural network layer into a nonlinear form; the global model parameters ω covering all DNN layers are defined as ω = (W, v), with W = [W_1, …, W_l, …, W_L] and v = [v_1, …, v_l, …, v_L], where W_l is a global weight matrix, v_l is a global bias vector, and L denotes the number of layers of the neural network;
step 2.3, calculating the predictive loss function: the output generated at the output layer is used to find the predictive loss p_m(ω_τ) of edge server m at minimum-batch iteration τ, where τ denotes one iteration of training on a mini-batch of samples, t denotes the time at which iteration τ completes, ω_τ denotes the global model parameters at iteration τ, and the inputs and targets are elements of the distributed edge server's input and output matrices, respectively;
step 2.4, calculating the local gradient parameter, by calculation:
g_m^τ = ∇_ω p_m(ω_τ).
Further, the global gradient parameter G_τ in step three is calculated as follows:
G_τ = (1/M) Σ_{m=1}^{M} g_m^τ.
Further, the specific steps of step four are as follows:
step 4.1, calculating the learning step size λ: based on the neural network deployed in the central server's parameter training module and the global gradient parameter, η_τ and δ_τ are regarded as estimates of the mean of G_τ and of G_τ², respectively, used for estimating the mean and predicting the variance; at the current sample iteration τ, η_{τ+1} and δ_{τ+1} are updated as follows:
η_{τ+1} = β_1 η_τ + (1 − β_1) G_τ,
δ_{τ+1} = β_2 δ_τ + (1 − β_2) G_τ ⊙ G_τ,
where β_1 and β_2 denote the exponential decay steps of η_τ and δ_τ at iteration τ; to update the global model parameter ω_τ, a learning step size λ is introduced to determine how the global model ω_τ is updated at each iteration τ; λ is updated as
λ = λ_0 · sqrt(1 − β_2^τ) / (1 − β_1^τ);
step 4.2, calculating the global model parameter ω_{τ+1} of the next iteration τ+1:
ω_{τ+1} = ω_τ − λ · η_{τ+1} / (sqrt(δ_{τ+1}) + ε),
where ω_{τ+1} is the global model parameter with which the edge servers learn in the next iteration τ+1, and ε denotes a constant;
step 4.3, sending the predicted global model parameter ω_{τ+1} to each distributed edge server to perform a new round of updates of the local gradient parameters g_m^τ, the global gradient parameter G_τ and the global model parameter ω_τ.
Further, the specific steps of step six are as follows:
step 6.1, each distributed edge server exchanges the prediction lists of online video requests with the terminal-layer users within its coverage area;
step 6.2, the number of requests for each video is counted according to the frequency with which different online videos appear in the prediction lists of different users;
step 6.3, the distributed edge servers cache content collaboratively according to the video request counts: if the current server has already cached a video, the adjacent servers do not cache it again; this continues until each edge server reaches its upper storage limit.
As shown in fig. 2, the central server includes a parameter aggregation module, a parameter training module and a cache state global view module. The parameter aggregation module aggregates the local gradient parameters g_m^τ sent by each distributed edge server to obtain the global gradient parameter G_τ; based on G_τ, the parameter training module computes the global model parameter ω_τ of the next iteration and the central server sends it to each distributed edge server, each of which adjusts its subsequent caching strategy according to ω_τ; the cache state global view module provides an overview of the cache status of all edge servers.
As shown in fig. 3, the edge server comprises a model training module, a collaborative sharing module and a rewards module. Each distributed edge server trains on its data set through the model training module; the input of the model training module's neural network is the dynamic log file data set labeled with video user ID, timestamp and video content ID; after training, the model training module outputs a local gradient parameter g_m^τ containing the cache information of the edge server and forwards it synchronously to the central server and the adjacent edge servers; the objective of the neural network in the distributed edge server is to obtain the local gradient parameter g_m^τ, which reflects the cache state of the edge server. The collaborative sharing module receives the cache information of other edge servers and, combining the local data requests with the global model parameter ω_τ issued by the central server, helps the model training module make better caching decisions. The rewards module calculates the reward value of each caching operation; because data cached in the local server and in adjacent servers both contribute to the system hit rate, the reward is set as a weighted sum of the two so as to promote cooperation between the edge servers.
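The weighted-sum reward described for the rewards module can be sketched as below. The weight w and its default value are assumptions for illustration; the patent does not specify them.

```python
# Sketch of the rewards module: reward as a weighted sum of local cache hits
# and neighbor cache hits, encouraging cooperation between edge servers.
# The weight w is an assumed tunable parameter, not taken from the source.
def reward(local_hits, neighbor_hits, w=0.7):
    return w * local_hits + (1 - w) * neighbor_hits
```

A larger w favors hits served from the server's own cache; lowering w rewards relying on neighbors, which trades local storage pressure against inter-server transfer.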
Claims (6)
1. The method for collaborative caching of the edges of the Internet of things based on deep reinforcement learning is characterized by comprising the following steps of:
step one: the edge server collects video cache information of terminal layer user equipment in the area to construct a dynamic log file data set, wherein data set elements comprise video user ID, time stamp and video content ID;
step two: each distributed edge server trains on its data set through a model training module; the input of the model training module's neural network is the dynamic log file data set labeled with video user ID, timestamp and video content ID; after training, the model training module outputs a local gradient parameter g_m^τ containing the cache information of the edge server and forwards it synchronously to the central server and the adjacent edge servers; the objective of the neural network in the distributed edge server is to obtain the local gradient parameter g_m^τ, which reflects the cache state of the edge server;
step three: after receiving the local gradient parameters g_m^τ sent by each distributed edge server, the central server aggregates them through a parameter aggregation module to obtain the global gradient parameter G_τ; the distributed edge servers share each other's local gradient parameters g_m^τ through a collaborative sharing module, realizing cache information interaction among the edge servers;
step four: the central server feeds the fitted global gradient parameter G_τ into a parameter training module; after training of the neural network, the updated global model parameter ω_τ is output; the objective of the central server's neural network is to obtain the global model parameter ω_τ, which further optimizes the neural network in each edge server; the central server sends the global model parameter ω_τ to each distributed edge server to start a new round of updates of the local gradient parameters g_m^τ, the global gradient parameter G_τ and the global model parameter ω_τ;
step five: steps one to four are repeated until the prediction model converges, yielding a prediction model of video users' online video requests; video content requests are then predicted to obtain a prediction list of each user's online video requests; the trained prediction model is updated automatically;
step six: according to the prediction lists of users' online video requests obtained from the prediction model, the distributed edge servers cache content collaboratively until each distributed edge server reaches its upper storage limit.
2. The method for collaborative caching of the edge of the Internet of things based on deep reinforcement learning according to claim 1, wherein the specific method in step one is as follows: each distributed edge server m collects the video cache information i of the terminal-layer user equipment d within its coverage area and establishes a dynamic log file data set X_m from the video cache information; the data in the data set are label-classified into three fields: video user ID, timestamp and video content ID.
3. The method for collaborative caching at the edge of the internet of things based on deep reinforcement learning according to claim 1, wherein the specific method of the second step is as follows:
step 2.1, dividing the data set: the label-classified dynamic log file data set X_m is divided into mini-batches of size β, where β represents the batch size into which the training data set is divided, and M represents the number of edge servers;
step 2.2, generating an output matrix: for the DNN neural network, the distributed edge server generates the output matrix
Y_m^(l) = α_m(W_l X_m^(l) + v_l)
wherein X_m^(l) is the input matrix of the l-th layer of the neural network in the distributed edge server, and α_m is the rectified linear unit activation function in the edge server, used to convert the input of each neural network layer into a nonlinear form; the global model parameters ω covering all DNN layers are defined as ω = (W, v), with W = [W_1, ..., W_l, ..., W_L] and v = [v_1, ..., v_l, ..., v_L], where W_l is a global weight matrix, v_l is a global bias vector, and L represents the number of layers of the neural network;
step 2.3, calculating the predictive loss function: the output generated at the output layer is used to find the predictive loss p_m(ω_τ) of edge server m at mini-batch iteration τ:
p_m(ω_τ) = (1/β) Σ f(ω_τ; x̂_m, ŷ_m)
wherein τ represents one iteration of training over a mini-batch of samples; t represents the time at which iteration τ completes; ω_τ represents the global model parameters at iteration τ; x̂_m is an element of the distributed edge server input matrix X_m, and ŷ_m is an element of the distributed edge server output matrix Y_m;
step 2.4, calculating the local gradient parameter by:
g_m(ω_τ) = ∇p_m(ω_τ)
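Steps 2.1 through 2.4 can be sketched in miniature as below. This is a hedged illustration, not the patent's exact mathematics: it uses a single scalar "layer" in place of the full matrix DNN, a squared-error stand-in for the predictive loss f, and a finite-difference local gradient.

```python
# Hedged sketch of steps 2.1-2.4: mini-batch split, a one-unit ReLU
# forward pass, a squared-error stand-in for p_m(omega_tau), and a
# numerical local gradient. All concrete choices here are assumptions.

def split_minibatches(X_m, beta):
    """Step 2.1: divide the data set X_m into mini-batches of size beta."""
    return [X_m[i:i + beta] for i in range(0, len(X_m), beta)]

def relu(z):                      # alpha_m: rectified linear unit
    return max(0.0, z)

def forward(w, v, x):
    """Step 2.2: single-unit output y = alpha_m(w * x + v)."""
    return relu(w * x + v)

def loss(omega, batch):
    """Step 2.3: mean prediction loss over one mini-batch."""
    w, v = omega
    return sum((forward(w, v, x) - y) ** 2 for x, y in batch) / len(batch)

def local_gradient(omega, batch, h=1e-6):
    """Step 2.4: numerical gradient of the loss w.r.t. omega = (w, v)."""
    w, v = omega
    gw = (loss((w + h, v), batch) - loss((w - h, v), batch)) / (2 * h)
    gv = (loss((w, v + h), batch) - loss((w, v - h), batch)) / (2 * h)
    return gw, gv

batch = [(1.0, 2.0), (2.0, 4.0)]          # (input, target) pairs
print(local_gradient((1.0, 0.0), batch))  # approximately (-5.0, -3.0)
```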
5. The method for collaborative caching at the edge of the internet of things based on deep reinforcement learning according to claim 4, wherein the specific steps of the fourth step are as follows:
step 4.1, calculating the learning step size λ: based on the neural network deployed by the parameter training module of the central server and the global gradient parameter G_τ, η_τ and δ_τ are regarded as estimates of G_τ and G_τ² respectively, used to estimate the mean and predict the variance; at the current sample iteration τ, the update formulas of η_{τ+1} and δ_{τ+1} are as follows:
η_{τ+1} = ρ_1 η_τ + (1 − ρ_1) G_τ
δ_{τ+1} = ρ_2 δ_τ + (1 − ρ_2) G_τ²
wherein ρ_1 and ρ_2 represent the exponential decay rates of η_τ and δ_τ at iteration τ; to update the global model parameter ω_τ, a learning step size λ is introduced to determine by how much the global model ω_τ is updated at each iteration τ; the update formula of the learning step size λ is as follows:
λ_τ = λ · √(1 − ρ_2^τ) / (1 − ρ_1^τ)
step 4.2, calculating the global model parameter ω_{τ+1} of the next iteration τ+1:
ω_{τ+1} = ω_τ − λ_τ · η_{τ+1} / (√δ_{τ+1} + ε)
wherein ω_{τ+1} is the global model parameter with which the edge servers learn the data set of the next iteration τ+1, and ε represents a constant;
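The mean/variance estimates and step-size correction of step 4 follow the shape of an Adam-style optimizer, which can be sketched as below. The decay rates ρ1/ρ2, the base step λ, and ε are standard Adam defaults assumed here for illustration, not values quoted from the patent.

```python
# Hedged sketch of claim 5: one Adam-style update of the global model
# parameter omega from the aggregated global gradient G_tau.
# rho1, rho2, lam, eps are assumed defaults, not the patent's values.
import math

def adam_step(omega, eta, delta, G, tau, lam=0.001,
              rho1=0.9, rho2=0.999, eps=1e-8):
    """One iteration tau -> tau+1 of the global parameter update."""
    eta = rho1 * eta + (1 - rho1) * G            # mean estimate of G_tau
    delta = rho2 * delta + (1 - rho2) * G * G    # variance estimate of G_tau^2
    lam_tau = lam * math.sqrt(1 - rho2 ** tau) / (1 - rho1 ** tau)  # step 4.1
    omega = omega - lam_tau * eta / (math.sqrt(delta) + eps)        # step 4.2
    return omega, eta, delta

omega, eta, delta = 1.0, 0.0, 0.0
for tau in range(1, 4):          # three iterations with constant gradient 0.5
    omega, eta, delta = adam_step(omega, eta, delta, 0.5, tau)
print(omega < 1.0)  # the parameter moves against the positive gradient
```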
6. The method for collaborative caching at the edge of the internet of things based on deep reinforcement learning according to claim 1, wherein the specific steps of the sixth step are as follows:
step 6.1, each distributed edge server obtains, through interaction, the prediction lists of online video requests of the terminal-layer users within its coverage area;
step 6.2, the number of requests for each video is counted according to the frequency with which different online videos appear in the prediction lists of different users;
and step 6.3, the distributed edge servers perform collaborative caching according to the video request counts: if the current server has already cached a video, the adjacent servers do not cache that video again, until each edge server reaches its storage upper limit.
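The counting and non-duplicating placement of steps 6.2 and 6.3 can be sketched as below. This is an illustrative assumption of one greedy placement policy consistent with the claim; server names, capacities, and the flat neighbour topology are hypothetical.

```python
# Hedged sketch of claim 6: count predicted requests across user
# prediction lists (step 6.2), then fill each edge server's cache in
# popularity order without duplicating a video a neighbour already
# holds (step 6.3). Topology and capacity are illustrative assumptions.
from collections import Counter

def collaborative_cache(prediction_lists, servers, capacity):
    """Place the most-requested videos without cross-server duplication."""
    counts = Counter(v for plist in prediction_lists for v in plist)
    ranked = [v for v, _ in counts.most_common()]
    cached = {s: [] for s in servers}
    placed = set()                      # videos already cached by a neighbour
    for video in ranked:
        if video in placed:
            continue                    # do not cache the same video twice
        for s in servers:
            if len(cached[s]) < capacity:   # stop at the storage upper limit
                cached[s].append(video)
                placed.add(video)
                break
    return cached

plists = [["v1", "v2"], ["v1", "v3"], ["v1", "v2"]]
print(collaborative_cache(plists, ["m1", "m2"], capacity=1))
# -> {'m1': ['v1'], 'm2': ['v2']}
```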
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310296228.9A CN116346837A (en) | 2023-03-24 | 2023-03-24 | Internet of things edge collaborative caching method based on deep reinforcement learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116346837A true CN116346837A (en) | 2023-06-27 |
Family
ID=86892560
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310296228.9A Pending CN116346837A (en) | 2023-03-24 | 2023-03-24 | Internet of things edge collaborative caching method based on deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116346837A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116915781A (en) * | 2023-09-14 | 2023-10-20 | 南京邮电大学 | Edge collaborative caching system and method based on blockchain |
CN116915781B (en) * | 2023-09-14 | 2023-12-12 | 南京邮电大学 | Edge collaborative caching system and method based on blockchain |
CN117010485A (en) * | 2023-10-08 | 2023-11-07 | 之江实验室 | Distributed model training system and gradient protocol method in edge scene |
CN117010485B (en) * | 2023-10-08 | 2024-01-26 | 之江实验室 | Distributed model training system and gradient protocol method in edge scene |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Elgendy et al. | Joint computation offloading and task caching for multi-user and multi-task MEC systems: reinforcement learning-based algorithms | |
Ale et al. | Delay-aware and energy-efficient computation offloading in mobile-edge computing using deep reinforcement learning | |
Zhang et al. | Digital twin empowered content caching in social-aware vehicular edge networks | |
Jiang et al. | User preference learning-based edge caching for fog radio access network | |
Sun et al. | Cooperative computation offloading for multi-access edge computing in 6G mobile networks via soft actor critic | |
CN116346837A (en) | Internet of things edge collaborative caching method based on deep reinforcement learning | |
Zhang et al. | Deep learning for wireless coded caching with unknown and time-variant content popularity | |
CN107909108A (en) | Edge cache system and method based on content popularit prediction | |
CN113191484A (en) | Federal learning client intelligent selection method and system based on deep reinforcement learning | |
CN114143891A (en) | FDQL-based multi-dimensional resource collaborative optimization method in mobile edge network | |
Zhang et al. | Federated learning with adaptive communication compression under dynamic bandwidth and unreliable networks | |
Fan et al. | PA-cache: Evolving learning-based popularity-aware content caching in edge networks | |
CN113315978B (en) | Collaborative online video edge caching method based on federal learning | |
Tang et al. | Collective deep reinforcement learning for intelligence sharing in the internet of intelligence-empowered edge computing | |
Chen et al. | An artificial intelligence perspective on mobile edge computing | |
Zhou et al. | Edge computation offloading with content caching in 6G-enabled IoV | |
Fan et al. | DNN deployment, task offloading, and resource allocation for joint task inference in IIoT | |
CN114154643A (en) | Federal distillation-based federal learning model training method, system and medium | |
Geng et al. | Bearing fault diagnosis based on improved federated learning algorithm | |
Tang et al. | Representation and reinforcement learning for task scheduling in edge computing | |
CN116471286A (en) | Internet of things data sharing method based on block chain and federal learning | |
Chua et al. | Resource allocation for mobile metaverse with the Internet of Vehicles over 6G wireless communications: A deep reinforcement learning approach | |
Gali et al. | A Distributed Deep Meta Learning based Task Offloading Framework for Smart City Internet of Things with Edge-Cloud Computing. | |
Wan et al. | Deep Reinforcement Learning‐Based Collaborative Video Caching and Transcoding in Clustered and Intelligent Edge B5G Networks | |
Cui et al. | Multiagent reinforcement learning-based cooperative multitype task offloading strategy for internet of vehicles in B5G/6G network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||