CN112752308A - Mobile prediction wireless edge caching method based on deep reinforcement learning - Google Patents

Mobile prediction wireless edge caching method based on deep reinforcement learning

Info

Publication number
CN112752308A
CN112752308A (application CN202011620501.1A)
Authority
CN
China
Prior art keywords
user
cache
service node
neural network
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011620501.1A
Other languages
Chinese (zh)
Other versions
CN112752308B (en)
Inventor
吴长汶
辛基梁
郑建武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Yueren Health Technology Research And Development Co ltd
Original Assignee
Xiamen Yueren Health Technology Research And Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Yueren Health Technology Research And Development Co ltd filed Critical Xiamen Yueren Health Technology Research And Development Co ltd
Priority to CN202011620501.1A priority Critical patent/CN112752308B/en
Publication of CN112752308A publication Critical patent/CN112752308A/en
Application granted granted Critical
Publication of CN112752308B publication Critical patent/CN112752308B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W28/00 Network traffic management; Network resource management
    • H04W28/02 Traffic management, e.g. flow control or congestion control
    • H04W28/10 Flow control between communication endpoints
    • H04W28/14 Flow control between communication endpoints using intermediate storage
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/147 Network analysis or design for predicting network behaviour

Abstract

The invention relates to a mobile prediction wireless edge caching method based on deep reinforcement learning, which comprises the following steps: constructing a wireless intelligent cache network model, which comprises a user set, a service node set, a user request content set, a cache content set, a source content library, user historical track vectors and user classification groups; constructing a long short-term memory network model, taking a user's historical track vector as input to predict the user's position in the next time slot, and classifying users accordingly to obtain the user classification groups; establishing a replacement cache strategy, acquiring the predicted user set of each service node according to the user classification groups, and replacing the cache content of the current service node; and constructing a neural network combining Q learning and DQN reinforcement learning, training the neural network to obtain a trained dynamic cache replacement model, and using the dynamic cache replacement model in the cache replacement strategy.

Description

Mobile prediction wireless edge caching method based on deep reinforcement learning
Technical Field
The invention relates to a mobile prediction wireless edge caching method based on deep reinforcement learning, and belongs to the technical field of wireless communication and computers.
Background
With the exponential growth of mobile wireless communication and data demand, and the continuous improvement of device storage and computing capabilities, real-time multimedia services are gradually becoming a major business in 5G communication networks, and human life and work are gradually migrating to the mobile internet as a whole, pushing various network functions to the edge of the network, such as edge computing and edge caching. By pre-storing popular content requested by users, edge caching aims to reduce traffic load and duplicate transmissions in the backhaul network, thereby significantly reducing latency; accurately predicting users' future needs is therefore critical for edge cache replacement. To capture content popularity and the dynamics of the time-varying wireless environment, policy control frameworks have been introduced into the field of wireless caching. Deep reinforcement learning, which combines deep neural networks with Q learning, has shown excellent performance in solving complex control problems and has attracted increasing attention in research on wireless edge caching.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a mobile prediction wireless edge caching method based on deep reinforcement learning. The method predicts the positions of mobile users with a long short-term memory (LSTM) network, overcoming the influence of user mobility on the cache hit rate, and uses a neural network framework combining Q learning and DQN reinforcement learning to carry out the cache replacement strategy of the service nodes, solving the caching problem in the wireless network and thereby improving the capability of mobile prediction wireless edge caching.
The technical scheme of the invention is as follows:
a mobile prediction wireless edge caching method based on deep reinforcement learning comprises the following steps:
constructing a wireless intelligent cache network model, which comprises a service node model and a service node control model, wherein the service node model comprises a user set, a service node set, a user request content set, a cache content set and a source content library; the service node control model comprises a user historical track vector and a user classification group;
mobile prediction: constructing a long short-term memory network model, taking the user's historical track vector as input and outputting the user's predicted position in the next time slot; classifying according to the predicted position of each user in the user set in the next time slot to obtain the user classification groups;
establishing a replacement cache strategy: acquiring the predicted user set of each service node in the service node set in the next time slot according to the user classification groups, and acquiring replacement content from the source content library according to the historical request contents of the users in the predicted user set and the cache content of the current service node, so as to replace the cache content of the current service node;
and constructing a neural network combining Q learning and DQN reinforcement learning, taking a sample state from the state space formed by the predicted user set, the user request content set and the cache content set as input, taking an action from the action space formed by the replacement content as output, training the neural network to obtain a trained dynamic cache replacement model, and using the dynamic cache replacement model in the cache replacement strategy.
Furthermore, the wireless intelligent cache network model operates in a time discrete mode, and in each time slot, the user request content and the user historical track are updated.
Further, the user historical track vector is a position sequence representing the user's movement track over a period of time, and the historical track vector of each user is stored in the service node control model;
the user's historical track vector is input into the long short-term memory network model, weight matrices are introduced, and the predicted position of each user in the next time slot is output.
Further, in the process of training the neural network, a reward function is constructed based on the cache hit rate to train the neural network, and the method specifically comprises the following steps:
constructing a reward function which calculates an instant reward value through an input sample state and an output action and provides the instant reward value to a neural network;
constructing a cache hit rate calculation formula, wherein the cache hit rate refers to the probability that the request content of each user in a user set corresponding to a service node can be found in the cache content of the corresponding service node;
presetting a threshold value in (0,1); acquiring the state of the sample in the next time slot from the input sample state and the output action; calculating the cache hit rate of the sample in the next-time-slot state according to the cache hit rate calculation formula and comparing it with the threshold value; a positive instantaneous reward value is obtained when the cache hit rate of the sample in the next-time-slot state is greater than the threshold value.
Furthermore, an experience replay mechanism is arranged in the neural network, and the input sample state, the output action, the instant reward value and the state of the sample in the next time slot are combined and stored in an experience replay library to be used as a training sample of the neural network.
Further, the step of constructing the neural network combining Q learning and DQN reinforcement learning specifically includes:
defining an action value function for calculating the Q value from training samples in the experience replay library through Q learning;
the DQN reinforcement learning uses a neural network to estimate the Q value; for each training sample in the experience replay library, the Q value of the currently taken action is predicted from the sample's state and action, and the Q value of the action taken in the next state is predicted from the sample's state and action in the next time slot;
and constructing a loss function based on the difference between the Q value of the action taken in the next state and the Q value of the currently taken action, and iteratively updating the weight parameters of the neural network by gradient descent to make the neural network converge.
The invention has the following beneficial effects:
1. The mobile prediction wireless edge caching method based on deep reinforcement learning disclosed by the invention predicts the positions of mobile users with a long short-term memory network, which overcomes the influence of user mobility on the cache hit rate, and uses a neural network framework combining Q learning and DQN reinforcement learning to carry out the cache replacement strategy of the service nodes, which solves the caching problem in the wireless network and thereby improves the capability of mobile prediction wireless edge caching.
2. The method establishes a reward function based on the cache hit rate and gives a positive instantaneous reward value only when the cache hit rate after the cache content is replaced is greater than the threshold value, which improves the accuracy of the neural network's output.
3. The method obtains the predicted user set of each service node according to the predicted positions of the users, so that users can obtain cached resources at their service nodes as much as possible, reducing latency.
4. The method uses the neural network to approximate the Q values and, through iteration, generates the action with the maximum Q value in each state, thereby obtaining an optimal cache replacement strategy; the neural network continuously updates its parameters by gradient descent so that the loss function stabilizes at its minimum and the whole network converges.
Drawings
FIG. 1 is an overall flow chart of an embodiment of the present invention;
FIG. 2 is a diagram illustrating an exemplary wireless intelligent cache network model according to an embodiment of the present invention;
FIG. 3 is a flow chart of a caching policy in an embodiment of the invention;
FIG. 4 is a diagram illustrating an example of different movement modes according to an embodiment of the present invention;
FIG. 5 is a comparison diagram of the calculation results for the different movement modes after the scheme of the present embodiment is adopted.
Detailed Description
The invention is described in detail below with reference to the figures and the specific embodiments.
The first embodiment is as follows:
referring to fig. 1, a method for caching mobile prediction wireless edges based on deep reinforcement learning includes the following steps:
constructing a wireless intelligent cache network model, including a service node model and a service node control model, the service node model comprising a user set U = {U_1, U_2, ..., U_I}, a service node set B = {B_1, B_2, ..., B_J}, a user request content set D^(t) = {d_1^(t), d_2^(t), ..., d_I^(t)}, a cache content set C^(t) = {c_1^(t), c_2^(t), ..., c_J^(t)} and a source content library O = {O_1, O_2, ..., O_K}; here d_i^(t) denotes the request content of the i-th user in the t-th time slot and c_j^(t) denotes the cache content of the j-th service node in the t-th time slot;
the service node control model comprises the user historical track vectors and the user classification groups;
mobile prediction: constructing a long short-term memory network model, taking the user's historical track vector as input and outputting the user's predicted position in the next time slot; classifying according to the predicted position of each user in the user set in the next time slot to obtain the user classification groups;
establishing a replacement cache strategy: acquiring the predicted user set Û_j^(t+1) of each service node in the service node set in the next time slot according to the user classification groups, and, based on the predicted user set Û_j^(t+1), obtaining the users' historical request contents from the user request content set D^(t) and the cache content of the current service node from the cache content set C^(t); when a user's historical request content is not present in the cache content of the current service node, replacement content is acquired from the source content library O = {O_1, O_2, ..., O_K} to replace the cache content of the current service node, i.e. part of c_j^(t) is replaced with new content provided by O = {O_1, O_2, ..., O_K};
optimizing the model: constructing a deep neural network combining Q learning and DQN reinforcement learning, with the state space defined as s^(t) = (Û_j^(t+1), D^(t), C^(t)); a sample state from the state space is taken as input, the action space is defined as a^(t) = {x_1, x_2, ..., x_K}, and an action from the action space (the replacement content) is taken as output; the neural network is trained to obtain a trained dynamic cache replacement model, and the dynamic cache replacement model is used in the replacement cache strategy.
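To make the state and action definitions above concrete, the following is a minimal sketch, not taken from the patent, of how a per-node state vector could be assembled from the predicted user set Û_j^(t+1), the request contents D^(t) and the cache contents C^(t); the library size K, the cache layout and the vector encoding are illustrative assumptions.

import numpy as np

K = 50  # assumed size of the source content library O = {O_1, ..., O_K}

def encode_state(predicted_users, requests, cache):
    # predicted_users: ids of users predicted to be in this cell in the next slot
    # requests: dict mapping user id -> requested content index in 0..K-1
    # cache: list of content indices currently cached at this service node
    demand = np.zeros(K)                 # how many predicted users want each content
    for u in predicted_users:
        if u in requests:
            demand[requests[u]] += 1.0
    cached = np.zeros(K)                 # one-hot mask of the node's current cache
    cached[cache] = 1.0
    return np.concatenate([demand, cached])   # state s^(t), length 2K

# example: one node, three predicted users, two of them asking for content 4
s = encode_state([1, 2, 3], {1: 4, 2: 4, 3: 9}, cache=[0, 4, 7, 11, 20])
print(s.shape)   # (100,)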
The implementation utilizes the long-term and short-term memory network to predict the position of the mobile user, can overcome the influence of the mobility of the user on the cache hit rate, and simultaneously utilizes the neural network framework combined with Q learning and reinforcement learning to carry out the cache replacement strategy of the service node, thereby solving the cache problem in the wireless network and further improving the capability of mobile prediction of wireless edge cache.
Example two:
further, the wireless intelligent cache network model runs in a time discrete mode, T ═ {1,2, …, T }. In each time slot, the location information and the request of the user are updated, i.e.
Figure BDA0002876016790000071
And
Figure BDA0002876016790000072
is updated if the requested content is cached in the collection
Figure BDA0002876016790000073
It will transmit directly to the user; otherwise, the content request and delivery need to be sent from a remote server over the backhaul. Predictive user set sent from serving node controller for updating cache content
Figure BDA0002876016790000074
Will be provided with
Figure BDA0002876016790000075
And
Figure BDA0002876016790000076
used as input to a neural network to determine the buffer content of the next time slot, i.e.
Figure BDA0002876016790000077
Some of the content in the store will be replaced by new content provided by the remote server.
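The per-slot behaviour just described (serve from the local cache when possible, otherwise fetch over the backhaul) can be sketched as follows; this is an illustrative assumption of one possible implementation, not text from the patent, and backhaul_fetch stands in for whatever mechanism retrieves content from the remote server.

def serve_slot(requests, cache, backhaul_fetch):
    # requests: dict user id -> requested content index for users at this node
    # cache: set of content indices currently cached at the node
    # backhaul_fetch: callable that retrieves one content index from the remote server
    hits, misses = 0, 0
    for user, content in requests.items():
        if content in cache:
            hits += 1                  # transmitted directly to the user
        else:
            backhaul_fetch(content)    # request and delivery over the backhaul
            misses += 1
    return hits, misses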
Further, the user historical track vector is a position sequence representing the user's movement track over a period of time, and the historical track vector of each user is stored in the service node control model; it is defined as the position sequence L_i^(t) = {l_i^(t-β+1), ..., l_i^(t-1), l_i^(t)}, containing a total of β historical access records.
The position sequence is used as the input of the long short-term memory network to obtain the user's predicted position in the next time slot, i.e.:
l_i^(t+1) = LSTM(L_i^(t); W),
where W denotes the weight matrices of the network.
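As an illustration of this prediction step, the sketch below uses PyTorch (the patent does not name a framework, so this choice and all dimensions are assumptions): the sequence of β historical positions is fed to an LSTM and a linear head outputs the predicted position for the next time slot.

import torch
import torch.nn as nn

class MobilityLSTM(nn.Module):
    def __init__(self, pos_dim=2, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=pos_dim, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, pos_dim)       # predicted position l_i^(t+1)

    def forward(self, track):                        # track: (batch, beta, pos_dim)
        out, _ = self.lstm(track)
        return self.head(out[:, -1, :])              # last hidden state -> next position

model = MobilityLSTM()
track = torch.randn(8, 10, 2)                        # 8 users, beta = 10 past positions
next_pos = model(track)                              # (8, 2) predicted positions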
Further, in the process of training the neural network, a reward function is constructed based on the cache hit rate to train the neural network, and the method specifically comprises the following steps:
constructing a reward function: at each time slot, after the service node receives the users' request contents and the predicted user set from the service node controller model, it combines them with the cache of the current service node to generate the state s^(t), which is taken as the input of the neural network; the neural network selects an action a^(t) from the action space according to the state s^(t) as output, and when the action is executed an instantaneous reward value r^(t) is obtained according to the reward function r(s^(t), a^(t)).
Constructing a cache hit rate calculation formula:
h_j^(t) = (1 / |U_j^(t)|) · Σ_{i ∈ U_j^(t)} 1(d_i^(t) ∈ c_j^(t)),
where 1(d_i^(t) ∈ c_j^(t)) is an indicator function that takes the value 1 when the content d_i^(t) requested by the user can be found in the cache c_j^(t) of the current service node, and 0 otherwise. The cache hit rate of the j-th service node is obtained by evaluating the indicator function once for every user currently located at the service node and normalizing the sum, giving the percentage hit rate.
A threshold value ζ ∈ (0,1) is preset; if the cache hit rate is greater than the threshold, i.e. h_j^(t) > ζ, a positive reward r^(t) is obtained. Experiments show that ζ = 0.6 yields a better cache hit rate than other values, and the objective of the system is to maximize the cache hit rate of each service node.
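A minimal sketch of this hit-rate based reward follows; the +1/-1 reward magnitudes are illustrative assumptions (the patent only requires the reward to be positive when the hit rate exceeds ζ).

def cache_hit_rate(requests, cache):
    # requests: dict user id -> requested content index for users at this node
    if not requests:
        return 0.0
    hits = sum(1 for content in requests.values() if content in cache)
    return hits / len(requests)          # fraction of users served from the cache

def reward(requests, cache, zeta=0.6):
    # positive instantaneous reward only when the hit rate exceeds the threshold
    return 1.0 if cache_hit_rate(requests, cache) > zeta else -1.0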
Furthermore, an experience replay mechanism is provided in the neural network. At each time slot, after receiving the users' request contents and the predicted user set, the service node combines them with the cache content of the current service node to generate the state s^(t), which is taken as the input of the neural network; the neural network selects an action a^(t) from the action space according to the state s^(t) as output; when the action is executed, the system obtains an instantaneous reward value r^(t) according to the reward function and enters the next state s^(t+1). These four elements are then combined into a tuple (s^(t), a^(t), r^(t), s^(t+1)) and stored in the experience replay library to be used as training samples of the neural network.
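The experience replay library can be sketched as a bounded buffer of (s^(t), a^(t), r^(t), s^(t+1)) tuples from which random mini-batches are drawn; the capacity and batch size below are illustrative assumptions.

import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)     # oldest transitions are discarded

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size=32):
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)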
Further, the step of constructing the neural network combining Q learning and DQN reinforcement learning specifically includes:
defining an action value function, used for calculating the Q value from the training samples in the experience replay library through Q learning (Q-learning):
q_π(s^(t), a^(t)) = E[ Σ_{k≥0} γ^k · r^(t+k) | s^(t), a^(t) ],
where γ ∈ (0,1) represents the discount factor.
Because a large action space would consume a large amount of memory, DQN reinforcement learning uses a neural network to estimate the q value, q_π(s^(t), a^(t); ω) ≈ q_π(s^(t), a^(t)). When samples are drawn from the experience replay library for training, for each sample the estimation network computes the q value of the action taken in the current sample state, q_π(s^(t), a^(t); ω), while the next state s^(t+1) in the sample is input into the target network (whose structure is consistent with the estimation network but whose parameters are updated with a delay) to compute the value of the action taken in the next state,
y^(t) = r^(t) + γ · max_a q_π(s^(t+1), a; ω⁻).
The loss function is defined as the square of the difference between the two,
L(ω) = (y^(t) − q_π(s^(t), a^(t); ω))²,
and the weight parameter ω of the neural network is updated by gradient descent,
ω ← ω − α · ∇_ω L(ω),
where α is the learning rate. When the trained weight parameter ω is stable, the whole neural network is in a convergence state.
In order to make those skilled in the art further understand the solution proposed in the present embodiment, the following detailed description is made with reference to specific embodiments. The embodiment is implemented on the premise of the technical scheme of the invention, and a detailed implementation mode and a specific operation process are given.
Fig. 2 shows a wireless intelligent cache network model.
The model mainly comprises service nodes, a controller, a source server, caches and the like. A user cache model is introduced under each service node: each service node can download the content requested by users from the source server through a backhaul link, cache the requested content locally, and directly serve the users in its cell.
Fig. 3 shows a flow chart of the cache policy.
In time slot t, the user issues a request and reports its location. On the request side: if the requested content is cached at the service node, it is sent directly to the user; if not, it is downloaded from the remote server (source content library). On the location side: the user's historical track vector is updated, the user's mobility is predicted by the long short-term memory network, the predicted user set corresponding to each service node is then obtained through a classification function, and finally the cache content is updated by the neural network.
FIG. 4 shows exemplary diagrams of the different movement scenarios.
To study the proposed DRL-based caching scheme under various mobility scenarios, three different mobility patterns were tested and compared. FIG. 4(a) shows linear movement, used to simulate a user's straight-line movement along a street or road. FIG. 4(b) shows circular movement, a typical deterministic motion pattern used to simulate a fixed-path trajectory. FIG. 4(c) shows random movement, used to simulate a user's irregular movement in an open area.
The calculation results are shown in FIG. 5. The results show that the algorithm with mobility prediction outperforms the algorithm without mobility prediction, with cache hit rate performance gains of 14.5%, 19.3% and 10.0% under linear, circular and random movement respectively, indicating that accurately predicting users plays a key role in adapting content replacement to users' data requests.
The above analysis shows that the scheme provided by the invention achieves better caching performance than existing methods and can effectively alleviate users' caching problems.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes performed by the present specification and drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (6)

1. A mobile prediction wireless edge caching method based on deep reinforcement learning is characterized by comprising the following steps:
constructing a wireless intelligent cache network model, which comprises a service node model and a service node control model, wherein the service node model comprises a user set, a service node set, a user request content set, a cache content set and a source content library; the service node control model comprises a user historical track vector and a user classification group;
mobile prediction, namely constructing a long short-term memory network model, taking the historical track vector of the user as input, and outputting the predicted position of the user in the next time slot; classifying according to the predicted position of each user in the user set in the next time slot to obtain the user classification group;
establishing a replacement cache strategy, acquiring a predicted user set of each service node in a service node set in the next time slot according to a user classification group, and acquiring replacement contents from a source content library according to history request contents of users in the predicted user set and cache contents of a current service node to replace the cache contents of the current service node;
and constructing a neural network combining Q learning and DQN reinforcement learning, taking a sample state in a state space formed by a prediction user set, a user request content set and a cache content set as an input, taking a certain action in an action space formed by replacement content as an output, training the neural network to obtain a trained dynamic cache replacement model, and utilizing the dynamic cache replacement model in a cache replacement strategy.
2. The method for caching the mobile prediction wireless edge based on the deep reinforcement learning of claim 1, wherein: the wireless intelligent cache network model operates in a time discrete mode, and in each time slot, user request content and user historical tracks are updated.
3. The method of claim 2, wherein the method comprises: the user historical track vector is a position sequence and represents the moving track of the user within a period of time, and the historical track vector of each user is stored in the service node control model;
and inputting the historical track vector of the user into the long short-term memory network model, introducing a weight matrix, and outputting the predicted position of each user in the next time slot.
4. The method for mobile prediction wireless edge caching based on deep reinforcement learning according to claim 1, wherein in the process of training the neural network, a reward function is constructed based on a cache hit rate to train the neural network, and the method comprises the following specific steps:
constructing a reward function which calculates an instant reward value through an input sample state and an output action and provides the instant reward value to a neural network;
constructing a cache hit rate calculation formula, wherein the cache hit rate refers to the probability that the request content of each user in a user set corresponding to a service node can be found in the cache content of the corresponding service node;
presetting a threshold value, wherein the threshold value belongs to (0,1), acquiring the state of the sample in the next time slot according to the input sample state and the output action, calculating the cache hit rate of the sample in the state of the next time slot according to the cache hit rate calculation formula, comparing with the threshold value, and obtaining a positive instantaneous reward value when the cache hit rate of the sample in the state of the next time slot is greater than the threshold value.
5. The method of claim 4, wherein the method comprises: the neural network is provided with an experience replay mechanism, and the input sample state, the output action, the instant reward value and the state of the sample in the next time slot are combined and stored in an experience replay library to be used as a training sample of the neural network.
6. The method for mobile prediction wireless edge caching based on deep reinforcement learning of claim 5, wherein the step of constructing the neural network combining Q learning and DQN reinforcement learning specifically comprises:
defining an action value function for calculating a Q value through training samples in an experience replay library through Q learning;
the DQN reinforcement learning adopts a neural network to predict a q value, and for each training sample in an experience playback library, the q value of a currently taken action is predicted through the state and the action of the sample, and then the q value of the next state taken action is predicted through the state and the action of the sample in the next time slot;
and constructing a loss function taking the difference value between the q value of the action taken in the next state and the q value of the current action taken as a reference, and iteratively updating the weight parameters of the neural network by using a gradient descent method to make the neural network converge.
CN202011620501.1A 2020-12-31 2020-12-31 Mobile prediction wireless edge caching method based on deep reinforcement learning Active CN112752308B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011620501.1A CN112752308B (en) 2020-12-31 2020-12-31 Mobile prediction wireless edge caching method based on deep reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011620501.1A CN112752308B (en) 2020-12-31 2020-12-31 Mobile prediction wireless edge caching method based on deep reinforcement learning

Publications (2)

Publication Number Publication Date
CN112752308A (en) 2021-05-04
CN112752308B CN112752308B (en) 2022-08-05

Family

ID=75650307

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011620501.1A Active CN112752308B (en) 2020-12-31 2020-12-31 Mobile prediction wireless edge caching method based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN112752308B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103282891A (en) * 2010-08-16 2013-09-04 甲骨文国际公司 System and method for effective caching using neural networks
US9047225B1 (en) * 2012-09-27 2015-06-02 Emc Corporation Dynamic selection of data replacement protocol for cache
US20180124423A1 (en) * 2016-10-28 2018-05-03 Nec Laboratories America, Inc. Dynamic scene prediction with multiple interacting agents
CN110651279A (en) * 2017-06-28 2020-01-03 渊慧科技有限公司 Training motion selection neural networks with apprentices
US20200082248A1 (en) * 2018-09-11 2020-03-12 Nvidia Corporation Future object trajectory predictions for autonomous machine applications
US20200134445A1 (en) * 2018-10-31 2020-04-30 Advanced Micro Devices, Inc. Architecture for deep q learning
CN109976909A (en) * 2019-03-18 2019-07-05 中南大学 Low delay method for scheduling task in edge calculations network based on study
US20200356878A1 (en) * 2019-05-07 2020-11-12 Cerebri AI Inc. Predictive, machine-learning, time-series computer models suitable for sparse training sets
US20200387163A1 (en) * 2019-06-07 2020-12-10 Tata Consultancy Services Limited Method and a system for hierarchical network based diverse trajectory proposal
CN111901392A (en) * 2020-07-06 2020-11-06 北京邮电大学 Mobile edge computing-oriented content deployment and distribution method and system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Song Xuming et al., "Intelligent mobile edge network caching based on deep learning", Journal of University of Chinese Academy of Sciences *
Yang Rennong et al., "UAV trajectory prediction model and simulation based on Bi-LSTM", Advances in Aeronautical Science and Engineering *
Xiao Yanhui et al., "Crime location prediction method based on long short-term memory convolutional neural networks", Data Analysis and Knowledge Discovery *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113315978A (en) * 2021-05-13 2021-08-27 江南大学 Collaborative online video edge caching method based on federal learning
CN113422801A (en) * 2021-05-13 2021-09-21 河南师范大学 Edge network node content distribution method, system, device and computer equipment
CN113315978B (en) * 2021-05-13 2022-03-15 江南大学 Collaborative online video edge caching method based on federal learning
CN114025017A (en) * 2021-11-01 2022-02-08 杭州电子科技大学 Network edge caching method, device and equipment based on deep cycle reinforcement learning
CN114025017B (en) * 2021-11-01 2024-04-16 杭州电子科技大学 Network edge caching method, device and equipment based on deep circulation reinforcement learning

Also Published As

Publication number Publication date
CN112752308B (en) 2022-08-05

Similar Documents

Publication Publication Date Title
Zhu et al. Caching transient data for Internet of Things: A deep reinforcement learning approach
CN109660598B (en) Cache replacement method and system for transient data of Internet of things
WO2020253664A1 (en) Video transmission method and system, and storage medium
Li et al. Energy-latency tradeoffs for edge caching and dynamic service migration based on DQN in mobile edge computing
Zhang et al. Using grouped linear prediction and accelerated reinforcement learning for online content caching
Zhang et al. Toward edge-assisted video content intelligent caching with long short-term memory learning
CN112752308B (en) Mobile prediction wireless edge caching method based on deep reinforcement learning
CN110213627A (en) Flow medium buffer distributor and its working method based on multiple cell user mobility
CN109639760A (en) It is a kind of based on deeply study D2D network in cache policy method
CN109982104B (en) Motion-aware video prefetching and cache replacement decision method in motion edge calculation
Li et al. Low-latency edge cooperation caching based on base station cooperation in SDN based MEC
CN114553963B (en) Multi-edge node collaborative caching method based on deep neural network in mobile edge calculation
CN111491331B (en) Network perception self-adaptive caching method based on transfer learning in fog computing network
Feng et al. Content popularity prediction via deep learning in cache-enabled fog radio access networks
Yan et al. Distributed edge caching with content recommendation in fog-rans via deep reinforcement learning
CN115809147B (en) Multi-edge collaborative cache scheduling optimization method, system and model training method
CN113687960A (en) Edge calculation intelligent caching method based on deep reinforcement learning
CN116321307A (en) Bidirectional cache placement method based on deep reinforcement learning in non-cellular network
Somesula et al. Cooperative cache update using multi-agent recurrent deep reinforcement learning for mobile edge networks
Li et al. DQN-enabled content caching and quantum ant colony-based computation offloading in MEC
Liu et al. Mobility-aware video prefetch caching and replacement strategies in mobile-edge computing networks
Yu et al. Mobility-aware proactive edge caching for large files in the internet of vehicles
CN113114762B (en) Data caching method and system
CN117221403A (en) Content caching method based on user movement and federal caching decision
Jiang et al. Asynchronous federated and reinforcement learning for mobility-aware edge caching in IoVs

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant