CN109660598A - Cache replacement method and system for transient data of the Internet of Things - Google Patents

Cache replacement method and system for transient data of the Internet of Things

Info

Publication number
CN109660598A
CN109660598A (application number CN201811370683.4A)
Authority
CN
China
Prior art keywords
data
caching
request
edge
data item
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811370683.4A
Other languages
Chinese (zh)
Other versions
CN109660598B (en)
Inventor
曹洋
褚磊
竺浩
江涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN201811370683.4A priority Critical patent/CN109660598B/en
Publication of CN109660598A publication Critical patent/CN109660598A/en
Application granted granted Critical
Publication of CN109660598B publication Critical patent/CN109660598B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00: Network arrangements or protocols for supporting network services or applications
    • H04L 67/01: Protocols
    • H04L 67/12: Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00: Network arrangements or protocols for supporting network services or applications
    • H04L 67/50: Network services
    • H04L 67/56: Provisioning of proxy services
    • H04L 67/568: Storing data temporarily at an intermediate stage, e.g. caching
    • H04L 67/5682: Policies or rules for updating, deleting or replacing the stored data

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Information Transfer Between Computers (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a cache replacement method and system for transient data of the Internet of Things (IoT). A deep reinforcement learning method is used to learn a caching policy for transient data: the popularity trend of data is mined from the data request history of an edge caching node and, combined with the transient data information, is used as the input of the deep reinforcement learning agent; the immediate reward is set to the negative of the integrated cost of obtaining the data; a critic network learns the value function and an actor network learns the policy function; and by continually performing cache replacement operations the caching policy is learned adaptively, which solves the problem of low caching efficiency for transient data at edge caching nodes with limited storage resources. The integrated cost accounts for both the freshness of IoT data and the consumption of communication resources, and the long-term overall cost of obtaining IoT data is minimized, so that network traffic can be offloaded to the network edge and latency reduced, alleviating to a certain extent the large latency and heavy communication resource consumption in the transmission of massive IoT transient data.

Description

Cache replacement method and system for transient data of the Internet of Things
Technical field
The invention belongs to the field of wireless communications, and more particularly relates to a cache replacement method and system for transient data of the Internet of Things.
Background technique
With the rapid development and wide application of the Internet of Things (IoT) in fields such as intelligent transportation, smart grids, smart homes and industrial automation, the massive IoT data traffic places enormous pressure on today's communication networks. A common approach to this problem is to add an edge caching mechanism to the IoT: the idle storage resources of edge caching nodes at the network edge are used to cache hot data, so that requesters can obtain data directly from the corresponding edge caching node instead of from the data source, avoiding a large number of unnecessary end-to-end transmissions. Edge caching in an IoT system can offload network traffic, reduce network latency, and provide better quality of service and user experience. Because the storage capacity of an edge caching node is usually rather limited, an efficient cache replacement policy can raise the cache hit rate, making more effective use of the cache space and offloading more network traffic. Many IoT applications also impose timeliness requirements on data: only data within its validity period is usable, so the freshness of cached data is an important criterion when replacing cache entries. An edge cache replacement policy for IoT systems therefore needs to consider both the popularity and the freshness of cached data in order to better meet the caching requirements of IoT scenarios.
Traditional cache replacement methods, such as first-in-first-out (FIFO), least recently used (LRU) and least frequently used (LFU), do not consider the popularity trend of content or the distribution of user requests, and therefore achieve low caching efficiency. Existing edge cache replacement policies include the following. E. Bastug et al. proposed a caching strategy that predicts data popularity by collaborative filtering using user-data correlations; it greedily caches hot data at start-up until the cache of the edge caching node is exhausted, and then performs cache replacement according to the predicted popularity information. P. Blasco et al. solved the data placement problem in small base stations with a knapsack formulation, where data popularity is estimated from the observed request rates. J. Song et al. introduced a multi-armed bandit (MAB) method that considers data popularity jointly with the data caching process. S. Tanzil et al. estimated data popularity by constructing a neural network and computed the placement and size of caches with mixed integer linear programming; however, the popularity prediction stage of this method requires keywords and category information of video content, which is not applicable to general IoT data.
The above methods focus mainly on edge caching of non-transient data: the edge caching node decides whether and how to replace cached items by estimating data popularity. On the one hand, because they assume that data popularity and user requests obey specific distributions (such as a Poisson distribution), they cannot adapt to scenarios in which data popularity and the request distribution change rapidly; on the other hand, by concentrating only on caching non-transient data, they do not account for the timeliness of transient data.
Summary of the invention
In view of the drawbacks of the prior art, the object of the present invention is to solve the technical problems of large latency and heavy communication resource consumption in the transmission of massive IoT transient data, and of low caching efficiency for transient data at edge caching nodes with limited storage resources.
To achieve the above object, in a first aspect, an embodiment of the invention provides a cache replacement method for transient data of the Internet of Things, applied when the cache space of the current edge caching node is full. The method includes the following steps:
S1. The edge caching node receives a new transient data item request issued by a user.
S2. Judge whether the content of the requested transient data item is in the cache of the edge caching node; if so, go to step S3, otherwise go to step S6.
S3. Judge whether the requested transient data item is fresh or expired; if it is fresh, go to step S4; if it is expired, go to step S5.
S4. Read the data directly from the cache of the edge caching node and forward it to the user.
S5. The edge caching node forwards the user request to the data source, reads the new data from the data source, replaces the expired data in its cache with the new data, and forwards the new data to the user.
S6. The edge caching node forwards the user request to the data source, reads the new data from the data source, selects the data to be replaced in its cache using deep reinforcement learning, replaces the selected data with the new data, and forwards the new data to the user.
Specifically, in step S2, if f_k ∈ F_k the requested data item content is in the cache of the edge caching node, and if f_k ∉ F_k the requested data item content is not in the cache of the edge caching node, where f_k is the unique content identifier (CID) of the data content corresponding to request k and F_k is the set of CIDs of the data items cached in the edge caching node when request k arrives.
Specifically, in step S3, if t_age(p(f_k)) ≤ T_life(p(f_k)) the requested data item is fresh, and if t_age(p(f_k)) > T_life(p(f_k)) the requested data item is expired, where f_k is the CID corresponding to request k, p(·) is the mapping function from the requested CID to the cached data item, T_life(·) is the effective life cycle of a data item, and t_age(·) is the age of a data item.
Specifically, selecting the data to be replaced in the cache of the edge caching node using deep reinforcement learning in step S6 includes:
1) at time n, observing the status information of the edge caching node to obtain the state s_n;
2) selecting a caching action a_n according to the caching policy π(a_n|s_n) and executing the caching action;
3) after executing the caching action a_n, computing the immediate reward r_n, the status information of the edge caching node changing from s_n to s_{n+1};
4) feeding the immediate reward r_n back to the edge caching node, using the state transition <s_n, a_n, r_n, s_{n+1}> as a training sample for training the actor-critic networks of the deep reinforcement learning agent, and repeating the above process.
Specifically, the immediate reward r_n is calculated as
r_n = −Σ_{k∈Req_n} C(d_k)
where Req_n denotes the set of all data requests received by the edge caching node between the execution of caching action a_n and the execution of the next caching action a_{n+1}, and C(d_k) is the overall cost of obtaining data d_k.
Specifically, the overall cost C(d_k) is calculated as
C(d_k) = α·c(d_k) + (1−α)·l(d_k)
where α ∈ [0,1] is a trade-off coefficient, c(d_k) is the communication cost and l(d_k) is the data age cost; the communication cost equals c_1 when the requested data item is returned directly from the cache of the edge caching node (f_k ∈ F_k and t_age(p(f_k)) ≤ T_life(p(f_k))) and c_2 when it is obtained from the data source, with c_1 < c_2 and c_1, c_2 positive constants; f_k is the CID corresponding to request k, F_k is the set of CIDs of the data items cached in the edge caching node when request k arrives, p(·) is the mapping function from the requested CID to the cached data item, T_life(·) is the effective life cycle of a data item, and t_age(·) is the age of a data item.
Specifically, the actor network parameter θ of the deep reinforcement learning agent is updated by gradient ascent as
θ ← θ + λ·∇_θ log π(a_n|s_n; θ)·A(s_n, a_n)
where λ is the learning rate of the actor network, ∇ denotes the gradient operator, the policy π(a_n|s_n; θ) denotes the probability of selecting cache replacement action a_n in state s_n, A(s_n, a_n) = r_n + γ·V(s_{n+1}; θ_v) − V(s_n; θ_v) is the advantage function, γ ∈ [0,1] is the discount factor, and V(s; θ_v) is the state-value function;
the parameter θ_v of the critic network of the deep reinforcement learning agent is updated by gradient descent as
θ_v ← θ_v − λ′·∇_{θ_v}(r_n + γ·V(s_{n+1}; θ_v) − V(s_n; θ_v))²
where λ′ is the learning rate of the critic network.
Specifically, the actor network parameter θ of the deep reinforcement learning agent may alternatively be updated by gradient ascent with an entropy regularization term as
θ ← θ + λ·[∇_θ log π(a_n|s_n; θ)·A(s_n, a_n) + β·∇_θ H(π(·|s_n; θ))]
where λ is the learning rate of the actor network, ∇ denotes the gradient operator, the policy π(a_n|s_n; θ) denotes the probability of selecting cache replacement action a_n in state s_n, A(s_n, a_n) = r_n + γ·V(s_{n+1}; θ_v) − V(s_n; θ_v) is the advantage function, γ ∈ [0,1] is the discount factor, V(s; θ_v) is the state-value function, H(π(·|s_n; θ)) is the entropy of the action distribution output by policy π_θ in state s_n, and β is the exploration coefficient;
the parameter θ_v of the critic network of the deep reinforcement learning agent is updated by gradient descent as
θ_v ← θ_v − λ′·∇_{θ_v}(r_n + γ·V(s_{n+1}; θ_v) − V(s_n; θ_v))²
where λ′ is the learning rate of the critic network.
In a second aspect, an embodiment of the invention provides a cache replacement system for transient data of the Internet of Things, applied when the cache space of the current edge caching node is full. The system includes a state judgment module, a read module, a request forwarding module and a cache replacement module.
The state judgment module is configured to judge the state of the transient data requested by the user, the state being one of: state one, the content of the requested transient data item is in the cache of the edge caching node and the requested transient data item is fresh; state two, the content of the requested transient data item is in the cache of the edge caching node and the requested transient data item is expired; state three, the content of the requested transient data item is not in the cache of the edge caching node.
The read module is configured to, when the judgment result of the state judgment module is state one, read the data directly from the cache of the edge caching node and forward it to the user.
The request forwarding module is configured to, when the judgment result of the state judgment module is state two or state three, forward the user request to the data source and read the new data from the data source.
The cache replacement module is configured to, when the judgment result of the state judgment module is state two, replace the expired data in the cache of the edge caching node with the new data read by the request forwarding module and forward the new data to the user; and, when the judgment result of the state judgment module is state three, select the data to be replaced in the cache of the edge caching node using deep reinforcement learning, replace the selected data with the new data read by the request forwarding module, and forward the new data to the user.
In a third aspect, an embodiment of the invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the cache replacement method described in the first aspect.
In general, compared with the prior art, the above technical solutions contemplated by the present invention have the following beneficial effects:
1. The present invention takes the effective life cycle of transient data into account and incorporates both the freshness of IoT data and the consumption of communication resources into an integrated cost, which yields the objective of the IoT transient data caching policy: minimizing the long-term overall cost of obtaining IoT data. By caching transient data at the edge caching nodes of the network, network traffic can be offloaded to the network edge and latency reduced, alleviating to a certain extent the large latency and heavy communication resource consumption in the transmission of massive IoT transient data.
2. The present invention learns the caching policy for transient data with a deep reinforcement learning method. The cache replacement problem is modeled as a Markov process; the popularity trend of data is mined from the data request history of the edge caching node and, combined with the life cycle and freshness information of the transient data, used as the environment state input of the deep reinforcement learning agent. The immediate reward is set to the negative of the integrated cost. A critic network learns the value function and an actor network learns the policy function; by continually performing cache replacement operations the caching policy is learned adaptively and the long-term reward is maximized, thereby minimizing the long-term overall cost of obtaining transient data in the IoT and solving the problem of low caching efficiency for transient data at edge caching nodes with limited storage resources.
Detailed description of the invention
Fig. 1 is a flowchart of a cache replacement method for transient data of the Internet of Things according to an embodiment of the present invention.
Specific embodiment
In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further described below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are merely illustrative of the present invention and are not intended to limit it.
First, some terms used in the present invention are explained.
An edge caching node is a network node with caching capability located close to the user side of the IoT.
Data freshness equals the ratio of a data item's remaining effective time to its effective life cycle; the larger the ratio, the more timely the data and the higher its freshness.
Data age is the time interval between the generation of a data item at the data source and its acquisition by the user terminal; the shorter the interval, the stronger the timeliness.
Data popularity characterizes how popular a data item is, that is, the number of times it is requested within a certain period; the higher the count, the higher the popularity.
Transient data is data that carries a timeliness requirement.
The idea of the invention is as follows. First, to study the cost of caching transient data, the overall cost of obtaining IoT data is divided into two parts: the communication cost (including bandwidth consumption, latency, and so on) and the data timeliness cost. The objective of caching and cache replacement at IoT edge caching nodes is to minimize the long-term overall cost of obtaining data, that is, to consider the communication cost and the data timeliness cost jointly. Then, the cache replacement problem is modeled as a Markov process, and a caching policy based on deep reinforcement learning (DRL) is constructed: the caching policy is learned automatically from the recent data request history of the IoT and the current cache state, minimizing the long-term overall cost of obtaining IoT data.
As shown in Fig. 1, a cache replacement method for transient data of the Internet of Things, applied when the cache space of the current edge caching node is full, includes the following steps:
S1. The edge caching node receives a new transient data item request issued by a user.
S2. Judge whether the content of the requested transient data item is in the cache of the edge caching node; if so, go to step S3, otherwise go to step S6.
S3. Judge whether the requested transient data item is fresh or expired; if it is fresh, go to step S4; if it is expired, go to step S5.
S4. Read the data directly from the cache of the edge caching node and forward it to the user.
S5. The edge caching node forwards the user request to the data source, reads the new data from the data source, replaces the expired data in its cache with the new data, and forwards the new data to the user.
S6. The edge caching node forwards the user request to the data source, reads the new data from the data source, selects the data to be replaced in its cache using deep reinforcement learning, replaces the selected data with the new data, and forwards the new data to the user.
Step S1: the edge caching node receives a new transient data item request issued by a user.
A data item d in the IoT is uniquely identified by its content ID (CID) and contains two fields: the generation time t_gen(d) and the effective life cycle T_life(d). The age of data item d at time t is t_age(d) = t − t_gen(d). If the age of data item d is less than its effective life cycle, i.e. t_age(d) < T_life(d), data item d is said to be fresh and within its validity period; otherwise, data item d is said to be expired.
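As an illustration of the data item model just described, the following Python sketch (field and method names are assumptions, not taken from the patent text) represents a transient data item with its generation time, effective life cycle, age and freshness checks:

```python
import time
from dataclasses import dataclass
from typing import Optional

@dataclass
class DataItem:
    """A transient IoT data item identified by its content ID (CID)."""
    cid: str          # unique content identifier
    payload: bytes    # the sensed or collected value
    t_gen: float      # generation time t_gen(d)
    t_life: float     # effective life cycle T_life(d)

    def age(self, now: Optional[float] = None) -> float:
        """t_age(d) = t - t_gen(d)."""
        now = time.time() if now is None else now
        return now - self.t_gen

    def is_fresh(self, now: Optional[float] = None) -> bool:
        """Fresh iff t_age(d) <= T_life(d)."""
        return self.age(now) <= self.t_life

    def freshness(self, now: Optional[float] = None) -> float:
        """Remaining effective time over effective life cycle, clipped to [0, 1]."""
        remaining = self.t_life - self.age(now)
        return max(0.0, min(1.0, remaining / self.t_life))
```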
A data item request from an IoT user terminal is denoted k, the CID of the requested data item content is f_k, and the arrival time of the request is t_k. At time t_k, the set of data items cached in the IoT edge caching node is denoted D_k, and the set of CIDs corresponding to the cached data items is denoted F_k, where I denotes the maximum number of data items the edge caching node can cache. The mapping function p(·) links the CID of the requested content to the corresponding cached data item.
After data item request k arrives, the IoT edge caching node first checks whether there is a fresh cached item whose CID is f_k. Three cases are considered.
Case 1: f_k ∈ F_k and t_age(p(f_k)) ≤ T_life(p(f_k)), i.e. the requested data item content is in the cache and the cached item is fresh, meeting the timeliness requirement. The edge caching node therefore directly returns the cached data item p(f_k) to the requester.
Case 2: f_k ∈ F_k and t_age(p(f_k)) > T_life(p(f_k)), i.e. the requested data item content is in the cache but the cached item has expired. The edge caching node therefore obtains the new data from the data source, returns it to the requester, and replaces the expired data in the cache with the newly obtained data.
Case 3: f_k ∉ F_k, i.e. the requested data item content is not in the cache. The edge caching node therefore obtains the new data from the data source and returns it to the requester, while using deep reinforcement learning to select the data to be replaced in its cache and replacing the selected data with the new data.
From the analysis of the above three cases, after the user issues request k, the returned data item d_k can be expressed as
d_k = p(f_k) if f_k ∈ F_k and t_age(p(f_k)) ≤ T_life(p(f_k)), and otherwise d_k is the new data item obtained from the data source,
where f_k is the CID corresponding to request k, p(·) is the mapping function from the requested CID to the cached data item, F_k is the set of CIDs of the data items cached in the edge caching node when request k arrives, t_age(d) is the age of data item d, and T_life(d) is the effective life cycle of data item d.
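A minimal Python sketch of the three request-handling cases described above; `data_source.fetch` and `select_victim` are hypothetical helpers, the latter standing in for the DRL-based replacement decision of Case 3:

```python
def handle_request(cache: dict, f_k: str, data_source, select_victim, now: float):
    """cache maps CID -> DataItem; returns the data item sent back to the requester."""
    if f_k in cache and cache[f_k].is_fresh(now):
        # Case 1: requested content is cached and fresh -> serve directly from the edge node.
        return cache[f_k]
    new_item = data_source.fetch(f_k)            # forward the request to the data source
    if f_k in cache:
        # Case 2: the cached copy has expired -> overwrite it with the fresh copy.
        cache[f_k] = new_item
    else:
        # Case 3: content not cached -> the DRL policy picks a cached item to evict
        # (or decides not to cache the new item at all).
        victim_cid = select_victim(cache, new_item)
        if victim_cid is not None:
            del cache[victim_cid]
            cache[f_k] = new_item
    return new_item
```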
Step S2: judge whether the content of the requested transient data item is in the cache of the edge caching node; if so, go to step S3, otherwise go to step S6.
f_k ∈ F_k indicates that the requested data item content is in the cache of the edge caching node; f_k ∉ F_k indicates that it is not.
Step S3: judge whether the requested transient data item is fresh or expired; if it is fresh, go to step S4; if it is expired, go to step S5.
t_age(p(f_k)) ≤ T_life(p(f_k)) indicates that the requested data item is fresh; t_age(p(f_k)) > T_life(p(f_k)) indicates that the requested data item is expired.
Regarding the cache replacement policy of the edge caching node: initially the edge caching node greedily caches all arriving data items until its cache space is full. After the cache is full, a newly arriving request is handled as follows. In Case 1 no cache replacement is needed. In Case 2 the cached data corresponding to the current request is known to be expired, so the expired data is directly replaced with the newly obtained data. In Case 3 new data arrives and the cache replacement method must decide whether to replace a cached data item with the new data item and, if so, which cached data item to replace.
Specifically, the caching action given by the caching policy for data item d_k is denoted a_k, and the action space is A = {a_0, a_1, ..., a_I}. a_k = a_0 means that no cache replacement is performed, and a_k = a_i (1 ≤ i ≤ I) means that the cached item at position i of the cache is replaced with the fresh data item d_k obtained from the data source. After the cache executes caching action a_k, D_k and F_k become D_{k+1} and F_{k+1}.
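Under the same assumptions, applying an action from A = {a_0, a_1, ..., a_I} to a slot-indexed cache can be sketched as follows; index 0 means no replacement:

```python
def apply_cache_action(slots: list, a_k: int, new_item) -> list:
    """a_k == 0: no replacement; a_k == i (1 <= i <= I): replace the item in slot i."""
    if a_k != 0:
        slots[a_k - 1] = new_item
    return slots
```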
Step S6: the edge caching node forwards the user request to the data source, reads the new data from the data source, selects the data to be replaced in its cache using deep reinforcement learning, replaces the selected data with the new data, and forwards the new data to the user.
Step S6 corresponds to Case 3, in which deep reinforcement learning is used to select the data to be replaced in the cache of the edge caching node. To this end, the overall cost C(d_k) of obtaining transient data item d_k is defined.
The overall cost C(d_k) of obtaining transient data item d_k is divided into two parts: the communication cost c(d_k) and the data age cost l(d_k).
Communications cost c (dk) calculation formula it is as follows:
Wherein, c1Indicate the communication overhead that data are directly obtained from edge caching nodes, c2It indicates to obtain number from data source According to communication overhead, c1< c2, and c1、c2It is normal number.
Data age cost l (dk) calculation formula it is as follows:
Overall cost C (dk) calculation formula it is as follows:
C(dk)=α c (dk)+(1-α)·l(dk)
Wherein, α ∈ [0,1] indicates compromise coefficient, is weighted to the importance of two kinds of costs, and biggish α indicates to use Loss of communications is more taken notice of at family, and otherwise, user more takes notice of data age.
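The following sketch computes C(d_k) under these definitions. The constants and the age-cost expression are illustrative assumptions (the l(d_k) formula is not reproduced in the text above); here the age cost is taken as the item's age relative to its effective life cycle, clipped to [0, 1]:

```python
C1 = 1.0     # communication overhead when served from the edge cache (c1 < c2)
C2 = 5.0     # communication overhead when fetched from the data source
ALPHA = 0.6  # trade-off coefficient between communication cost and data age cost

def total_cost(served_from_cache: bool, item, now: float, alpha: float = ALPHA) -> float:
    """C(d_k) = alpha * c(d_k) + (1 - alpha) * l(d_k)."""
    comm_cost = C1 if served_from_cache else C2
    # Assumed age-cost definition: age of the returned item over its life cycle, clipped.
    age_cost = min(1.0, item.age(now) / item.t_life)
    return alpha * comm_cost + (1.0 - alpha) * age_cost
```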
To optimize the overall cost of obtaining IoT data, the cache replacement problem is modeled as a Markov process. The caching operations in Case 1 and Case 2 above are deterministic rules, so only the cache replacement action in Case 3 needs to be optimized.
The Markov process is defined by the tuple {S, A, M(s_{n+1}|s_n, a_n), R(s_n, a_n)}, where S is the state set of the edge caching node of the IoT system and s_n is the state of the edge caching node at time n; A is the action set of the cache replacement policy and a_n is the caching action at time n; M(s_{n+1}|s_n, a_n) is the state transition probability matrix describing the transfer of the edge caching node state from s_n to s_{n+1} after action a_n is executed; and R(s_n, a_n) is the immediate reward function, i.e. the reward fed back by the system after action a_n is executed in state s_n. The whole cache replacement process can therefore be expressed as:
1) At time n, the edge caching node observes the system status information and obtains the system state s_n ∈ S at time n.
2) The edge caching node selects a caching action a_n according to the caching policy π(a_n|s_n) and executes it.
3) After caching action a_n is executed, the system returns an immediate reward r_n = R(s_n, a_n), and the system state changes from s_n to s_{n+1}.
4) The immediate reward r_n is fed back to the edge caching node, and the state transition <s_n, a_n, r_n, s_{n+1}> is added as a training sample to the experience pool of the deep reinforcement learning agent for training the actor-critic networks; the above process is then repeated.
Here the caching policy π(a_n|s_n) denotes the probability of selecting cache replacement action a_n in state s_n, and may be abbreviated as π.
The long-term accumulated reward of the whole process is denoted R_n = Σ_{t=0}^{∞} γ^t·r_{n+t}, where γ ∈ [0,1] is the discount factor, which determines the influence of future rewards on the current caching decision. The objective of the present invention is therefore to find the optimal caching policy π* that maximizes the expected long-term accumulated reward over all states: π* = argmax_π E[R_n | π], where E[R_n | π] denotes the expectation of the long-term accumulated reward R_n under caching policy π.
To measure the quality of a caching policy π, the value function V^π(s_n) = E[R_n | s_n; π] is defined as the expectation of the long-term accumulated reward R_n of caching policy π in state s_n. The optimal value function in state s_n can be expressed as V*(s_n) = max_π V^π(s_n).
By the Markov property, substituting V^π(s_n) into the Bellman equation gives
V^π(s_n) = Σ_{a_n} π(a_n|s_n) Σ_{s_{n+1}, r_n} p(s_{n+1}, r_n | s_n, a_n)·[r_n + γ·V^π(s_{n+1})]
where p(s_{n+1}, r_n | s_n, a_n) denotes the probability that, after action a_n is executed in state s_n, the system transfers to state s_{n+1} and obtains immediate reward r_n, and γ ∈ [0,1] is the discount factor. This equation gives the iterative computation of the value function in the Markov process.
If the state transition probability matrix M(s_{n+1}|s_n, a_n) of the Markov process were given, the optimal policy could be solved by dynamic programming. In the edge cache replacement process, however, the state transition probability matrix is unknown. The present invention therefore uses deep reinforcement learning to intelligently mine information from historical data and to fit the state-value function with a deep neural network, thereby obtaining the optimal cache replacement policy.
The present invention uses a deep reinforcement learning method to adaptively learn an efficient caching policy. The technical details of the invention are introduced below taking the actor-critic deep reinforcement learning method as an example, although the invention is not limited to the actor-critic method. Specifically:
Input: while the edge caching node continuously handles user requests, whenever Case 3 occurs, i.e. the requested data item content is not in the cache, the edge caching node forwards the request and obtains the requested data item d_n from the data source. At this point the caching policy agent observes the current environment state s_n as the input of the deep neural network. The state consists of the feature vector of the currently requested data item d_n and the feature vectors of the I data items in the cache of the edge caching node. Each feature vector records the number of requests for the corresponding data item content in each of the past J request groups, which reflects the popularity of that data item, together with the effective life cycle of the data item content and its freshness, where freshness equals the ratio of the remaining effective time of the data item content to its effective life cycle. Besides this data request information, the input may also include scenario information, edge network information, and so on.
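A possible way to assemble this state input, reusing the hypothetical DataItem sketch above; the grouping of past requests into J groups and the exact feature order are assumptions for illustration:

```python
import numpy as np

def item_features(item, request_history: list, J: int, now: float) -> np.ndarray:
    """Feature vector of one data item: per-group request counts over the past J
    request groups (popularity trend), its effective life cycle, and its freshness."""
    counts = [sum(1 for cid in group if cid == item.cid) for group in request_history[-J:]]
    counts = [0] * (J - len(counts)) + counts          # pad if fewer than J groups exist
    return np.array(counts + [item.t_life, item.freshness(now)], dtype=np.float32)

def build_state(requested_item, cached_items: list, request_history: list, J: int, now: float) -> np.ndarray:
    """s_n: features of the requested item d_n followed by features of the I cached items."""
    rows = [item_features(requested_item, request_history, J, now)]
    rows += [item_features(it, request_history, J, now) for it in cached_items]
    return np.concatenate(rows)
```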
Policy: the present invention characterizes the replacement policy π_θ(a_n|s_n), also written π(a_n|s_n; θ) and abbreviated π_θ, with an actor network whose parameter is θ. The input of the actor network is the status information s_n of the edge caching node at time n, and the output of the actor network is the probability of selecting each caching action. The deep reinforcement learning agent finally selects a cache replacement action a_n according to the policy π_θ(a_n|s_n), i.e. according to the output action probabilities, and replaces the data at the corresponding cache position. For example, suppose the edge caching node has 3 cache positions (labeled 1, 2, 3, with 0 meaning no replacement) and the actor network outputs the probabilities (0.1, 0.1, 0.7, 0.1). The replacement position is then selected according to this distribution, so the most likely output is a_2, that is, the item at cache position 2 is replaced. The goal of the actor network is to output cache replacement actions according to the learned caching policy so as to maximize the expected long-term accumulated reward E[R_n | π_θ].
State-value function: the present invention characterizes the state-value function V(s_n; θ_v) with a critic network whose parameter is θ_v. The input of the critic network is the status information s_n of the edge caching node at time n, and the output of the critic network is the value of that state under the current policy π_θ. The goal of the critic network is to estimate as accurately as possible the value of state s_n under policy π_θ.
Policy training: each time a cache replacement action a_n is executed, the system feeds back an immediate reward to the caching policy decision module:
r_n = −Σ_{k∈Req_n} C(d_k)
where Req_n denotes the set of all data requests received by the edge caching node between the execution of caching action a_n and the execution of the next caching action a_{n+1}, and C(d_k) is the overall cost of obtaining data d_k. Because the objective is to minimize the long-term cost of obtaining data, a negative sign is placed before the cost.
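A small sketch of this reward computation, reusing the hypothetical total_cost helper above; each served request is assumed to be logged as a (served_from_cache, item, time) record:

```python
def immediate_reward(served_requests: list) -> float:
    """r_n = - sum of C(d_k) over all requests served between action a_n and a_{n+1}."""
    return -sum(total_cost(hit, item, t) for hit, item, t in served_requests)
```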
According to the Bellman equation and the state transition trajectory of the Markov process, the expectation of the long-term total reward can be estimated, and the network parameters are learned by gradient updates. The gradient of the expected total reward with respect to the actor network parameter θ can be computed as
∇_θ E[R_n | π_θ] ≈ ∇_θ log π(a_n|s_n; θ)·A(s_n, a_n).
To maximize the expected total reward, the actor network parameter θ is updated by gradient ascent as
θ ← θ + λ·∇_θ log π(a_n|s_n; θ)·A(s_n, a_n)
where λ is the learning rate of the actor network, which can be adjusted according to the actual situation, ∇ denotes the gradient operator, and the advantage function A(s_n, a_n) = r_n + γ·V(s_{n+1}; θ_v) − V(s_n; θ_v) measures how good it is to select action a_n in state s_n.
The critic network can be trained by the temporal-difference method, with the loss function set to the squared error between the output V(s_n; θ_v) of the critic network and the target value r_n + γ·V(s_{n+1}; θ_v). The parameter θ_v of the critic network is updated by gradient descent as
θ_v ← θ_v − λ′·∇_{θ_v}(r_n + γ·V(s_{n+1}; θ_v) − V(s_n; θ_v))²
where λ′ is the learning rate of the critic network, which can be adjusted according to the actual situation, and ∇ denotes the gradient operator.
To deal with the exploration-exploitation dilemma in reinforcement learning ("exploitation" means taking the best action learned so far, while "exploration" means fully exploring the action space by trying currently non-optimal actions) and to prevent the learned policy from falling into a local optimum, the technical solution adds the entropy of the policy (the action probability distribution output by the actor network) to the update of the actor network parameter θ as a regularization term, thereby encouraging exploration:
θ ← θ + λ·[∇_θ log π(a_n|s_n; θ)·A(s_n, a_n) + β·∇_θ H(π(·|s_n; θ))]
where H(π(·|s_n; θ)) is the entropy of the action distribution output by policy π_θ in state s_n, β is the exploration coefficient, and ∇ denotes the gradient operator. The entropy is computed as
H(π(·|s_n; θ)) = −Σ_{a∈A} π(a|s_n; θ)·log π(a|s_n; θ).
With gradient ascent, θ is updated in the direction of increasing entropy, which encourages exploration. The exploration coefficient β is a positive number that balances the degree of exploration and exploitation; a larger β encourages more exploration, and β can be adjusted as needed in a specific implementation.
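The actor-critic update with advantage and entropy regularization described above can be sketched in PyTorch as follows. Network sizes, the discount factor, the exploration coefficient and the use of a single optimizer (rather than the separate learning rates λ and λ′ above) are illustrative choices, not taken from the patent:

```python
import torch
import torch.nn as nn

class ActorCritic(nn.Module):
    """Shared body with an actor head (action logits) and a critic head (state value)."""
    def __init__(self, state_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.actor = nn.Linear(hidden, num_actions)   # logits over {a_0, ..., a_I}
        self.critic = nn.Linear(hidden, 1)            # V(s; theta_v)

    def forward(self, s):
        h = self.body(s)
        return self.actor(h), self.critic(h).squeeze(-1)

def train_step(model, optimizer, s_n, a_n, r_n, s_next, gamma=0.95, beta=0.01):
    """One update on a single transition <s_n, a_n, r_n, s_n+1>."""
    logits, v_n = model(s_n)
    with torch.no_grad():
        _, v_next = model(s_next)
    advantage = (r_n + gamma * v_next - v_n).detach()    # A(s_n, a_n), treated as a constant
    dist = torch.distributions.Categorical(logits=logits)
    actor_loss = -dist.log_prob(a_n) * advantage          # gradient ascent on log pi * A
    critic_loss = (r_n + gamma * v_next - v_n).pow(2)     # squared TD error
    entropy = dist.entropy()                               # H(pi(.|s_n)) encourages exploration
    loss = actor_loss + critic_loss - beta * entropy
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)

# Illustrative usage: during interaction, the action is sampled from the policy output,
# e.g. a_n = torch.distributions.Categorical(logits=model(s_n)[0]).sample(),
# and each transition <s_n, a_n, r_n, s_n+1> is passed to train_step.
```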
The caching policy supports both online learning and offline learning. With online learning, the policy can be deployed directly at the edge caching node, which learns the caching policy from the IoT request data it handles and periodically updates the network parameters. With offline learning, the caching policy is first pre-trained offline and then deployed to the edge caching node, where it remains unchanged.
The present invention further provides a cache replacement system for transient data of the Internet of Things, the system including a state judgment module, a read module, a request forwarding module and a cache replacement module.
The state judgment module is configured to judge the state of the transient data requested by the user, the state being one of: state one, the content of the requested transient data item is in the cache of the edge caching node and the requested transient data item is fresh; state two, the content of the requested transient data item is in the cache of the edge caching node and the requested transient data item is expired; state three, the content of the requested transient data item is not in the cache of the edge caching node.
The read module is configured to, when the judgment result of the state judgment module is state one, read the data directly from the cache of the edge caching node and forward it to the user.
The request forwarding module is configured to, when the judgment result of the state judgment module is state two or state three, forward the user request to the data source and read the new data from the data source.
The cache replacement module is configured to, when the judgment result of the state judgment module is state two, replace the expired data in the cache of the edge caching node with the new data read by the request forwarding module and forward the new data to the user; and, when the judgment result of the state judgment module is state three, select the data to be replaced in the cache of the edge caching node using deep reinforcement learning, replace the selected data with the new data read by the request forwarding module, and forward the new data to the user.
The system further includes a training module which, each time a cache replacement action is executed, collects the cache replacement action, the immediate reward brought about by the replacement action and the states of the edge caching node before and after the replacement, and trains the network parameters of the deep reinforcement learning agent in the cache replacement module based on these parameters.
The above are merely preferred embodiments of the present application and are not intended to limit its scope of protection. Any modification or substitution that can readily be conceived by a person skilled in the art within the technical scope disclosed in the present application shall fall within the scope of protection of the present application. The scope of protection of the present application shall therefore be defined by the claims.

Claims (10)

1. A cache replacement method for transient data of the Internet of Things, characterized in that the cache space of the current edge caching node is full and the method comprises the following steps:
S1. the edge caching node receives a new transient data item request issued by a user;
S2. judging whether the content of the requested transient data item is in the cache of the edge caching node; if so, going to step S3, otherwise going to step S6;
S3. judging whether the requested transient data item is fresh or expired; if it is fresh, going to step S4; if it is expired, going to step S5;
S4. reading the data directly from the cache of the edge caching node and forwarding it to the user;
S5. the edge caching node forwarding the user request to the data source, reading the new data from the data source, replacing the expired data in the cache of the edge caching node with the new data, and forwarding the new data to the user;
S6. the edge caching node forwarding the user request to the data source, reading the new data from the data source, selecting the data to be replaced in the cache of the edge caching node using deep reinforcement learning, replacing the selected data with the new data, and forwarding the new data to the user.
2. The cache replacement method according to claim 1, characterized in that, in step S2, if f_k ∈ F_k the requested data item content is in the cache of the edge caching node, and if f_k ∉ F_k the requested data item content is not in the cache of the edge caching node, where f_k is the unique content identifier (CID) of the data content corresponding to request k and F_k is the set of CIDs of the data items cached in the edge caching node when request k arrives.
3. The cache replacement method according to claim 1, characterized in that, in step S3, if t_age(p(f_k)) ≤ T_life(p(f_k)) the requested data item is fresh, and if t_age(p(f_k)) > T_life(p(f_k)) the requested data item is expired, where f_k is the CID corresponding to request k, p(·) is the mapping function from the requested CID to the cached data item, T_life(·) is the effective life cycle of a data item, and t_age(·) is the age of a data item.
4. The cache replacement method according to claim 1, characterized in that selecting the data to be replaced in the cache of the edge caching node using deep reinforcement learning in step S6 specifically comprises:
1) at time n, observing the status information of the edge caching node to obtain the state s_n;
2) selecting a caching action a_n according to the caching policy π(a_n|s_n) and executing the caching action;
3) after executing the caching action a_n, computing the immediate reward r_n, the status information of the edge caching node changing from s_n to s_{n+1};
4) feeding the immediate reward r_n back to the edge caching node, using the state transition <s_n, a_n, r_n, s_{n+1}> as a training sample for training the actor-critic networks of the deep reinforcement learning agent, and repeating the above process.
5. The cache replacement method according to claim 4, characterized in that the immediate reward r_n is calculated as
r_n = −Σ_{k∈Req_n} C(d_k)
where Req_n denotes the set of all data requests received by the edge caching node between the execution of caching action a_n and the execution of the next caching action a_{n+1}, and C(d_k) is the overall cost of obtaining data d_k.
6. The cache replacement method according to claim 5, characterized in that the overall cost C(d_k) is calculated as
C(d_k) = α·c(d_k) + (1−α)·l(d_k)
where α ∈ [0,1] is a trade-off coefficient, c(d_k) is the communication cost and l(d_k) is the data age cost; the communication cost equals c_1 when the data is obtained directly from the edge caching node and c_2 when the data is obtained from the data source, with c_1 < c_2 and c_1, c_2 positive constants; f_k is the CID corresponding to request k, F_k is the set of CIDs of the data items cached in the edge caching node when request k arrives, p(·) is the mapping function from the requested CID to the cached data item, T_life(·) is the effective life cycle of a data item, and t_age(·) is the age of a data item.
7. The cache replacement method according to claim 4, characterized in that the actor network parameter θ of the deep reinforcement learning agent is updated by gradient ascent as
θ ← θ + λ·∇_θ log π(a_n|s_n; θ)·A(s_n, a_n)
where λ is the learning rate of the actor network, ∇ denotes the gradient operator, the policy π(a_n|s_n; θ) denotes the probability of selecting cache replacement action a_n in state s_n, A(s_n, a_n) = r_n + γ·V(s_{n+1}; θ_v) − V(s_n; θ_v) is the advantage function, γ ∈ [0,1] is the discount factor, and V(s; θ_v) is the state-value function;
and the parameter θ_v of the critic network of the deep reinforcement learning agent is updated by gradient descent as
θ_v ← θ_v − λ′·∇_{θ_v}(r_n + γ·V(s_{n+1}; θ_v) − V(s_n; θ_v))²
where λ′ is the learning rate of the critic network.
8. The cache replacement method according to claim 4, characterized in that the actor network parameter θ of the deep reinforcement learning agent is updated by gradient ascent as
θ ← θ + λ·[∇_θ log π(a_n|s_n; θ)·A(s_n, a_n) + β·∇_θ H(π(·|s_n; θ))]
where λ is the learning rate of the actor network, ∇ denotes the gradient operator, the policy π(a_n|s_n; θ) denotes the probability of selecting cache replacement action a_n in state s_n, A(s_n, a_n) = r_n + γ·V(s_{n+1}; θ_v) − V(s_n; θ_v) is the advantage function, γ ∈ [0,1] is the discount factor, V(s; θ_v) is the state-value function, H(π(·|s_n; θ)) is the entropy of the action distribution output by policy π_θ in state s_n, and β is the exploration coefficient;
and the parameter θ_v of the critic network of the deep reinforcement learning agent is updated by gradient descent as
θ_v ← θ_v − λ′·∇_{θ_v}(r_n + γ·V(s_{n+1}; θ_v) − V(s_n; θ_v))²
where λ′ is the learning rate of the critic network.
9. A cache replacement system for transient data of the Internet of Things, characterized in that the cache space of the current edge caching node is full and the system comprises a state judgment module, a read module, a request forwarding module and a cache replacement module;
the state judgment module is configured to judge the state of the transient data requested by the user, the state being one of: state one, the content of the requested transient data item is in the cache of the edge caching node and the requested transient data item is fresh; state two, the content of the requested transient data item is in the cache of the edge caching node and the requested transient data item is expired; state three, the content of the requested transient data item is not in the cache of the edge caching node;
the read module is configured to, when the judgment result of the state judgment module is state one, read the data directly from the cache of the edge caching node and forward it to the user;
the request forwarding module is configured to, when the judgment result of the state judgment module is state two or state three, forward the user request to the data source and read the new data from the data source;
the cache replacement module is configured to, when the judgment result of the state judgment module is state two, replace the expired data in the cache of the edge caching node with the new data read by the request forwarding module and forward the new data to the user; and, when the judgment result of the state judgment module is state three, select the data to be replaced in the cache of the edge caching node using deep reinforcement learning, replace the selected data with the new data read by the request forwarding module, and forward the new data to the user.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the cache replacement method according to any one of claims 1 to 8.
CN201811370683.4A 2018-11-17 2018-11-17 Cache replacement method and system for transient data of Internet of things Active CN109660598B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811370683.4A CN109660598B (en) 2018-11-17 2018-11-17 Cache replacement method and system for transient data of Internet of things

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811370683.4A CN109660598B (en) 2018-11-17 2018-11-17 Cache replacement method and system for transient data of Internet of things

Publications (2)

Publication Number Publication Date
CN109660598A true CN109660598A (en) 2019-04-19
CN109660598B CN109660598B (en) 2020-05-19

Family

ID=66111253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811370683.4A Active CN109660598B (en) 2018-11-17 2018-11-17 Cache replacement method and system for transient data of Internet of things

Country Status (1)

Country Link
CN (1) CN109660598B (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110456647A (en) * 2019-07-02 2019-11-15 珠海格力电器股份有限公司 A kind of intelligent home furnishing control method and intelligent home control device
CN111277666A (en) * 2020-02-21 2020-06-12 南京邮电大学 Online collaborative caching method based on freshness
CN111292001A (en) * 2020-02-24 2020-06-16 清华大学深圳国际研究生院 Joint decision method and device based on reinforcement learning
CN113038616A (en) * 2021-03-16 2021-06-25 电子科技大学 Frequency spectrum resource management and allocation method based on federal learning
CN113055721A (en) * 2019-12-27 2021-06-29 中国移动通信集团山东有限公司 Video content distribution method and device, storage medium and computer equipment
CN113115362A (en) * 2021-04-16 2021-07-13 三峡大学 Cooperative edge caching method and device
CN113115368A (en) * 2021-04-02 2021-07-13 南京邮电大学 Base station cache replacement method, system and storage medium based on deep reinforcement learning
CN113395333A (en) * 2021-05-31 2021-09-14 电子科技大学 Multi-edge base station joint cache replacement method based on intelligent agent depth reinforcement learning
CN113438315A (en) * 2021-07-02 2021-09-24 中山大学 Internet of things information freshness optimization method based on dual-network deep reinforcement learning
CN113630742A (en) * 2020-08-05 2021-11-09 北京航空航天大学 Mobile edge cache replacement method adopting request rate and dynamic property of information source issued content
CN113676513A (en) * 2021-07-15 2021-11-19 东北大学 Deep reinforcement learning-driven intra-network cache optimization method
WO2021253168A1 (en) * 2020-06-15 2021-12-23 Alibaba Group Holding Limited Managing data stored in a cache using a reinforcement learning agent
CN114170560A (en) * 2022-02-08 2022-03-11 深圳大学 Multi-device edge video analysis system based on deep reinforcement learning
CN115914388A (en) * 2022-12-14 2023-04-04 广东信通通信有限公司 Resource data fresh-keeping method based on monitoring data acquisition

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090070533A1 (en) * 2007-09-07 2009-03-12 Edgecast Networks, Inc. Content network global replacement policy
CN102291447A (en) * 2011-08-05 2011-12-21 中国电信股份有限公司 Content distribution network load scheduling method and system
CN106452919A (en) * 2016-11-24 2017-02-22 济南浪潮高新科技投资发展有限公司 Fog node optimization method based on fuzzy theory
CN106888270A (en) * 2017-03-30 2017-06-23 网宿科技股份有限公司 Return the method and system of source routing scheduling
CN107479829A (en) * 2017-08-03 2017-12-15 杭州铭师堂教育科技发展有限公司 A kind of Redis cluster mass datas based on message queue quickly clear up system and method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090070533A1 (en) * 2007-09-07 2009-03-12 Edgecast Networks, Inc. Content network global replacement policy
CN102291447A (en) * 2011-08-05 2011-12-21 中国电信股份有限公司 Content distribution network load scheduling method and system
CN106452919A (en) * 2016-11-24 2017-02-22 济南浪潮高新科技投资发展有限公司 Fog node optimization method based on fuzzy theory
CN106888270A (en) * 2017-03-30 2017-06-23 网宿科技股份有限公司 Return the method and system of source routing scheduling
CN107479829A (en) * 2017-08-03 2017-12-15 杭州铭师堂教育科技发展有限公司 A kind of Redis cluster mass datas based on message queue quickly clear up system and method

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110456647A (en) * 2019-07-02 2019-11-15 珠海格力电器股份有限公司 A kind of intelligent home furnishing control method and intelligent home control device
CN113055721B (en) * 2019-12-27 2022-12-09 中国移动通信集团山东有限公司 Video content distribution method and device, storage medium and computer equipment
CN113055721A (en) * 2019-12-27 2021-06-29 中国移动通信集团山东有限公司 Video content distribution method and device, storage medium and computer equipment
WO2021164378A1 (en) * 2020-02-21 2021-08-26 南京邮电大学 Freshness based online cooperative caching method
CN111277666A (en) * 2020-02-21 2020-06-12 南京邮电大学 Online collaborative caching method based on freshness
CN111277666B (en) * 2020-02-21 2021-06-01 南京邮电大学 Online collaborative caching method based on freshness
CN111292001A (en) * 2020-02-24 2020-06-16 清华大学深圳国际研究生院 Joint decision method and device based on reinforcement learning
CN111292001B (en) * 2020-02-24 2023-06-02 清华大学深圳国际研究生院 Combined decision method and device based on reinforcement learning
CN115398877B (en) * 2020-06-15 2024-03-26 阿里巴巴集团控股有限公司 Managing data stored in a cache using reinforcement learning agents
WO2021253168A1 (en) * 2020-06-15 2021-12-23 Alibaba Group Holding Limited Managing data stored in a cache using a reinforcement learning agent
CN115398877A (en) * 2020-06-15 2022-11-25 阿里巴巴集团控股有限公司 Managing data stored in a cache using reinforcement learning agents
CN113630742A (en) * 2020-08-05 2021-11-09 北京航空航天大学 Mobile edge cache replacement method adopting request rate and dynamic property of information source issued content
CN113038616A (en) * 2021-03-16 2021-06-25 电子科技大学 Frequency spectrum resource management and allocation method based on federal learning
CN113115368A (en) * 2021-04-02 2021-07-13 南京邮电大学 Base station cache replacement method, system and storage medium based on deep reinforcement learning
CN113115362A (en) * 2021-04-16 2021-07-13 三峡大学 Cooperative edge caching method and device
CN113395333B (en) * 2021-05-31 2022-03-25 电子科技大学 Multi-edge base station joint cache replacement method based on intelligent agent depth reinforcement learning
CN113395333A (en) * 2021-05-31 2021-09-14 电子科技大学 Multi-edge base station joint cache replacement method based on intelligent agent depth reinforcement learning
CN113438315A (en) * 2021-07-02 2021-09-24 中山大学 Internet of things information freshness optimization method based on dual-network deep reinforcement learning
CN113676513B (en) * 2021-07-15 2022-07-01 东北大学 Intra-network cache optimization method driven by deep reinforcement learning
CN113676513A (en) * 2021-07-15 2021-11-19 东北大学 Deep reinforcement learning-driven intra-network cache optimization method
CN114170560A (en) * 2022-02-08 2022-03-11 深圳大学 Multi-device edge video analysis system based on deep reinforcement learning
CN115914388A (en) * 2022-12-14 2023-04-04 广东信通通信有限公司 Resource data fresh-keeping method based on monitoring data acquisition

Also Published As

Publication number Publication date
CN109660598B (en) 2020-05-19

Similar Documents

Publication Publication Date Title
CN109660598A (en) A kind of buffer replacing method and system of Internet of Things Temporal Data
Zhu et al. Caching transient data for Internet of Things: A deep reinforcement learning approach
CN109639760B (en) It is a kind of based on deeply study D2D network in cache policy method
Huang et al. FedParking: A federated learning based parking space estimation with parked vehicle assisted edge computing
Tong et al. Adaptive computation offloading and resource allocation strategy in a mobile edge computing environment
Zhang et al. Joint optimization of cooperative edge caching and radio resource allocation in 5G-enabled massive IoT networks
CN114143891A (en) FDQL-based multi-dimensional resource collaborative optimization method in mobile edge network
CN112866015B (en) Intelligent energy-saving control method based on data center network flow prediction and learning
Archi et al. Applications of Deep Reinforcement Learning in Wireless Networks-A Recent Review
Hribar et al. Utilising correlated information to improve the sustainability of internet of things devices
CN110290510A (en) Support the edge cooperation caching method under the hierarchical wireless networks of D2D communication
Shaghluf et al. Spectrum and energy efficiency of cooperative spectrum prediction in cognitive radio networks
CN116346837A (en) Internet of things edge collaborative caching method based on deep reinforcement learning
Samikwa et al. Adaptive early exit of computation for energy-efficient and low-latency machine learning over iot networks
Somesula et al. Cooperative cache update using multi-agent recurrent deep reinforcement learning for mobile edge networks
Jiang et al. A reinforcement learning-based computing offloading and resource allocation scheme in F-RAN
Xiong et al. Distributed caching in converged networks: A deep reinforcement learning approach
Zhang et al. Dual-timescale resource allocation for collaborative service caching and computation offloading in IoT systems
CN114554495A (en) Federal learning-oriented user scheduling and resource allocation method
Shui et al. Cell-free networking for integrated data and energy transfer: Digital twin based double parameterized dqn for energy sustainability
Sun et al. Knowledge-driven deep learning paradigms for wireless network optimization in 6G
CN117473616A (en) Railway BIM data edge caching method based on multi-agent reinforcement learning
Su et al. Outage performance analysis and resource allocation algorithm for energy harvesting D2D communication system
CN116009990B (en) Cloud edge collaborative element reinforcement learning computing unloading method based on wide attention mechanism
He [Retracted] Application of Neural Network Sample Training Algorithm in Regional Economic Management

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant