CN109617991A - Coded cooperative caching method for small stations in ultra-dense heterogeneous networks based on value function approximation - Google Patents
Coded cooperative caching method for small stations in ultra-dense heterogeneous networks based on value function approximation
- Publication number
- CN109617991A CN109617991A CN201811634918.6A CN201811634918A CN109617991A CN 109617991 A CN109617991 A CN 109617991A CN 201811634918 A CN201811634918 A CN 201811634918A CN 109617991 A CN109617991 A CN 109617991A
- Authority
- CN
- China
- Prior art keywords
- state
- small station
- time slot
- base station
- function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/06—Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W28/00—Network traffic management; Network resource management
- H04W28/02—Traffic management, e.g. flow control or congestion control
- H04W28/10—Flow control between communication endpoints
- H04W28/14—Flow control between communication endpoints using intermediate storage
Abstract
The invention discloses a coded cooperative caching method for small stations in ultra-dense heterogeneous networks based on value function approximation. A reinforcement learning method with value function approximation is adopted: the value function is expressed as a function of state and action, and the optimization objective is to maximize the average accumulated number of file requests served directly by the small stations. Through continuous interaction with the environment, the method adapts to the environment's dynamic changes, mines the latent transfer pattern of file requests, and obtains an approximate expression of the value function, from which a cooperative caching decision matched to the file-request transfer pattern is derived. The macro base station encodes the cooperative caching decision and delivers the coded cooperative caching result to each small station. Because the present invention mines the transfer pattern of file requests in the live network through reinforcement learning to formulate caching decisions, it requires no assumption about the prior distribution of the data and is therefore better suited to real systems. Moreover, by interacting with the environment in real time, it can track time-varying file popularity and make the corresponding caching policies; the process is simple and feasible and requires no solution of an NP-hard problem.
Description
Technical field
The invention belongs to the technical field of wireless network deployment in mobile communications, and in particular relates to a coded cooperative caching method for small stations in ultra-dense heterogeneous networks.
Background technique
With the popularization of intelligent terminals and the development of Internet services, ultra-dense heterogeneous networks are expected to become one of the key technologies of the fifth-generation mobile communication system (5G) in order to satisfy users' requirements for high data rates and high quality of service. By densely deploying small stations within macro base station coverage, the communication quality of network-edge users can be effectively improved, thereby raising spectral efficiency and system throughput. However, since the small stations connect to the macro base station over wireless backhaul links, densely deployed small stations put enormous pressure on those links, and the heavily loaded wireless backhaul becomes the network bottleneck. The ultra-dense network architecture urgently needs to be combined with other network architectures or technologies to serve users better, and mobile network edge computing is a suitable choice. Edge storage is an important concept in the mobile edge architecture: caching files at the small stations reduces the massive data transfers of peak periods, effectively relieving the system backhaul load, lowering transmission delay, and improving user experience. In an ultra-dense heterogeneous network the small stations are numerous and closely spaced, and a user usually lies within the coverage of several small stations; if the small stations transmit files to the user cooperatively, their limited cache space can be utilized more fully. The edge caching problem in ultra-dense heterogeneous networks is therefore worth deep study.
Existing caching techniques usually model the caching decision as an optimization problem. First, they often assume that file popularity does not change over time, whereas popularity in a real network changes constantly; methods that solve an optimization problem under constant popularity cannot track this continuous variation, so the resulting caching decisions do not fit the real network well. Second, even if constant popularity is replaced by instantaneous popularity, the optimization problem must be re-solved every time popularity changes, incurring huge network overhead; moreover the modeled problem is often NP-hard (non-deterministic polynomial-time hard) and extremely difficult to solve. Finally, caching inherently makes decisions based on file-request behavior that has already occurred in the network in order to prepare for requests yet to come; methods that formulate caching decisions by solving a conventional optimization problem cannot mine the transfer pattern of file requests, so the resulting decisions are not optimal for the file requests about to occur.
Summary of the invention
In order to solve the technical problems raised in the background above, the present invention provides a coded cooperative caching method for small stations in ultra-dense heterogeneous networks based on value function approximation; the value-function-approximation method mines the latent transfer pattern of file requests and obtains a cooperative caching policy superior to conventional methods.
In order to achieve the above technical purposes, the technical solution of the present invention is as follows:
In the coded cooperative caching method for small stations in ultra-dense heterogeneous networks based on value function approximation, a macro base station together with the small stations within its coverage acts as the machine: the macro base station determines the small-station action to be executed in each time slot's state and distributes it to every small station, and each small station executes the action. The state comprises the file popularity of the current time slot and the cooperative caching decision made in the previous time slot; the action is the cooperative caching decision made in the current time slot to serve the file requests of the next time slot. A reinforcement learning method with value function approximation is adopted: the value function is expressed as a function of state and action, the optimization objective is to maximize the average accumulated number of file requests served directly by the small stations, and through continuous interaction with the environment the method adapts to its dynamic changes, mines the latent file-request transfer pattern, obtains an approximate expression of the value function, and thereby obtains a cooperative caching decision matched to that pattern. The macro base station encodes the cooperative caching decision and delivers the coded cooperative caching result to each small station.
Further, the method comprises the following steps:
Step 1: acquire the network information and set parameters.
Acquire the macro base station set M, the small station set P, and the file request set C1 in the network, and the number of small stations p_m, m ∈ M, within the coverage of the m-th macro base station. Obtain the small station cache space K, which the operator determines according to network operation conditions and hardware cost. The operator divides one period into T time slots according to the file-request situation in the ultra-dense heterogeneous network and sets the start time of each slot; each slot is divided in order of occurrence into three phases: the file transmission phase, the information exchange phase, and the caching decision phase.
Step 2: formulate the base station cooperative caching scheme based on MDS coding.
Denote the cooperative caching decision vector of the small stations by a(t); each element a_c(t) ∈ [0,1], c ∈ C1, represents the fraction of the c-th file cached at a small station in slot t. The set of files with a_c(t) ≠ 0 is the set of files cached in slot t, denoted C'(t). The c-th file contains B information bits, from which the m-th macro base station generates check bits by MDS coding:
In the above formula, d is the number of small stations whose received signal power exceeds a threshold, which the operator determines according to network operation conditions. All generated check bits are divided into two parts, small station candidate bits and macro base station candidate bits; the small station candidate bits comprise p_m·B bits, i.e., each small station has B mutually non-overlapping candidate bits, and in slot t each small station selects the first a_c(t)·B bits from its own candidate bits to cache.
The macro base station arbitrarily selects (1 − d·a_c(t))·B bits from its candidate bits to cache. By the MDS coding property, a file request that obtains at least B check bits can recover the entire file.
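As an illustration of the bookkeeping in steps 2 and 3, the following sketch computes how many coded bits a request gathers at the edge and how many must cross the backhaul. It is only a toy accounting of the MDS property (any B distinct check bits recover the file), not an actual MDS encoder; the function name and the numeric values are illustrative.

```python
def cache_allocation(B, d, a_c):
    """Toy accounting for the MDS-coded split: each of the d covering small
    stations caches a_c*B of its own candidate check bits, so a request
    collects d*a_c*B coded bits at the edge; by the MDS property any B
    distinct check bits recover the file, so the macro base station only
    ships the shortfall (the backhaul link load of step 3)."""
    from_edge = min(B, d * a_c * B)         # bits served directly by small stations
    from_macro = max(0.0, B - d * a_c * B)  # bits the macro must transmit
    return from_edge, from_macro

edge, macro = cache_allocation(B=1000, d=2, a_c=0.25)
# 2 stations x 250 bits = 500 from the edge, 500 over the backhaul
```

When d·a_c ≥ 1 the shortfall is zero and the request is served entirely by the small stations, which is exactly the event the reward in step 4 counts.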
Step 3: formulate the base station cooperative transmission scheme.
Each user file request first obtains d·a_c(t)·B bits from the d small stations covering the user. If d·a_c(t) ≥ 1, the macro base station need not transmit any data; otherwise the macro base station selects from the d small stations the one closest to the user and transmits (1 − d·a_c(t))·B bits to that small station, which forwards them to the user. The data transmitted by the macro base station is called the backhaul link load.
Step 4: describe the reinforcement learning task as a Markov decision process (MDP).
Establish the reinforcement learning four-tuple ⟨X, A, P, R⟩, where X is the state space, A the action space, P the state transition probability (the probability of transferring to state x' when action a is executed in state x), and R the reward brought by the transition.
The concrete form of the reinforcement learning four-tuple is as follows.
Action space: since the number of elements in the caching decision vector equals the number of elements C of the set C1, the action space is a C-dimensional continuous space. Each dimension a_c(t) is quantized into L discrete values, L being determined by the operator according to the macro station's computing capability. The discretized action space is then A = {a_1, a_2, …, a_|A|}, where any action vector a_j, j ∈ {1, 2, …, |A|}, must satisfy the cache-space constraint; the total number of action vectors satisfying the constraint is |A|, and the caching decision of slot t satisfies a(t) ∈ A.
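For small C and L the discretized action space can be enumerated directly. The patent's feasibility constraint is given by a formula not reproduced in this text; the sketch below assumes it is the natural cache-space constraint that the cached fractions sum to at most K.

```python
from itertools import product

def feasible_actions(C, L, K):
    """Enumerate the discretized action space A: each of the C dimensions is
    quantized to multiples of 1/L, and a vector is kept when the (assumed)
    cache-space constraint sum(a) <= K holds."""
    grid = [i / L for i in range(L + 1)]
    return [a for a in product(grid, repeat=C) if sum(a) <= K + 1e-9]

A = feasible_actions(C=2, L=2, K=1)
# with C=2, L=2, K=1 there are |A| = 6 feasible action vectors
```

|A| grows as (L+1)^C, which is one reason the patent later avoids Q-table methods and approximates the Q function instead.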
State space: in slot t, the total numbers of file requests at the p_m small stations within the m-th macro station's coverage are recorded as the vector N(t) = [N_1(t), N_2(t), …, N_C(t)], and the overall file popularity as the vector Θ(t) = [θ_1(t), θ_2(t), …, θ_C(t)], where θ_c(t) = N_c(t)/∑_{c'∈C1} N_{c'}(t), c ∈ C1. The state of slot t is then x(t) = [Θ(t), a(t−1)]. Let H = {Θ_1, Θ_2, …, Θ_|H|} be the set of overall file popularities; after quantization, Θ(t) is an element of the set H, and the state space is X = {x_1, x_2, …, x_|H||A|}, with x(t) ∈ X.
State transition probability: after action a(t) is executed in slot t and applied to the current state x(t), the environment moves from the current state to the next state x(t+1) with a latent transition probability, which is unknown.
Reward: while transferring to x(t+1), the environment gives the machine a reward, defined here as the number of file requests served directly by the small stations:
In the above formula, u[·] is the step function; one term counts the number of files that must be transmitted to update the small station caches in the caching decision phase of slot t, and the other the number of files transmitted by the macro base station in the information exchange phase of slot (t+1).
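The reward formula itself is not reproduced in this text, so the sketch below keeps only its leading part: it counts the requests that the small stations serve without macro help via the step function u[·]. The cache-update and macro-transmission penalty terms are omitted, and the boundary convention d·a_c ≥ 1 is an assumption.

```python
def reward_sketch(N_next, a, d):
    """Count the file requests served directly by the small stations: the
    request count N_next[c] contributes when the cached fraction satisfies
    d*a_c >= 1 (the u[.] step); the penalty terms of the patent's full
    reward formula are omitted in this sketch."""
    u = lambda x: 1 if x >= 0 else 0       # step function u[.] (assumed boundary)
    return sum(n * u(d * ac - 1) for n, ac in zip(N_next, a))

r = reward_sketch(N_next=[10, 5, 3], a=[0.5, 0.2, 0.0], d=2)
# only file 0 has d*a_c >= 1, so r = 10
```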
Step 5: state the reinforcement learning objective.
Define the deterministic policy function π(x), x ∈ X; under this policy the action to be executed in state x(t) is known, a(t) = π(x(t)). The state value function is then:
In the above formula, the value represents the accumulated reward obtained by following policy π from state x(t), and the discount factor 0 ≤ γ < 1 measures the degree to which the action π(x(t)) executed in slot t influences future states.
With the state value function in hand, the state-action value function, i.e., the Q function, is obtained:
In the above formula, the value represents the accumulated reward obtained by executing action a'(t) from state x(t) and then following policy π.
Replacing x(t), x(t+1), a'(t) by x, x', a respectively, the objective is to find the policy that maximizes the expected accumulated reward, denoted π*(x); under the optimal policy the optimal value function is obtained:
That is:
Step 6: formulate the Q-learning process based on value function approximation.
(601) Represent the Q function by value function approximation, i.e., express the Q function as a function of state and action. Inspired by the instantaneous reward, when action a'(t) is executed in state x(t) the Q function is approximated as:
In the above formula, ω_1 and ω_2 are the weights of the two parts, with ω_1 ≫ ω_2; β, η_i, and ξ_i are unknown parameters to be obtained by learning.
(602) Solve for the cooperative caching decision.
(603) Establish the target of Q-learning:
According to the above formula, the true value of the accumulated reward brought by executing action a(t) in state x(t) is calculated:
In the above formula, the second term uses the action estimated in state x(t+1).
(604) Define the loss function:
In the above formula, η = [η_1, η_2, …, η_C], ξ = [ξ_1, ξ_2, …, ξ_C], and E_π denotes expectation with respect to policy π. The parameters β, η, ξ are updated according to the loss function.
Step 7: set the current slot t = 1, randomly set the initial state x(t) = [Θ(t), a(t−1)], and set the initial parameter values β_p = 0, η_p = 0, ξ_p = 0. The operator sets the value of γ, in the range [0, 1), according to the speed of network change, determines the update step size δ, in the range (0, 1], according to the order of magnitude of the parameters being updated, and sets the number of training slots t_total according to the network size.
Step 8: in the caching decision phase of slot t, use the ε-greedy strategy to choose the cooperative caching decision a(t) to be executed in state x(t).
Step 9: the macro base station MDS-encodes the files to be cached according to step 2 and transmits the coded packets to the small stations for caching.
Step 10: in the file transmission phase of slot t+1, users request files and the base stations serve them by cooperative transmission according to step 3.
Step 11: in the information exchange phase of slot t+1, all small stations within each macro base station's coverage report their file request counts of slot t+1 to the macro base station; the macro base station aggregates the total request counts into the vector N(t+1) and computes the overall file popularity vector Θ(t+1).
Step 12: the state transferred to is x(t+1) = [Θ(t+1), a(t)]; compute the reward function.
Step 13: estimate the action to be executed in state x(t+1).
Step 14: update the parameters in the Q-function approximation formula according to step (604).
Step 15: if t = t_total, stop training and go to step 16; otherwise set t = t+1, enter the next slot, return to step 8, and continue training.
Step 16: from slot t onward, determine the cooperative caching decision from the Q-function approximation formula obtained by training, serving the file requests of the next slot.
Further, in step 3, d is determined as follows.
Let p_d' be the probability that a user is served by d' small stations. Based on the operator's base station deployment, p_d' is computed from historical user-position data: within a period τ, record the position of each of U users every τ' time interval, τ and τ' being determined by the operator according to network operation conditions; for user u ∈ {1, 2, …, U}, record at each position the number d' of base stations whose received signal power exceeds a threshold, and count the number of positions at which the base station count is d'. Using the historical positions of the U users, compute:
In the above formula, the count denotes the number of positions in user u's history at which i base stations can provide service for user u.
Then choose d as the d' that maximizes the probability value p_d'.
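The empirical rule above can be sketched as follows; the history is toy data standing in for the τ/τ' measurement campaign.

```python
from collections import Counter

def choose_d(coverage_history):
    """Step-3 rule for d: estimate p_d' as the empirical frequency of each
    covering-station count d' over all recorded user positions, then pick
    the d' with the largest probability."""
    counts = Counter(coverage_history)
    total = sum(counts.values())
    probs = {dp: n / total for dp, n in counts.items()}
    return max(probs, key=probs.get), probs

d, probs = choose_d([2, 3, 2, 2, 4, 3, 2, 2])
# d' = 2 covers 5 of the 8 recorded positions, so d = 2 with p_2 = 0.625
```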
Further, in step (602): since ω_1 ≫ ω_2, the second part is omitted, giving the caching decision:
The above formula is solved as follows:
1. Determine the maximum element of the caching decision vector from l_max·d/L ≥ 1, where l_max is the numerator of the greatest element; since within the range satisfying the inequality the smaller l_max the better, l_max = ⌈L/d⌉, where ⌈·⌉ denotes rounding up.
2. According to the small station cache space, compute the number z_i of occurrences of each element value i/L, i = 1, 2, …, l_max, in the caching decision vector:
where ⌊·⌋ denotes rounding down.
3. Determine the position of each element: sort the coefficients η_i·θ_i(t), i = 1, 2, …, C, in descending order; the j-th element after sorting corresponds to the h_j-th file before sorting. First tentatively assign each element's position; then adjust the elements satisfying the condition 1 − l_max·d/L < 0, from the last index down to j = 1, repeating the following step to adjust the action vector: find the smallest j' satisfying the stated conditions, subtract 1/L from its element, and add 1/L to the other.
The same solution method is used for the estimation in step 13.
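The three-step procedure above can be approximated by a simple greedy pass: cap every fraction at l_max/L with l_max = ⌈L/d⌉, then pour the cache budget into files in descending order of η_i·θ_i(t). Since the exact z_i placement formulas are not reproduced in this text, this is a hedged approximation of the procedure, not a transcription.

```python
import math

def greedy_cache_decision(scores, d, L, K):
    """Greedy sketch of step (602): l_max = ceil(L/d) is the smallest cap
    with l_max*d/L >= 1; fractions are multiples of 1/L, and the budget K
    is spent on files in descending order of score eta_i*theta_i(t)."""
    l_max = math.ceil(L / d)
    cap = l_max / L
    a = [0.0] * len(scores)
    budget = K
    for i in sorted(range(len(scores)), key=lambda i: -scores[i]):
        give = math.floor(min(cap, budget) * L) / L   # quantize to the 1/L grid
        a[i] = give
        budget -= give
        if budget < 1 / L:
            break
    return a

a = greedy_cache_decision(scores=[0.5, 0.9, 0.1], d=2, L=4, K=1.0)
# cap = 0.5: the two highest-score files get 0.5 each, the third gets 0
```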
Further, in step 8: with probability 1 − ε, choose the cooperative caching decision according to step (602); with probability ε, randomly choose a cooperative caching decision satisfying the feasibility conditions.
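The ε-greedy rule of step 8 in a minimal form; the feasible-action list and the ε value are illustrative.

```python
import random

def epsilon_greedy(greedy_action, feasible, eps, rng):
    """With probability 1-eps exploit the step-(602) decision; with
    probability eps explore a uniformly random feasible caching decision."""
    return rng.choice(feasible) if rng.random() < eps else greedy_action

rng = random.Random(0)
feasible = [(0.0, 0.5), (0.5, 0.0), (0.5, 0.5)]
picks = [epsilon_greedy((0.5, 0.5), feasible, eps=0.1, rng=rng)
         for _ in range(1000)]
# the greedy decision dominates; exploration occurs on roughly 10% of slots
```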
Further, in step (604), the parameters β, η, ξ in the Q-function approximation expression are updated by stochastic gradient descent:
In the above formulas, β_c etc. denote the parameters of the current slot, β_p etc. the parameters of the previous slot, and 0 < δ ≤ 1 is the update step size.
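The specific β/η/ξ update formulas are not reproduced in this text, so the sketch below shows the generic stochastic-gradient TD update they instantiate, for a linear approximation Q(x, a) = w·φ(x, a); all names are illustrative.

```python
def sgd_td_update(w, phi, r, gamma, q_next, delta):
    """One stochastic-gradient step on the squared TD error: the target
    r + gamma*Q(x', a') minus the estimate w . phi gives the TD error, and
    each weight moves by delta * td_error * its feature (the gradient of
    the linear Q with respect to that weight)."""
    q = sum(wi * pi for wi, pi in zip(w, phi))
    td_error = r + gamma * q_next - q
    return [wi + delta * td_error * pi for wi, pi in zip(w, phi)]

w = sgd_td_update(w=[0.0, 0.0], phi=[1.0, 2.0], r=1.0,
                  gamma=0.9, q_next=0.0, delta=0.1)
# td_error = 1.0, so the weights move to [0.1, 0.2]
```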
The above technical scheme brings the following beneficial effects.
The present invention serves users through small station cooperative coded caching and cooperative transmission. It formulates caching decisions by mining, through reinforcement learning, the transfer pattern of the file requests collected in the live network; as a data-driven machine learning method it makes no assumption about the prior distribution of the data and is therefore better suited to real systems. By interacting with the environment in real time it can track time-varying file popularity and make the corresponding caching policies, and the process is simple and feasible, requiring no solution of an NP-hard problem.
The present invention formulates cooperative caching decisions based on value function approximation: through continuous interaction with the environment, the macro base station collects state information, makes the corresponding cooperative caching decision, and delivers it to each small station, efficiently using the small stations' limited storage space to cache the most suitable files, significantly increasing the number of file requests served directly by the small stations, and reducing the system backhaul link load.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention.
Specific embodiment
The technical solution of the present invention is described in detail below with reference to the attached drawing.
The present invention proposes a coded cooperative caching method for small stations in ultra-dense heterogeneous networks based on value function approximation, whose objective is to maximize the average accumulated number of file requests served directly by the small stations under the constraint that the total size of the files cached at a small station does not exceed its cache space. The method mines the transfer pattern of file requests through reinforcement learning and formulates the small station coded cooperative caching scheme according to the mined pattern. The reinforcement learning task is described as an MDP (Markov Decision Process): a macro base station together with the small stations within its coverage acts as the machine; the macro base station determines the action to be executed and distributes it to the small stations, each small station executes the action and thereby changes the environment, and the environment feeds a reward back to the machine according to the reward function. Through continuous interaction with the environment, the machine learns the small station action to be executed in the state of each time slot. The state here is the partial description of the environment observed by the macro base station, comprising the file popularity of the current slot and the cooperative caching decision made in the previous slot; the action is the cooperative caching decision made in the current slot to serve the file requests of the next slot. The reward function is defined according to the goal of the caching decision, here the number of file requests served directly by the small stations. Value function approximation is a reinforcement learning method suited to tasks over huge discrete state spaces or continuous state spaces: the value function is expressed as a function of state and action, the optimization objective is to maximize the average accumulated number of file requests served directly by the small stations, and through continuous interaction with the environment the method adapts to its dynamic changes, mines the latent file-request transfer pattern, obtains an approximate expression of the value function, and thereby obtains a cooperative caching decision matched to that pattern. The macro base station, combining the MDS (Maximum Distance Separable) coding method, encodes the files and finally delivers the coded cooperative caching result to each small station, significantly increasing the number of file requests served directly by the small stations and reducing the system backhaul link load.
An embodiment is given below taking an LTE-A system as an example; as shown in Fig. 1, the specific steps are as follows:
Step 1: acquire the network information and set parameters.
Acquire the macro base station set M, the small station set P, and the file request set C1 in the network, and the number of small stations p_m, m ∈ M, within the coverage of the m-th macro base station; the set C1 contains C files. Obtain the small station cache space K, which the operator determines according to network operation conditions and hardware cost. The operator divides one period into T time slots according to the file-request situation in the ultra-dense heterogeneous network and sets the start time of each slot; each slot is divided in order of occurrence into three phases: the file transmission phase, the information exchange phase, and the caching decision phase.
Step 2: formulate the base station cooperative caching scheme based on MDS coding.
Denote the cooperative caching decision vector of the small stations by a(t) = [a_1(t), a_2(t), …, a_C(t)], where 0 ≤ a_c(t) ≤ 1, c ∈ C1, represents the fraction of the c-th file cached at a small station in slot t. The set of files with a_c(t) ≠ 0 (i.e., the files cached in slot t) is denoted C'(t). File c contains B information bits, from which macro base station m generates check bits by MDS coding:
where d is the number of small stations whose received signal power exceeds a threshold, which the operator determines according to network operation conditions. All generated check bits are divided into two parts, small station candidate bits and macro base station candidate bits; the small station candidate bits comprise p_m·B bits, i.e., each small station has B mutually non-overlapping candidate bits, and in slot t each small station selects the first a_c(t)·B bits from its own candidate bits to cache.
The macro base station arbitrarily selects (1 − d·a_c(t))·B bits from its candidate bits to cache. By the MDS coding property, a file request that obtains at least B check bits can recover the entire file.
Step 3: formulate the base station cooperative transmission scheme.
Each user file request first obtains d·a_c(t)·B bits from the d small stations covering the user. If d·a_c(t) ≥ 1, the macro base station need not transmit any data; otherwise the macro base station selects from the d small stations the one closest to the user and transmits (1 − d·a_c(t))·B bits to that small station, which forwards them to the user. The data transmitted by the macro base station is called the backhaul link load. d is determined as follows:
Let p_d' be the probability that a user is served by d' small stations. Based on the operator's base station deployment, p_d' is computed from historical user-position data: within a period τ, record the position of each of U users every τ' time interval, τ and τ' being determined by the operator according to network operation conditions; for user u ∈ {1, 2, …, U}, record at each position the number d' of base stations whose received signal power exceeds a threshold, and count the number of positions at which the base station count is d'. Using the historical positions of the U users, compute:
where the count denotes the number of positions in user u's history at which i base stations can provide service for user u.
Choose d as the d' that maximizes the probability value p_d'.
Step 4: describe the reinforcement learning task as an MDP with four-tuple ⟨X, A, P, R⟩, where X is the state space, A the action space, P the state transition probability (the probability of transferring to state x' when action a is executed in state x), and R the reward brought by the transition.
The concrete form of the reinforcement learning four-tuple in this problem is as follows.
1. Action space: the action is defined as the cooperative caching decision vector of the small stations, and the actions the machine can take constitute the action space. Since the number of elements in the caching decision vector equals the number of files C, the action space here is a C-dimensional continuous space; each dimension 0 ≤ a_c ≤ 1, c ∈ C1, is quantized into L discrete values, L being determined by the operator according to the macro station's computing capability. The discretized action space is then A = {a_1, a_2, …, a_|A|}, where any action vector a_j, j ∈ {1, 2, …, |A|}, must satisfy the cache-space constraint; the total number of action vectors satisfying the constraint is |A|, and the caching decision of slot t satisfies a(t) ∈ A.
2. State space: the state is the machine's description of its perceived environment, composed of the file popularity vector and the small stations' cooperative caching decision vector. In slot t, the total numbers of file requests at the p_m small stations within the m-th macro station's coverage are recorded as the vector N(t) = [N_1(t), N_2(t), …, N_C(t)], and the overall file popularity as the vector Θ(t) = [θ_1(t), θ_2(t), …, θ_C(t)], where θ_c(t) = N_c(t)/∑_{c'∈C1} N_{c'}(t), c ∈ C1. The state of slot t is then x(t) = [Θ(t), a(t−1)]. Let H = {Θ_1, Θ_2, …, Θ_|H|} be the set of overall file popularities; after quantization, Θ(t) is an element of the set H, and the state space is X = {x_1, x_2, …, x_|H||A|}, with x(t) ∈ X.
3. State transition probability: after action a(t) is executed in slot t and applied to the current state x(t), the environment moves from the current state to the next state x(t+1) with a latent transition probability, which is unknown.
4. Reward: while transferring to x(t+1), the environment gives the machine a reward, defined here as the number of file requests served directly by the small stations:
where u[·] is the step function, equal to 1 when the value in brackets is greater than 0 and 0 otherwise; one term is the number of files that must be transmitted to update the small station caches in the caching decision phase of slot t, and the other the number of files transmitted by the macro base station in the information exchange phase of slot (t+1).
Step 5: state the reinforcement learning objective.
Define the deterministic policy function π(x), x ∈ X; under this policy the action to be executed in state x(t) is known, a(t) = π(x(t)). Define the state value function as the expected γ-discounted accumulated reward:
where E_π denotes expectation with respect to policy π, the value represents the accumulated reward obtained by following policy π from state x(t), and the discount factor 0 ≤ γ < 1 measures the degree to which the action π(x(t)) executed in slot t influences future states.
With the state value function in hand, the state-action value function (the Q function) can be obtained:
whose value represents the accumulated reward obtained by executing action a'(t) from state x(t) and then following policy π. Formulas (4) and (5) are known as the Bellman equations.
Replacing x(t), x(t+1), a'(t) by x, x', a respectively, the objective is to find the policy that maximizes the expected accumulated reward, denoted π*(x); under the optimal policy, formulas (4) and (5) give the optimal value function:
That is:
Formulas (6) and (7) reveal how a non-optimal policy is improved, namely by changing the action the policy selects into the currently optimal action:
When the reinforcement learning four-tuple is known, the value iteration algorithm or the policy iteration algorithm can solve the Bellman equations based on formula (8) and obtain the optimal policy.
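For contrast with the model-free setting of step 6, the following toy value iteration solves the Bellman optimality equation when the transition probabilities are known; the two-state MDP and the data structures P and R are illustrative, not drawn from the patent.

```python
def value_iteration(P, R, gamma, iters=200):
    """Value iteration for the Bellman optimality equation (8) when the
    model IS known: P[s][a] is a list of (prob, next_state) pairs and
    R[s][a] the immediate reward; repeated Bellman backups converge to the
    optimal value function, from which the greedy policy is read off."""
    n_s, n_a = len(P), len(P[0])
    V = [0.0] * n_s
    for _ in range(iters):
        V = [max(R[s][a] + gamma * sum(p * V[sp] for p, sp in P[s][a])
                 for a in range(n_a)) for s in range(n_s)]
    policy = [max(range(n_a),
                  key=lambda a: R[s][a] + gamma * sum(p * V[sp]
                                                      for p, sp in P[s][a]))
              for s in range(n_s)]
    return V, policy

# two states, two actions: action 1 in state 0 pays 1 and stays in state 0
P = [[[(1.0, 1)], [(1.0, 0)]],
     [[(1.0, 1)], [(1.0, 1)]]]
R = [[0.0, 1.0], [0.0, 0.0]]
V, pi = value_iteration(P, R, gamma=0.5)
# V[0] converges to 1/(1-0.5) = 2 and the optimal policy picks action 1
```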
Step 6: Q-learning based on value function approximation when the state transition probability is unknown:
Since the state transition probability is unknown, the optimal policy cannot be obtained by policy iteration or value iteration; for the same reason, converting the state value function into the Q function is difficult, so the Q function is estimated directly.
1. Q function approximation: to avoid the storage and traversal-search difficulty of a Q-table caused by the large state and action spaces, the Q function is represented by value function approximation, i.e., expressed as a function of the state and the action. Inspired by the instantaneous reward, taking time slot t as an example, in state x(t) with action a'(t), the Q function approximation is expressed as:
where ω1 and ω2 are the weights of the two parts, with ω1 >> ω2; β, ηi, and ξi are unknown parameters that must be obtained by fitting.
2. Selection of the cooperative caching decision:
Since ω1 >> ω2, the second part is omitted, yielding the caching decision:
Equation (11) seeks the cooperative caching strategy that maximizes the value inside the brackets. From that expression, the factor (1-da'i(t)) multiplied by ηiθi(t) directly determines the bracketed value: the larger ηiθi(t) is, the smaller the corresponding (1-da'i(t)) should be in order to make the bracketed value larger. The solution procedure of Equation (11) is therefore as follows:
1. Determine from lmaxd/L ≥ 1 the maximum value of the elements in the caching decision vector, where lmax is the numerator of the largest element. Since, within the range satisfying the inequality, the smaller lmax is the better, lmax = ⌈L/d⌉, where ⌈·⌉ denotes rounding up.
2. Compute the number zi of occurrences of each element value i/L, i = 1, 2, ..., lmax, in the caching decision vector:
where ⌊·⌋ denotes rounding down.
3. Determine the position of each element: sort the coefficients ηiθi(t), i = 1, 2, ..., C, in descending order; the j-th coefficient after sorting corresponds to the hj-th file before sorting. First, preliminarily determine the position of each element; then adjust the elements satisfying the condition 1-lmaxd/L < 0: starting from the last sorted position down to j = 1, repeat the following step to adjust the elements of the action vector: find the minimum j' satisfying the stated conditions, subtract 1/L from the former element, and add 1/L to the latter.
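The three-step procedure above amounts to a greedy allocation. Since the exact formula for the counts zi is given by an equation not reproduced in the text, the sketch below simply fills the cache budget K with the largest admissible value lmax/L, file by file, in descending order of the coefficients ηiθi(t); the function name and simplification are illustrative, not the patent's exact algorithm:

```python
import math

def cache_decision(coef, L, d, K):
    """Greedy sketch of the Equation (11) solver (illustrative simplification).

    coef : list of eta_i * theta_i(t), one value per file
    L    : quantization level; elements of a(t) are multiples of 1/L
    d    : number of covering small stations
    K    : cache budget; the sum of the elements of a(t) must not exceed K
    """
    C = len(coef)
    l_max = math.ceil(L / d)      # step 1: smallest numerator with l_max*d/L >= 1
    a = [0] * C                   # decision vector, in units of 1/L
    budget = K * L                # total 1/L units available
    # step 3: files with larger eta_i*theta_i(t) get larger cached fractions
    for i in sorted(range(C), key=lambda i: -coef[i]):
        take = min(l_max, budget)
        a[i] = take
        budget -= take
        if budget == 0:
            break
    return [x / L for x in a]

a = cache_decision([0.5, 0.3, 0.2], L=4, d=2, K=1)
print(a)   # the two most popular files each get fraction l_max/L = 0.5
```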
3. The objective of Q-learning:
Substituting Equation (6) into Equation (5) gives:
Equation (14) gives the method for computing the true value of the cumulative reward brought by executing action a(t) in state x(t):
where the second term is the estimated value of the action at state x(t+1), estimated according to item 2 of this step.
Define the loss function:
where the parameter vectors are η = [η1, η2, ..., ηC] and ξ = [ξ1, ξ2, ..., ξC]. The objective of Q-learning is to drive the estimated Q value toward the true value, i.e., to minimize the loss function.
4. Update the parameters β, η, ξ in the Q function approximation expression by stochastic gradient descent:
where βc and the corresponding η, ξ entries denote the parameters of the current time slot, βp and the corresponding entries denote the parameters of the previous time slot, and 0 < δ ≤ 1 is the update step size.
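Items 1 to 4 amount to a linear function approximation of Q trained by temporal-difference updates. A generic sketch of this mechanism (the feature map, the stand-in reward, and the parameter vector `w` are illustrative; the patent's exact expression (10) with weights ω1, ω2 is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4                       # feature dimension (stands in for beta, eta_i, xi_i)
w = np.zeros(dim)             # parameters initialised to 0, as in Step 7
gamma, delta = 0.9, 0.05      # discount factor and update step size

def q_hat(features, w):
    """Q(x, a) approximated as a linear function of state-action features."""
    return features @ w

for _ in range(200):               # simulated transitions with random features
    phi = rng.random(dim)          # features of (x(t), a(t))
    phi_next = rng.random(dim)     # features of (x(t+1), estimated best action)
    r = phi.sum()                  # stand-in instantaneous reward
    target = r + gamma * q_hat(phi_next, w)      # "true value", as in Equation (14)
    loss_grad = (q_hat(phi, w) - target) * phi   # gradient of the squared loss
    w = w - delta * loss_grad                    # SGD update, as in Equation (17)

print(w)
```

This is the standard semi-gradient TD(0) update: only the estimate at the current state is differentiated, while the target is treated as a constant.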
Step 7: Set the current time slot t = 1, randomly set the initial state x(t) = [Θ(t), a(t-1)], and set the initial parameter values βp = 0, ηp = 0, ξp = 0. The operator sets the value of γ according to the speed of network change, in the range [0, 1); determines the value of δ according to the order of magnitude of the parameters to be updated, in the range (0, 1]; and sets the number of training time slots ttotal according to the network size.
Step 8: In the caching decision stage of time slot t, select the cooperative caching decision a(t) to be executed in state x(t) by the ε-greedy strategy: with probability 1-ε, select the cooperative caching decision according to item 2 of Step 6; with probability ε, randomly select a cooperative caching decision satisfying the stated constraints.
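The exploration rule of Step 8 can be sketched as follows (the greedy decision from Step 6 and the random feasible decision are stand-in callables, not the patent's actual solvers):

```python
import random

def epsilon_greedy(greedy_decision, random_feasible_decision, eps=0.1):
    """With probability 1-eps exploit the learned decision, else explore."""
    if random.random() < eps:
        return random_feasible_decision()   # random a(t) meeting the constraints
    return greedy_decision()                # decision from item 2 of Step 6

a_t = epsilon_greedy(lambda: [0.5, 0.5, 0.0],     # hypothetical greedy decision
                     lambda: [0.25, 0.25, 0.5],   # hypothetical feasible decision
                     eps=0.1)
print(a_t)
```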
Step 9: The macro base station performs MDS coding on the files to be cached according to Step 2 and transmits the coded data packets to the small stations for caching.
Step 10: In the file transmission stage of time slot (t+1), users request files, and the base stations serve the users by cooperative transmission according to Step 3.
Step 11: In the information exchange stage of time slot (t+1), all small stations within each macro base station's coverage report their file request counts of time slot (t+1) to the macro base station; the macro base station aggregates the total file request counts into the vector N(t+1) and computes the overall file popularity, denoted by the vector Θ(t+1).
Step 12: The state transferred to is x(t+1) = [Θ(t+1), a(t)]; compute the reward function according to Equation (3).
Step 13: Estimate the action to be executed at state x(t+1) according to item 2 of Step 6:
Step 14: Update the parameters in the Q function approximation expression according to Equation (17).
Step 15: If t = ttotal, stop training and proceed to Step 16; otherwise set t = t+1, enter the next time slot, return to Step 8, and continue training.
Step 16: From time slot t onward, determine the cooperative caching decision according to item 2 of Step 6 based on the trained Q function approximation expression, serving the file requests of the next time slot.
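Steps 7 to 16 form one training loop per time slot. A structural skeleton of that loop (all callbacks are placeholders for the operations defined in the steps above, not implementations of them):

```python
def train(t_total, env, select_action, mds_encode_and_push,
          compute_reward, update_parameters):
    """Skeleton of the per-time-slot loop of Steps 7-16 (callbacks are placeholders)."""
    state = env.initial_state()                     # Step 7: x(1) = [Theta(1), a(0)]
    for t in range(1, t_total + 1):
        a = select_action(state)                    # Step 8: epsilon-greedy decision
        mds_encode_and_push(a)                      # Step 9: MDS-code, push to small stations
        next_state = env.step(a)                    # Steps 10-12: serve, exchange, observe
        r = compute_reward(state, a, next_state)    # Step 12: reward of Equation (3)
        update_parameters(state, a, r, next_state)  # Steps 13-14: TD parameter update
        state = next_state                          # Step 15: advance to next time slot
    return state                                    # Step 16: trained approximation in use
```

The environment object here bundles what the patent splits across the file transmission and information exchange stages; in a real deployment these would be measurements reported by the small stations.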
From the above process it can be seen that, during Q function learning, the macro base station and the small stations within its coverage act as the machine; the file popularity and the small stations' cooperative caching decision serve as the state; the cooperative caching decision serves as the action; and the number of file requests directly served by the small stations serves as the reward function. By continuously interacting with the environment, with maximization of the cumulative reward as the objective, the Q function approximation expression is learned, and from it the cooperative caching decision in each state is obtained; the macro base station then encodes the files to be cached with MDS coding and delivers the coded results to each small station for cooperative caching. This method uses reinforcement learning to find patterns in the data, without solving an optimization problem based on an assumed data distribution. It can track file popularity that varies in real time, fully mines and exploits the latent file request transition pattern to formulate cooperative caching decisions, is better suited to real systems, significantly increases the number of file requests directly served by the small stations, effectively reduces the backhaul link load, improves the system performance, and enhances the user experience.
The embodiments merely illustrate the technical idea of the present invention and do not limit its protection scope; any change made to the technical scheme according to the technical idea proposed by the present invention falls within the protection scope of the present invention.
Claims (6)
1. A value-function-approximation-based coded cooperative caching method for small stations in an ultra-dense heterogeneous network, characterized in that: a macro base station and the small stations within its coverage act together as the machine; the macro base station is responsible for determining the action to be executed by the small stations in the state of each time slot and issuing it to each small station, and each small station is responsible for executing the action; the state comprises the file popularity of the current time slot and the cooperative caching decision made in the previous time slot, and the action is the cooperative caching decision made in the current time slot to serve the file requests of the next time slot; using a value-function-approximation-based reinforcement learning method, the value function is expressed as a function of the state and the action, with maximization of the average cumulative number of file requests directly served by the small stations as the optimization objective; by continuously interacting with the environment and adapting to its dynamic changes, the latent file request transition pattern is mined, the approximate expression of the value function is obtained, and in turn the cooperative caching decision matched to the file request transition pattern is obtained; the macro base station performs coding according to the cooperative caching decision and delivers the coded caching results to each small station.
2. The value-function-approximation-based coded cooperative caching method for small stations in an ultra-dense heterogeneous network according to claim 1, characterized by comprising the following steps:
Step 1: Acquire the network information and set the parameters:
Acquire the macro base station set M, the small station set P, the file request set C1, and the number of small stations pm, m ∈ M, within the coverage of the m-th macro base station in the network; obtain the small station cache space K, which the operator determines according to the network operation situation and the hardware cost; the operator divides one period into T time slots according to the file request situation in the ultra-dense heterogeneous network and sets the start time of each time slot; each time slot is divided, in order of occurrence, into three stages: the file transmission stage, the information exchange stage, and the caching decision stage;
Step 2: Formulate the MDS-coding-based base station cooperative caching scheme:
The cooperative caching decision vector of the small stations is denoted a(t); each element ac(t) ∈ [0,1], c ∈ C1, represents the fraction of the c-th file cached by a small station in time slot t; the set of files with ac(t) ≠ 0 is the set of files cached in time slot t, denoted C'(t); the c-th file contains B information bits, and the m-th macro base station encodes the B information bits by MDS coding to generate the following number of check bits:
In the above formula, d is the number of small stations whose received signal power exceeds a threshold, the threshold being determined by the operator according to the network operation situation. All the check bits are divided into two parts, the small station candidate bits and the macro base station candidate bits, where the small station candidate bits comprise pmB bits, i.e., each small station has B mutually non-overlapping candidate bits; in time slot t, each small station selects ac(t)B bits from its own candidate bits to cache, and the macro base station arbitrarily selects (1-dac(t))B bits from its candidate bits to cache. According to the MDS coding property, a file request that obtains at least B check bits can recover the entire file;
Step 3: Formulate the base station cooperative transmission scheme:
For each file request, the user first obtains dac(t)B bits from the d small stations covering it; if dac(t) ≥ 1, the macro base station need not transmit any data; otherwise the macro base station selects from the d small stations the one nearest to the user and transmits (1-dac(t))B bits to that small station, which then forwards these bits to the user; the data transmitted by the macro base station is referred to as the backhaul link load;
Step 4: Describe the reinforcement learning task as a Markov decision process (MDP):
Establish the reinforcement learning four-tuple, where X denotes the state space, A denotes the action space, the state transition probability denotes the probability of transferring to state x' when action a is executed in state x, and the reward denotes the return brought by that transfer;
The concrete form of the reinforcement learning four-tuple is as follows:
Action space: since the number of elements in the caching decision vector equals the number C of elements in the set C1, the action space is a C-dimensional continuous space; each dimension ac(t) is quantized into L discrete values, L being determined by the operator according to the macro station's computing capability; the discretized action space is then A = {a1, a2, ..., a|A|}, where any action vector aj, j ∈ {1, 2, ..., |A|}, must satisfy the cache space constraint; the total number of action vectors satisfying this condition is |A|, and the caching decision of time slot t satisfies a(t) ∈ A;
State space: in time slot t, the total file request counts of the pm small stations within the m-th macro station's coverage are denoted by the vector N(t) = [N1(t), N2(t), ..., NC(t)], and the overall file popularity by the vector Θ(t) = [θ1(t), θ2(t), ..., θC(t)]; the state of time slot t is then denoted x(t) = [Θ(t), a(t-1)]; let H = {Θ1, Θ2, ..., Θ|H|} be the set of quantized overall file popularities, with Θ(t) an element of the set H; the state space is then denoted X = {x1, x2, ..., x|H||A|}, and x(t) ∈ X;
State transition probability: after the action a(t) of time slot t is executed, it acts on the current state x(t), and the environment transfers from the current state to the next state x(t+1) with a latent transition probability, which is unknown;
Reward: while the environment transfers to x(t+1), it gives the machine a reward, defined here as the number of file requests directly served by the small stations:
In the above formula, u[·] denotes the unit step function, one term denotes the number of files that must be transmitted to update the small station caches in the caching decision stage of time slot t, and the other denotes the number of files transmitted by the macro base station in the information exchange stage of time slot (t+1);
Step 5: Clarify the reinforcement learning objective:
Define the deterministic policy function π(x), x ∈ X; according to this policy, the action to be executed at state x(t) is a(t) = π(x(t)), and the state value function is:
In the above formula, the value function represents the cumulative reward obtained by following policy π from state x(t), and 0 ≤ γ < 1 is a discount factor measuring the influence of the action π(x(t)) executed in time slot t on future states;
After the state value function is obtained, the state-action value function, i.e., the Q function, is obtained:
In the above formula, the Q function represents the cumulative reward obtained by executing action a'(t) in state x(t) and following policy π thereafter;
Replacing x(t), x(t+1), a'(t) with x, x', a respectively, the goal is to find the policy that maximizes the expected cumulative reward, denoted π*(x), with the corresponding optimal value function; under the optimal policy it is obtained that:
Namely:
Step 6: Formulate the Q-learning process based on value function approximation:
(601) The Q function is represented by value function approximation, i.e., expressed as a function of the state and the action; inspired by the instantaneous reward, at state x(t) with action a'(t), the Q function approximation is expressed as:
In the above formula, ω1 and ω2 are the weights of the two parts, with ω1 >> ω2; β, ηi, and ξi are unknown parameters that must be obtained through learning;
(602) Solve the cooperative caching decision:
(603) Establish the objective of Q-learning:
According to the above formula, compute the true value of the cumulative reward brought by executing action a(t) at state x(t):
In the above formula, the second term is the estimated value of the action at state x(t+1);
(604) Define the loss function:
In the above formula, η = [η1, η2, ..., ηC], ξ = [ξ1, ξ2, ..., ξC], and Eπ denotes the expectation taken with respect to policy π;
Update the parameters β, η, ξ according to the loss function;
Step 7: Set the current time slot t = 1, randomly set the initial state x(t) = [Θ(t), a(t-1)], and set the initial parameter values βp = 0, ηp = 0, ξp = 0; the operator sets the value of γ according to the speed of network change, in the range [0, 1); determines the value of the update step size δ according to the order of magnitude of the parameters to be updated, in the range (0, 1]; and sets the number of training time slots ttotal according to the network size;
Step 8: In the caching decision stage of time slot t, select the cooperative caching decision a(t) to be executed in state x(t) by the ε-greedy strategy;
Step 9: The macro base station performs MDS coding on the files to be cached according to Step 2 and transmits the coded data packets to the small stations for caching;
Step 10: In the file transmission stage of time slot t+1, users request files, and the base stations serve the users by cooperative transmission according to Step 3;
Step 11: In the information exchange stage of time slot t+1, all small stations within each macro base station's coverage report their file request counts of time slot t+1 to the macro base station; the macro base station aggregates the total file request counts into the vector N(t+1) and computes the overall file popularity, denoted by the vector Θ(t+1);
Step 12: The state transferred to is x(t+1) = [Θ(t+1), a(t)]; compute the reward function:
Step 13: Estimate the action to be executed at state x(t+1):
Step 14: Update the parameters in the Q function approximation expression according to step (604);
Step 15: If t = ttotal, stop training and proceed to Step 16; otherwise set t = t+1, enter the next time slot, return to Step 8, and continue training;
Step 16: From time slot t onward, determine the cooperative caching decision based on the trained Q function approximation expression, serving the file requests of the next time slot.
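The bit accounting of Steps 2 and 3 in claim 2 can be checked numerically: a request gathers d·ac·B coded bits from the covering small stations, and the macro base station supplies the shortfall (1-d·ac)·B over the backhaul only when d·ac < 1. A sketch under these assumptions (function name illustrative):

```python
def backhaul_load(a_c, d, B):
    """Bits the macro base station must transmit for one request of file c.

    a_c : fraction of file c cached per small station (element of a(t))
    d   : number of small stations covering the user
    B   : information bits per file; by the MDS property, any B coded
          bits suffice to recover the file
    """
    from_small = d * a_c * B                  # bits collected from the d small stations
    shortfall = max(0.0, (1 - d * a_c) * B)   # remainder fetched over the backhaul
    assert from_small + shortfall >= B        # every request is satisfiable
    return shortfall

print(backhaul_load(a_c=0.25, d=2, B=1000))   # 2*0.25 < 1, macro sends 500.0 bits
print(backhaul_load(a_c=0.5,  d=2, B=1000))   # d*a_c >= 1, no backhaul load: 0.0
```

This makes the design trade-off explicit: raising ac for popular files drives their backhaul load to zero, which is exactly what the reward function encourages.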
3. The value-function-approximation-based coded cooperative caching method for small stations in an ultra-dense heterogeneous network according to claim 2, characterized in that, in Step 3, the method for determining d is as follows:
Let pd' be the probability that a user is served by d' small stations; based on the operator's base station deployment, pd' is computed from historical user location data: within a period τ, the positions of U users are recorded at intervals of τ', with τ and τ' determined by the operator according to the network operation situation; for each user u ∈ {1, 2, ..., U}, record at each position the number d' of base stations whose received signal power exceeds a threshold, and denote the number of positions at which the base station number is d'; using the historical positions of the U users, compute:
In the above formula, the numerator counts the positions in the history of user u at which i base stations can provide service for user u;
Then d is chosen as the d' that maximizes the probability value pd':
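Claim 3's estimate of d is an empirical mode: count, over all recorded user positions, how often exactly d' base stations exceed the power threshold, then pick the most frequent d'. A sketch with hypothetical position records (function name and data are illustrative):

```python
from collections import Counter

def choose_d(records):
    """records[u] = list of base-station counts d' observed at user u's positions.

    Returns the d' with the highest empirical probability p_d'.
    """
    counts = Counter()
    for user_positions in records:
        counts.update(user_positions)    # pool the positions of all U users
    total = sum(counts.values())
    p = {dp: n / total for dp, n in counts.items()}   # empirical p_d'
    return max(p, key=p.get)

# hypothetical history of U = 3 users, tau'/tau giving 3 positions each
d = choose_d([[2, 2, 3], [2, 1, 2], [3, 2, 2]])
print(d)   # 2 occurs most often
```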
4. The value-function-approximation-based coded cooperative caching method for small stations in an ultra-dense heterogeneous network according to claim 2, characterized in that, in step (602), since ω1 >> ω2, the second part is omitted, yielding the caching decision:
The solution procedure of the above formula is as follows:
1. Determine from lmaxd/L ≥ 1 the maximum value of the elements in the caching decision vector, where lmax is the numerator of the largest element; since, within the range satisfying the inequality, the smaller lmax is the better, lmax = ⌈L/d⌉, where ⌈·⌉ denotes rounding up;
2. Compute, according to the base station cache space, the number zi of occurrences of each element value i/L, i = 1, 2, ..., lmax, in the caching decision vector:
where ⌊·⌋ denotes rounding down;
3. Determine the position of each element: sort the coefficients ηiθi(t), i = 1, 2, ..., C, in descending order; the j-th coefficient after sorting corresponds to the hj-th file before sorting; first, preliminarily determine the position of each element; then adjust the elements satisfying the condition 1-lmaxd/L < 0: starting from the last sorted position down to j = 1, repeat the following step to adjust the elements of the action vector: find the minimum j' satisfying the stated conditions, subtract 1/L from the former element, and add 1/L to the latter;
The estimate in Step 13 is likewise obtained by the above solution method.
5. The value-function-approximation-based coded cooperative caching method for small stations in an ultra-dense heterogeneous network according to claim 4, characterized in that, in Step 8, the cooperative caching decision is selected according to step (602) with probability 1-ε, and a cooperative caching decision satisfying the stated constraints is selected at random with probability ε.
6. The value-function-approximation-based coded cooperative caching method for small stations in an ultra-dense heterogeneous network according to claim 2, characterized in that, in step (604), the parameters β, η, ξ in the Q function approximation expression are updated by stochastic gradient descent:
In the above formula, βc and the corresponding entries denote the parameters of the current time slot, βp and the corresponding entries denote the parameters of the previous time slot, and 0 < δ ≤ 1 denotes the update step size.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811634918.6A CN109617991B (en) | 2018-12-29 | 2018-12-29 | Value function approximation-based cooperative caching method for codes of small stations of ultra-dense heterogeneous network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109617991A true CN109617991A (en) | 2019-04-12 |
CN109617991B CN109617991B (en) | 2021-03-30 |
Family
ID=66015366
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110138836A (en) * | 2019-04-15 | 2019-08-16 | 北京邮电大学 | An online cooperative caching method based on optimized energy efficiency |
CN110381540A (en) * | 2019-07-22 | 2019-10-25 | 天津大学 | Dynamic cache updating method based on DNN for real-time response to time-varying file popularity |
CN111311996A (en) * | 2020-03-27 | 2020-06-19 | 湖南有色金属职业技术学院 | Online education informationization teaching system based on big data |
CN112218337A (en) * | 2020-09-04 | 2021-01-12 | 暨南大学 | Cache strategy decision method in mobile edge calculation |
CN112672402A (en) * | 2020-12-10 | 2021-04-16 | 重庆邮电大学 | Access selection method based on network recommendation in ultra-dense heterogeneous wireless network |
CN112911717A (en) * | 2021-02-07 | 2021-06-04 | 中国科学院计算技术研究所 | Method for transmitting MDS (Multi-request System) coded data packet of fronthaul network |
CN113132466A (en) * | 2021-03-18 | 2021-07-16 | 中山大学 | Multi-access communication method, device, equipment and medium based on code cache |
CN115118728A (en) * | 2022-06-21 | 2022-09-27 | 福州大学 | Ant colony algorithm-based edge load balancing task scheduling method |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120166594A1 (en) * | 2010-12-28 | 2012-06-28 | Sony Corporation | Information processing apparatus, reproduction control method, program, and content reproduction system |
CN103929781A (en) * | 2014-04-09 | 2014-07-16 | 东南大学 | Cross-layer interference coordination optimization method in super dense heterogeneous network |
CN104782172A (en) * | 2013-09-18 | 2015-07-15 | 华为技术有限公司 | Small station communication method, device and system |
CN104955077A (en) * | 2015-05-15 | 2015-09-30 | 北京理工大学 | Heterogeneous network cell clustering method and device based on user experience speed |
CN106358308A (en) * | 2015-07-14 | 2017-01-25 | 北京化工大学 | Resource allocation method for reinforcement learning in ultra-dense network |
CN108882269A (en) * | 2018-05-21 | 2018-11-23 | 东南大学 | Ultra-dense network small station handover method combined with caching technology |
CN110445825A (en) * | 2018-05-04 | 2019-11-12 | 东南大学 | Reinforcement-learning-based coded cooperative caching method for small stations in an ultra-dense network |
Non-Patent Citations (2)
Title |
---|
PO-HAN HUANG: "Cross-Tier Cooperation for Optimal Resource Utilization in Ultra-Dense Heterogeneous Networks", 《IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY》 *
ZHANG HAIBO: "Group-based resource allocation in OFDMA femtocell two-tier networks", 《Journal of Electronics & Information Technology》 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||