CN116209015A - Edge network cache scheduling method, system and storage medium - Google Patents

Edge network cache scheduling method, system and storage medium Download PDF

Info

Publication number
CN116209015A
Authority
CN
China
Prior art keywords
base station
network
scheduling model
cache
global
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310465386.2A
Other languages
Chinese (zh)
Other versions
CN116209015B (en)
Inventor
魏振春
罗子成
吕增威
石雷
徐娟
樊玉琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intelligent Manufacturing Institute of Hefei University of Technology
Original Assignee
Intelligent Manufacturing Institute of Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intelligent Manufacturing Institute of Hefei University of Technology
Priority to CN202310465386.2A priority Critical patent/CN116209015B/en
Publication of CN116209015A publication Critical patent/CN116209015A/en
Application granted granted Critical
Publication of CN116209015B publication Critical patent/CN116209015B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W28/00Network traffic management; Network resource management
    • H04W28/02Traffic management, e.g. flow control or congestion control
    • H04W28/10Flow control between communication endpoints
    • H04W28/14Flow control between communication endpoints using intermediate storage
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The present invention relates to the field of wireless communications and edge computing, and in particular to a method, a system, and a storage medium for edge network cache scheduling. A cache scheduling model and a global scheduling model sharing the same network structure are first constructed for the small base stations and the large base station, and model parameters are distributed to the small base stations through the large base station during training. Each small base station then performs local training to obtain a model gradient; the large base station aggregates the gradients uploaded by the small base stations into a global aggregated gradient, feeds it back to the small base stations, and uses it to update each small base station's cache scheduling model until the global scheduling model converges, at which point an optimal caching strategy is formulated from the cache scheduling model and the global scheduling model. The invention improves the convergence rate and global fairness of the global scheduling model and greatly accelerates the convergence of the cache scheduling model.

Description

Edge network cache scheduling method, system and storage medium
Technical Field
The present invention relates to the field of wireless communications and edge computing, and in particular, to a method, a system, and a storage medium for scheduling an edge network cache.
Background
With the growing number and quality of multimedia services on mobile networks, the resulting mass of data places tremendous strain on backbone and mobile networks. Edge computing is an effective way to relieve the backbone network burden: edge caches use cache servers to store content closer to the end user. Because the communication resources and cache capacity of a cache server are limited, an effective caching strategy is critical.
In recent years, deep reinforcement learning has become an important tool for collaborative caching in edge computing. Federated reinforcement learning protects the privacy of terminal devices, and transmitting only model weights greatly reduces network resource consumption compared with traditional learning. However, in existing federated reinforcement learning methods applied to edge caching, the non-IID (non-independent and identically distributed) nature of the data lets some data over-represent the whole. As a result, the global model produced by federated training performs very differently across terminal devices, making it unusable on some devices and dampening those devices' enthusiasm for participating in the federation.
Disclosure of Invention
To overcome defects of traditional federated reinforcement learning such as slow network convergence and poor global-model fairness, the invention provides an edge network cache scheduling method that achieves good global fairness in edge network cache scheduling while training its scheduling models quickly.
The invention provides an edge network cache scheduling method for edge network cache scheduling in a wireless communication network comprising a cloud server, a large base station, and small base stations, where the edge network consists of the large base station and the small base stations within its coverage. The method decides the cache content of the large base station through a global scheduling model, and decides the cache content of each small base station through a cache scheduling model in one-to-one correspondence with that small base station; the global scheduling model and the cache scheduling models are all neural network models; the input of the global scheduling model is the global state of the edge network where the large base station is located, and the output is the action probability distribution of the large base station; the input of a cache scheduling model is the state of the corresponding small base station, and the output is the action probability distribution of the corresponding small base station; the cache scheduling model at least comprises a network structure identical to that of the global scheduling model;
the state of the small base station comprises the content storage state of the small base station and the task request state of a client in the coverage area of the small base station; the actions of the small base station comprise content to be cached and content to be deleted; the state of the large base station is a set of states of all small base stations in a coverage area of the large base station, and the actions of the large base station comprise contents to be cached and contents to be deleted of the large base station;
the training process of the global scheduling model and the buffer scheduling model is as follows:
s1, acquiring a cache scheduling model corresponding to each small base station and a global scheduling model corresponding to a large base station;
s2, randomly selecting part of small base stations from the small base stations as reference base stations, and configuring the parameters of the cache scheduling model corresponding to each reference base station as network parameters of the global scheduling model;
s3, carrying out local training on the cache scheduling model corresponding to each reference base station, selecting Z reference base stations which complete the local training of the cache scheduling model as alternative base stations, and acquiring gradients and accumulated rewards of each alternative base station;
s4, calculating cosine distances between gradients of different alternative base stations, clustering according to the cosine distances, and forming K alternative base stations from Z alternative base stationsClusters, 1+.K+.10; calculating the global network parameters of each cluster, and enabling the global network parameters of the kth cluster to be marked as w k T+1
Figure SMS_1
,/>
Figure SMS_2
Representing the average gradient of the candidate base stations contained in the kth cluster; w (w) G T Representing network parameters of the global scheduling model, T representing a round, T+1 representing a next round of T; replacing network parameters of a cache scheduling model of the alternative base station of each cluster with global network parameters w corresponding to the cluster k T+1 The method comprises the steps of carrying out a first treatment on the surface of the The initial value of T is 0;
s5, arranging Z alternative base stations according to a descending order of the accumulated rewards, and marking the gradient of the alternative base station positioned at the Z-th position as G z T 1 +.z, constructing the alternate gradient set { G- z T 1 +.z +.Z +.; acquiring the theta candidate base stations ordered earlier in the descending order of the jackpot prize as target base stations, and the gradient of the sigma target base station as G # σ 1 +.sigma +.theta, constructing the target gradient set { G- # σ 1 +.sigma +.θ }; θ is an integer value obtained by rounding α×z, α is a set fairness and robustness control factor;
s6, let eta # =G # σ The method comprises the steps of carrying out a first treatment on the surface of the The initial value of sigma is 1;
s7, let η=g z T The method comprises the steps of carrying out a first treatment on the surface of the The initial value of z is 1;
s8, judging whether the specification of eta multiplied by eta is satisfied # <0; if yes, G in the target gradient set # σ Updated to eta #2 ×η # /||η|| 2 ,||η|| 2 Representing the two norms of η, and then executing step S9; if not, executing step S9;
s9, judging whether Z is larger than or equal to Z; if not, updating z to z+1, and returning to the step S7; if yes, the following step S10 is executed;
s10, judging whether sigma is larger than or equal to theta; if not, the sigma is updated to sigma+1, z is initialized, and then the step S6 is returned; if yes, then select ladderThe first theta gradients in the degree set are replaced by the corresponding gradients in the target gradient set one by one, and the gradient mean value g in the alternative gradient set is calculated G T+1 Calculating a transition term w G T+1 =w G T +g G T+1 The method comprises the steps of carrying out a first treatment on the surface of the Then updating T to T+1, returning to step S2, and updating the network parameters of the global scheduling model to w when the updating times of T reach the set value c4 G T+1 And fixes the global schedule model and the cache schedule model.
Preferably, in S3, the gradient of an alternative base station is the difference between the network parameters of its cache scheduling model after completing local training and the network parameters of the current global scheduling model.
Preferably, the cache scheduling model is composed of a current network and a target network, the current network and the target network having the same network structure as the global scheduling model; the local training of the cache scheduling model corresponding to the nth small base station B_n in S3 comprises steps S31 to S37; 1 ≤ n ≤ N, where N is the number of small base stations in the edge network;
S31, setting a cumulative reward NR_n and a sample accumulation value AR; the network parameters of the current network in the cache scheduling model corresponding to small base station B_n are denoted w_n and the network parameters of the target network are denoted w_n^#; updating the network parameters of the current network and the target network to the network parameters w_G of the current global scheduling model; initializing the state S, and initializing AR and NR_n to 0;
s32, inputting the state S into a current network, outputting action probability distribution by the current network, and selecting a decision action A by combining the action probability distribution;
s33, acquiring a state S after the state S executes the action A # A prize R; updating jackpot NR n Equal to NR n +R; the reward R is action reward for executing the action A when the small base station is in the state S;
s34, updating AR to AR+1, and constructing samples { S, A, S } # R, done is stored in a set experience playback set D, and done is a task completion mark; if AR<c3, done=0; if ar=c3, done=1; c3 is a set value; judging experience returnWhether the number of samples in the set reaches a set value c2 or not; if not, make S updated to S # Then returns to S32; if yes, the following step S35 is executed;
s35, sampling I samples from the experience playback set D, wherein the ith sample is named as { S } i ,A i ,S i # ,R i ,done i },S i Representing the state in the ith sample, A i Representing the action in the ith sample, S i # Representing the next state in the ith sample, R i Indicating the prize in the ith sample, done i Indicating that the task completion flag in the ith sample is 1 +.i; calculating a loss function L (Q) by combining the I samples;
s36, updating the parameter w of the current network by combining gradient back propagation of L (Q) through the neural network n If the number of parameter updating times of the current network is an integer multiple of the set value c1, updating the parameter w of the target network n # So that w n # =w n
S37, judging whether the AR reaches c3; if not, make S updated to S # AR is updated to AR+1, then the state S is input into the current network, the decision action A is selected by combining the action probability distribution output by the current network, and a sample { S, A, S ] is constructed by combining the state S and the decision action A # R, done } and store in experience playback set D; then returning to step S35; the local training of the cache scheduling model is complete.
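A compressed sketch of the local training loop S31 to S37 follows (Python with PyTorch). The environment wrapper env, the helpers select_action and compute_loss (sketched after the corresponding preferred embodiments below), and all other identifiers are assumptions for illustration, not the claimed implementation; the default constants follow the values used later in the embodiment.

import copy
import random
import torch

def local_training(q_net, w_global, env, select_action, compute_loss,
                   c1=100, c2=128, c3=200, batch_i=64, lr=1e-3):
    # q_net: current network Q, initialised from the global parameters (S31)
    # w_global: detached copy of those parameters, for the proximal term in L(Q)
    target_net = copy.deepcopy(q_net)                 # S31: Q# starts equal to Q
    opt = torch.optim.Adam(q_net.parameters(), lr=lr)
    replay, nr, updates = [], 0.0, 0                  # playback set D, NR_n
    state = env.reset()                               # initial state S
    for ar in range(1, c3 + 1):                       # AR: sample accumulation value
        action = select_action(q_net, state)          # S32: decision action A
        next_state, reward = env.step(action)         # S33: S#, R
        nr += reward                                  # NR_n <- NR_n + R
        done = 1.0 if ar == c3 else 0.0               # S34: task completion flag
        replay.append((state, action, next_state, reward, done))
        state = next_state
        if len(replay) < c2:                          # wait until |D| reaches c2
            continue
        sample = random.sample(replay, batch_i)       # S35: draw I samples
        batch = tuple(torch.stack([torch.as_tensor(s[j]) for s in sample])
                      for j in range(5))              # collate into tensors
        loss = compute_loss(q_net, target_net, w_global, batch)
        opt.zero_grad(); loss.backward(); opt.step()  # S36: backpropagation
        updates += 1
        if updates % c1 == 0:                         # sync Q# every c1 updates
            target_net.load_state_dict(q_net.state_dict())
    return nr                                         # cumulative reward NR_n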
Preferably, in S32 an ε-greedy method is used to select the decision action according to the action probability distribution.
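A minimal sketch of such a selection rule follows; the exploration rate ε = 0.1 and the n_actions attribute are assumptions, not values given by the invention.

import random
import torch

def select_action(q_net, state, epsilon=0.1):
    # with probability epsilon explore a random action; otherwise exploit by
    # taking the action with the highest probability in the current network output
    if random.random() < epsilon:
        return random.randrange(q_net.n_actions)      # assumed attribute
    with torch.no_grad():
        return int(q_net(state.unsqueeze(0)).argmax(dim=1))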
Preferably, the action reward of small base station B_n on time slot t is denoted R(t, n); then:
R(t,n) = R_c(t,n) − R_d(t,n);
R_c(t, n) denotes the hit reward on time slot t+1 for the content cached by small base station B_n on time slot t, i.e., the number of contents that are cached by B_n on time slot t and requested on time slot t+1 by clients within its coverage;
R_d(t, n) denotes the negative penalty on time slot t+1 caused by the cache content deleted by B_n on time slot t, i.e., the number of contents that were cached by B_n on time slot t−1, deleted on time slot t, and requested on time slot t+1 by clients within its coverage.
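A minimal sketch of this reward in plain Python, assuming contents are represented by identifiers and the cached/deleted collections are sets:

def action_reward(cached_t, deleted_t, requests_t1):
    # cached_t:    contents cached by B_n on slot t
    # deleted_t:   contents cached on slot t-1 but deleted on slot t
    # requests_t1: contents requested by covered clients on slot t+1
    r_c = sum(1 for c in requests_t1 if c in cached_t)   # hit reward R_c(t,n)
    r_d = sum(1 for c in requests_t1 if c in deleted_t)  # deletion penalty R_d(t,n)
    return r_c - r_d                                     # R(t,n)

# e.g. action_reward({1, 2, 3}, {7}, [2, 2, 7, 9]) == 2 - 1 == 1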
Preferably, the loss function in S35 is calculated as follows:
L(Q) = [Σ_{i=1}^{I} (y_i − Q(S_i, A_i, w_n))² + (µ/2)·‖w_G − w_n‖²] / I
Q(S_i, A_i, w_n) denotes the probability corresponding to action A_i in the action probability distribution output by the current network; Σ denotes summation over i ∈ [1, I]; µ denotes a regularization parameter; w_G denotes the network parameters of the current global scheduling model and w_n the network parameters of the cache scheduling model; ‖w_G − w_n‖ denotes the two-norm of w_G − w_n; y_i is the current target value;
when done_i = 1, y_i = R_i;
when done_i = 0,
y_i = R_i + γ·Q^#(S_i^#, a*, w_n^#), where a* = argmax_{a∈a^#} Q(S_i^#, a, w_n);
Q(S_i^#, a, w_n) denotes the probability corresponding to action a in the action probability distribution output by the current network for state S_i^#; max_{a∈a^#} Q(S_i^#, a, w_n) denotes the maximum probability in that distribution, a^# denoting the action traversal space; Q^#(S_i^#, a*, w_n^#) denotes the probability corresponding to the target action in the action probability distribution output by the target network for input state S_i^#, the target action a* being the action corresponding to the maximum probability in the distribution output by the current network for input S_i^#;
γ denotes an attenuation factor; R_i denotes the reward in the ith sample, i.e., the reward for the small base station executing action A_i in state S_i.
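A sketch of this loss in PyTorch follows; consistent with the definitions above, the target action is chosen by the current network and evaluated by the target network, and the proximal term pulls w_n toward the global parameters. Tensor layouts and the default values of γ and µ are assumptions.

import torch
import torch.nn.functional as F

def compute_loss(q_net, target_net, w_global, batch, gamma=0.9, mu=0.01):
    states, actions, next_states, rewards, dones = batch
    # Q(S_i, A_i, w_n): value of the taken action under the current network
    q_sa = q_net(states).gather(1, actions.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # target action a*: argmax of the current network's output at S_i^#
        a_star = q_net(next_states).argmax(dim=1, keepdim=True)
        # evaluated by the target network Q#; y_i = R_i when done_i = 1
        q_next = target_net(next_states).gather(1, a_star).squeeze(1)
        y = rewards + gamma * (1.0 - dones) * q_next
    # proximal term (mu/2)||w_G - w_n||^2 toward the global parameters
    prox = sum((p - g).pow(2).sum() for p, g in zip(q_net.parameters(), w_global))
    i = rewards.shape[0]
    return (F.mse_loss(q_sa, y, reduction="sum") + mu * prox / 2) / i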
Preferably, in S4 a K-means algorithm is used to cluster the alternative base stations according to the cosine distances between their gradients.
Preferably, the cache scheduling model and the global scheduling model are trained by adopting historical data in an initial state; in the working process, the global scheduling model and the cache scheduling model are updated and trained regularly, and when the global scheduling model is updated and trained, network parameters of the global scheduling model are used as initialization parameters of the cache scheduling model.
The invention also provides an edge network cache scheduling system and a storage medium, which provide a carrier for the edge network cache scheduling method.
The invention provides an edge network cache scheduling system, which comprises a memory and a processor, wherein a computer program is stored in the memory, the processor is connected with the memory, and the processor is used for executing the computer program to realize the edge network cache scheduling method.
The storage medium provided by the invention stores a computer program, and the computer program is used for realizing the edge network cache scheduling method when being executed.
The invention has the advantages that:
(1) In the edge network cache scheduling method, the cache scheduling model corresponding to each small base station is trained locally in combination with local particularities, ensuring adaptability to local conditions; the gradients of the small base stations' cache scheduling models are incorporated into the training of the large base station's global scheduling model, and parameters are updated using the global aggregated gradient, which improves the convergence speed and global fairness of the global scheduling model.
(2) The cache scheduling model of a small base station takes the network parameters of the large base station's global scheduling model as its initialization parameters, which greatly accelerates the convergence of the cache scheduling model.
(3) The loss function constructed in the invention takes into account how well a small base station hits the task requests of clients within its coverage, greatly improving the hit rate of small base station cache scheduling and promoting both rapid convergence of the cache scheduling model and the formulation of an optimal caching strategy.
(4) The global scheduling model takes into account the cache scheduling conditions of all small base stations within the coverage of the large base station. Through the cooperation of the global scheduling model and the cache scheduling models, the small base stations cache the content required by clients within their coverage, while the large base station selects its cache content in view of the demand across the whole edge network and the small base stations' cache states; this greatly improves the fairness and quality of edge network caching and fully satisfies client demand under the edge network.
(5) The invention provides a simple and clear reward algorithm, making the calculation of rewards simple and efficient.
(6) The small base stations serving as alternative base stations are clustered according to the cosine distances between the gradients of their cache scheduling models, and the network parameters of the cache scheduling models within a cluster are then updated from the network parameters of the global scheduling model and the average gradient within the cluster, further improving the convergence rate of the cache scheduling model. At the same time, the cosine distance provides a similarity assessment of each alternative base station's situation, alternative base stations in similar states are aggregated with one another, and cross-reference among them further improves the fairness and robustness of the cache scheduling model.
(7) Reference base stations, alternative base stations, and target base stations are selected layer by layer from the small base stations, so that all small base stations are eventually updated while avoiding the network stagnation caused by frequently updating all of them at once. The large base station and the small base stations are updated in a nested manner: the large base station provides real-time support for the content demand of the whole edge network, and the small base stations are updated in turn, so that those not participating in training can still serve clients within their coverage, relieving the caching pressure on the large base station and guaranteeing normal operation of the edge network.
Drawings
FIG. 1 is a flow chart of a method for scheduling edge network buffers;
FIG. 2 is a flow chart of a local training method of a cache scheduling model;
FIG. 3 is a diagram illustrating statistics of cache hit rates for a small base station in an embodiment;
FIG. 4 is another diagram illustrating statistics of cache hit rates for a small base station in an embodiment;
FIG. 5 is a diagram showing the comparison of the cache hit rate of a large base station in the embodiment.
Detailed Description
The edge network cache scheduling method provided by the embodiment is suitable for a wireless communication network comprising a cloud server, a large base station and a small base station; the communication coverage of the large base station is larger than that of the small base station; the wireless communication network comprises a plurality of large base stations, wherein each large base station comprises a plurality of small base stations in the coverage area; the large base station and the small base stations in the coverage area form an edge network;
the cache scheduling method decides the cache content of the large base station through the global scheduling model, and decides the cache content of the small base station through the cache scheduling model corresponding to the small base station one by one.
Referring to fig. 1 and 2, the method for scheduling an edge network buffer according to the present embodiment includes the following steps S1 to S10.
S1, acquiring the cache scheduling model corresponding to each small base station and the global scheduling model corresponding to the large base station; the cache scheduling model comprises a current network Q and a target network Q^#, both of which are Q networks; the input of the Q network of a cache scheduling model is the state of the corresponding small base station, and the output is the action probability distribution of the corresponding small base station; the global scheduling model adopts a Q network whose input is the global state of the edge network where the large base station is located and whose output is the action probability distribution of the large base station; the network parameters of the current global scheduling model are denoted w_G, and the network parameters of the global scheduling model on round T are denoted w_G^T; the network parameters of the current network Q of the nth small base station are denoted w_n, and the network parameters of its target network Q^# are denoted w_n^#.
Specifically, in this embodiment, the edge network includes N small base stations, with small base station set B = {B_1, B_2, …, B_n, …, B_N}, B_n denoting the nth small base station, 1 ≤ n ≤ N; the state of small base station B_n on time slot t is denoted S(t, n), and the global state of the edge network on time slot t is denoted S(t).
Thus, the input of the cache scheduling model corresponding to small base station B_n on time slot t is S(t, n), and the output is the action probability distribution decided by B_n on time slot t; the input of the global scheduling model on time slot t is S(t), and the output is the action probability distribution decided by the large base station on time slot t.
S(t)={S(t,1),S(t,2),…,S(t,n),…,S(t,N)}
S(t,n)={M(t,n),Q(t,n)}
M(t, n) is the content storage state of small base station B_n on time slot t and can be expressed as a set of binary values, i.e., M(t, n) = {M_1(t,n), M_2(t,n), …, M_f(t,n), …, M_F(t,n)}, where M_f(t,n) is a binary value: if small base station B_n caches content C_f on time slot t, then M_f(t,n) = 1; if not, M_f(t,n) = 0; C_f denotes the fth content, the total number of contents being F, i.e., 1 ≤ f ≤ F; the content set is denoted C = {C_1, C_2, …, C_f, …, C_F};
Q(t, n) is the task request state on time slot t of the clients within the coverage of small base station B_n and can be expressed as Q(t, n) = {Q_1(t,n), Q_2(t,n), …, Q_kn(t,n), …, Q_Kn(t,n)}, Q_kn(t,n) ∈ C ∪ {0}, where Q_kn(t,n) denotes the task request on time slot t of client u(n, kn), client u(n, kn) being the knth client within the communication coverage of small base station B_n, 1 ≤ kn ≤ Kn, Kn denoting the total number of clients within the coverage of B_n; Q_kn(t,n) = C_f means that the task request of client u(n, kn) on time slot t is content C_f; Q_kn(t,n) = 0 means that client u(n, kn) has no task request on time slot t.
The action of small base station B_n on time slot t is denoted A(t, n), A(t, n) = {A_d(t,n), A_c(t,n)};
A_c(t, n) denotes the set of contents to be cached by small base station B_n on time slot t, and A_d(t, n) denotes the set of cached contents to be deleted by small base station B_n on time slot t;
The action of the large base station on time slot t is denoted A_G(t); A_G(t) comprises the set of contents to be cached by the large base station on time slot t and the set of cached contents to be deleted by the large base station on time slot t.
S2, from the N small base stations B_1, B_2, …, B_n, …, B_N, randomly selecting H small base stations, denoted the set Br = {Br_1, Br_2, …, Br_h, …, Br_H}, and configuring the parameters of the cache scheduling model corresponding to each small base station in Br as w_G^T; the initial value of T is 0, and w_G^0 is the initialization network parameters; h is an ordinal, 1 ≤ h ≤ H.
Notably, if a new model is being trained, the initialization network parameters w_G^0 are random initialization parameters; if training is performed during use, the initialization network parameters w_G^0 are the network parameters of the current global scheduling model.
S3, performing local training on the cache scheduling model of each small base station in Br, selecting Z small base stations that complete the local training of the cache scheduling model as alternative base stations, and obtaining the gradient set g^T and cumulative reward set NR^T of the alternative base stations; the gradient of the zth alternative base station is denoted g_z^T and its cumulative reward NR_z^T, 1 ≤ z ≤ Z; g^T = {g_1^T, g_2^T, …, g_z^T, …, g_Z^T}, NR^T = {NR_1^T, NR_2^T, …, NR_z^T, …, NR_Z^T};
The local training of the cache scheduling model corresponding to the nth small base station B_n comprises steps S31 to S37;
s31, make small base station B n The network parameters of the current network Q in the corresponding cache scheduling model are denoted as w n Target network Q # The network parameters of (a) are denoted as w n # Let w n =w G T ,w n # =w n The method comprises the steps of carrying out a first treatment on the surface of the Setting cumulative rewards NR n And a sample cumulative value AR; w (w) G T Representing network parameters of a global scheduling model on the round T;
initialization state s=s (t, n), AR and NR n All initialized to 0;
s32, inputting the state S into the current network Q, outputting action probability distribution on the time slot t by the current network Q, and selecting a decision action A by combining probabilities corresponding to the actions, wherein the decision action A is the small base station B n Action a (t, n) on time slot t; in the specific implementation, can be adopted
Figure SMS_7
The greedy method selects the decision action a=a (t, n) on the time slot t according to the action probability distribution.
S33, the state after executing action A in state S is recorded as the next state S^#, with reward R; the cumulative reward NR_n is updated to NR_n + R; the reward R is the action reward for executing action A when the small base station is in state S;
Let the action reward of small base station B_n on time slot t be denoted R(t, n); then:
R(t,n) = R_c(t,n) − R_d(t,n);
R_c(t, n) denotes the hit reward on time slot t+1 for the content cached by small base station B_n on time slot t, i.e., the overlap between the content cached by B_n on time slot t and the task requests on time slot t+1 of clients within its coverage; thus R_c(t, n) is the number of contents cached by B_n on time slot t and requested by clients within its coverage on time slot t+1;
R_d(t, n) denotes the negative penalty on time slot t+1 caused by the cache content deleted by B_n on time slot t, i.e., the overlap between the cache content deleted by B_n on time slot t and the task requests on time slot t+1 of clients within its coverage; thus R_d(t, n) is the number of contents cached by B_n on time slot t−1, deleted on time slot t, and requested by clients within its coverage on time slot t+1.
Recording the state after action A is executed in state S as the next state S^#, define R_c(S, S^#) as the number of contents cached by the small base station in state S and requested by clients within its coverage in the next state S^#, and R_d(S, S^#) as the number of contents deleted by action A and requested by clients within the coverage in the next state S^#; the reward of action A is then R = R_c(S, S^#) − R_d(S, S^#).
S34, updating AR to AR+1, constructing a sample {S, A, S^#, R, done} and storing it in a preset experience playback set D; done is a task completion flag; if AR < c3, done = 0; if AR = c3, done = 1; c3 is a set value; judging whether the number of samples in the experience playback set reaches a set value c2; if not, updating S to S^# and then returning to S32; if yes, executing the following step S35;
S35, sampling I samples from the experience playback set D, the ith sample being denoted {S_i, A_i, S_i^#, R_i, done_i}, where S_i denotes the state in the ith sample, A_i the action, S_i^# the next state, R_i the reward, and done_i the task completion flag, 1 ≤ i ≤ I; the current target value y_i and the loss function L(Q) are calculated with the following formulas:
when done i =1,y i =R i
When done i =0,
Figure SMS_8
Q(S i # ,a,w n ) Indicating that the current network Q is in the state S i # The probability corresponding to the action a in the action probability distribution output at the time,
Figure SMS_9
representing the maximum probability, a, in the probability distribution of the actions of the Q network output # Representing an action traversal space;
Figure SMS_10
representing a target network Q # In the input state S i # The probability corresponding to the target action in the action probability distribution output at the time, wherein the target action is the state S of the current network Q at the input i # The action corresponding to the maximum probability in the action probability distribution output at the time;
gamma represents an attenuation factor; r is R i Indicating the prize in the ith sample, i.e. the small cell is in state S i Execute action A at time i Is a reward of (a);
L(Q)=[Σ i=1 I (y i -Q(S i ,A i ,w n )) 2 +µ||w G T -w n || 2 /2]/I
Q(S i ,A i ,w n ) Action A in action probability distribution representing current network Q output i The corresponding probabilities; sigma represents summation, and the summation range is i E [1, I]The method comprises the steps of carrying out a first treatment on the surface of the The [ mu ] represents a regular term parameter; i W G T -w n || 2 Representing w G T -w n Is a binary norm of (2);
s36, updating the parameter w of the current network by combining gradient back propagation of L (Q) through the neural network n If when atThe parameter w of the target network is updated if the number of parameter updates of the previous network is an integer multiple of the set value c1 n # So that w n # =w n
S37, judging whether the AR reaches a set c3; if not, make S updated to S # AR is updated to AR+1, then the state S is input into the current network, the decision action A is selected by combining the action probability distribution output by the current network, and a sample { S, A, S ] is constructed by combining the state S and the decision action A # R, done } and store in experience playback set D; then returning to step S35; the local training of the cache scheduling model is complete.
S4, calculating the cosine distances between the gradients of different alternative base stations, clustering the cosine distances using the K-means algorithm (K-means clustering algorithm) so that the Z alternative base stations form K clusters, calculating the global network parameters of each cluster, the global network parameters of the kth cluster being denoted w_k^{T+1}, and replacing the network parameters of the cache scheduling model of each alternative base station in a cluster with the global network parameters w_k^{T+1} of that cluster:
w_k^{T+1} = w_G^T + ḡ_k^T
where ḡ_k^T denotes the average gradient of the alternative base stations contained in the kth cluster. In a specific operation, K may take the value 3.
In specific implementation, any existing clustering algorithm may be used in this step to form the K clusters from the Z alternative base stations.
Specifically, in this embodiment, the kth cluster on round T is denoted mBS_k^T; cluster mBS_k^T contains J_k^T alternative base stations, and the average gradient of the alternative base stations in cluster mBS_k^T is denoted ḡ_k^T; then:
cluster set mBS^T = {mBS_1^T, mBS_2^T, …, mBS_k^T, …, mBS_K^T}
intra-cluster base station number set J^T = {J_1^T, J_2^T, …, J_k^T, …, J_K^T}
intra-cluster average gradient set ḡ^T = {ḡ_1^T, ḡ_2^T, …, ḡ_k^T, …, ḡ_K^T}
where 1 ≤ k ≤ K.
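The clustering and per-cluster parameter update of this step can be sketched as follows (a hand-rolled K-means over cosine similarity in NumPy). The embodiment only requires K-means over the cosine distances, so this is one possible instantiation with illustrative identifiers, assuming Z ≥ K flattened gradient vectors.

import numpy as np

def cosine_kmeans(grads, k=3, iters=20, seed=0):
    g = np.stack(grads)
    g = g / np.linalg.norm(g, axis=1, keepdims=True)   # unit vectors, so the
    rng = np.random.default_rng(seed)                  # dot product is the cosine
    centers = g[rng.choice(len(g), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmax(g @ centers.T, axis=1)      # nearest center by cosine
        for j in range(k):
            members = g[labels == j]
            if len(members):                           # keep old center if empty
                c = members.mean(axis=0)
                centers[j] = c / np.linalg.norm(c)
    return labels

def cluster_parameters(w_global, grads, labels, k=3):
    # w_k^{T+1} = w_G^T + average gradient of the kth cluster (as reconstructed above)
    out = []
    for j in range(k):
        members = [grads[i] for i in np.flatnonzero(labels == j)]
        out.append(w_global + np.mean(members, axis=0) if members else w_global.copy())
    return out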
S5, arranging the Z alternative base stations in descending order of cumulative reward, the gradient of the alternative base station at the zth position being denoted G_z^T, 1 ≤ z ≤ Z, and constructing the alternative gradient set {G_z^T | 1 ≤ z ≤ Z}; taking the θ alternative base stations ranked highest in the alternative gradient set as target base stations, the gradient of the σth target base station being denoted G_σ^#, 1 ≤ σ ≤ θ, and constructing the target gradient set {G_σ^# | 1 ≤ σ ≤ θ}; θ is an integer value obtained by rounding α×Z, where rounding to the nearest integer, rounding up, rounding down, or the like may be employed; α is a set fairness and robustness control factor.
S6, let η^# = G_σ^#; the initial value of σ is 1;
S7, let η = G_z^T; the initial value of z is 1;
S8, judging whether η·η^# < 0 is satisfied; if yes, updating G_σ^# in the target gradient set to η^# − (η·η^#)·η/‖η‖², ‖η‖ denoting the two-norm of η, and then executing step S9; if not, executing step S9 directly;
S9, judging whether z = Z is satisfied; if not, updating z to z+1 and returning to step S7; if yes, executing the following step S10;
S10, judging whether σ = θ is satisfied; if not, updating σ to σ+1, reinitializing z, and returning to step S6; if yes, replacing the first θ gradients in the alternative gradient set one by one with the corresponding gradients in the target gradient set, calculating the gradient mean g_G^{T+1} over the alternative gradient set, and calculating w_G^{T+1} = w_G^T + g_G^{T+1}; then updating T to T+1 and returning to step S2; when T reaches the set value c4, updating the network parameters of the global scheduling model to w_G^{T+1} and fixing the global scheduling model and the cache scheduling models.
The invention is illustrated below with reference to specific examples.
In this embodiment, an edge network comprising 1 large base station and 30 small base stations in a wireless communication network is used for scene simulation; the cache capacity of each small base station is 30 contents and the total number of contents is 100. It is assumed that the content request sequences received by the small base stations numbered 1-15 obey Zipf's law with slope 1.2, i.e., zipf(1.2); the content request sequences received by the small base stations numbered 16-25 obey zipf(1.5) with slope 1.5; and the content request sequences received by the small base stations numbered 26-30 obey zipf(0.8) with slope 0.8.
In this embodiment, the above-mentioned edge network cache scheduling method is adopted to train the cache scheduling model of the small base station and the global scheduling model of the large base station, and in the training process, let z=25, k=3; c1 =100, c2=128, c3=200, c4=10; i=64;
in this embodiment, for convenience of presentation, the small base stations numbered 1-15 calculate average buffer hit rates under different fairness and robustness control factors α, the small base stations numbered 16-25 calculate average buffer hit rates under different fairness and robustness control factors α, and the small base stations numbered 26-30 calculate average buffer hit rates under different fairness and robustness control factors α, and specific statistical results are shown in fig. 3 and 4. The cache hit rate of the large base station under different fairness and robustness control factors α is shown in fig. 5. The cache hit rate of the base station is the ratio of the number of the content requested by the client in the coverage area of the base station in the cache content of the base station to the total number of the cache content of the base station.
As can be seen from FIG. 3 and FIG. 4, the larger α is, the lower the standard deviation of the small base stations' cache hit rates and the lower the overall average hit rate; that is, a larger α yields higher fairness in cache hit rate across different small base stations, while a smaller α yields higher cache accuracy for base stations whose request sequences obey Zipf's law with a particular slope. When α is 0, small base stations 1-15, whose content request sequences obey zipf(1.2) with slope 1.2, achieve an average cache hit rate of 0.781; when α is 0.6, small base stations 16-25, whose content request sequences obey zipf(1.5) with slope 1.5, achieve an average cache hit rate of 0.753; when α is 0.8, the average cache hit rates of small base stations 1-15, 16-25, and 26-30 all lie within the interval (0.66, 0.7). The clear linkage between the small base stations' cache hit rates and α demonstrates that the proposed cache scheduling method achieves good results on the small base stations.
FIG. 5 compares the cache hit rate of the proposed cache scheduling method on the large base station under different α values, as well as against other existing algorithms. With the proposed method, the smaller the α value and the larger the cache capacity of the large base station, the higher the large base station's hit rate; and the method's cache hit rate on the large base station is superior to existing edge network caching algorithms such as Least Recently Used (LRU), First-In First-Out (FIFO), and Random replacement.
The above embodiments are merely preferred embodiments of the present invention and are not intended to limit the present invention, and any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. An edge network cache scheduling method for edge network cache scheduling in a wireless communication network comprising a cloud server, a large base station and small base stations, wherein the edge network consists of the large base station and the small base stations within the coverage of the large base station; characterized in that the edge network cache scheduling method decides the cache content of the large base station through a global scheduling model, and decides the cache content of each small base station through a cache scheduling model in one-to-one correspondence with that small base station; the global scheduling model and the cache scheduling models are all neural network models; the input of the global scheduling model is the global state of the edge network where the large base station is located, and the output is the action probability distribution of the large base station; the input of a cache scheduling model is the state of the corresponding small base station, and the output is the action probability distribution of the corresponding small base station; the cache scheduling model at least comprises a network structure identical to that of the global scheduling model;
the state of the small base station comprises the content storage state of the small base station and the task request state of a client in the coverage area of the small base station; the actions of the small base station comprise content to be cached and content to be deleted; the state of the large base station is a set of states of all small base stations in a coverage area of the large base station, and the actions of the large base station comprise contents to be cached and contents to be deleted of the large base station;
the training process of the global scheduling model and the cache scheduling model is as follows:
s1, acquiring a cache scheduling model corresponding to each small base station and a global scheduling model corresponding to a large base station;
s2, randomly selecting part of small base stations from the small base stations as reference base stations, and configuring the parameters of the cache scheduling model corresponding to each reference base station as network parameters of the global scheduling model;
s3, carrying out local training on the cache scheduling model corresponding to each reference base station, selecting Z reference base stations which complete the local training of the cache scheduling model as alternative base stations, and acquiring gradients and accumulated rewards of each alternative base station;
s4, calculating cosine distances between gradients of different alternative base stations, clustering according to the cosine distances, and forming K clusters of Z alternative base stations, wherein K is smaller than or equal to 1 and smaller than or equal to 10; calculating the global network parameters of each cluster, and enabling the global network parameters of the kth cluster to be marked as w k T +1
Figure QLYQS_1
,/>
Figure QLYQS_2
Representing the average gradient of the candidate base stations contained in the kth cluster;w G T representing network parameters of the global scheduling model, T representing a round, T+1 representing a next round of T; replacing network parameters of a cache scheduling model of the alternative base station of each cluster with global network parameters w corresponding to the cluster k T+1 The method comprises the steps of carrying out a first treatment on the surface of the The initial value of T is 0;
s5, arranging Z alternative base stations according to a descending order of the accumulated rewards, and marking the gradient of the alternative base station positioned at the Z-th position as G z T 1 +.z, constructing the alternate gradient set { G- z T 1 +.z +.Z +.; acquiring the theta candidate base stations ordered earlier in the descending order of the jackpot prize as target base stations, and the gradient of the sigma target base station as G # σ 1 +.sigma +.theta, constructing the target gradient set { G- # σ 1 +.sigma +.θ }; θ is an integer value obtained by rounding α×z, α is a set fairness and robustness control factor;
s6, let eta # =G # σ The method comprises the steps of carrying out a first treatment on the surface of the The initial value of sigma is 1;
s7, let η=g z T The method comprises the steps of carrying out a first treatment on the surface of the The initial value of z is 1;
s8, judging whether the specification of eta multiplied by eta is satisfied # <0; if yes, G in the target gradient set # σ Updated to eta #2 ×η # /||η|| 2 ,||η|| 2 Representing the two norms of η, and then executing step S9; if not, executing step S9;
s9, judging whether Z is larger than or equal to Z; if not, updating z to z+1, and returning to the step S7; if yes, the following step S10 is executed;
s10, judging whether sigma is larger than or equal to theta; if not, the sigma is updated to sigma+1, z is initialized, and then the step S6 is returned; if yes, replacing the first theta gradients in the candidate gradient set one by one to be the corresponding gradients in the target gradient set, and calculating a gradient mean value g in the candidate gradient set G T+1 Calculating a transition term w G T+1 =w G T +g G T+1 The method comprises the steps of carrying out a first treatment on the surface of the Then updating T to T+1, returning to step S2, and updating the network parameters of the global scheduling model to w when the updating times of T reach the set value c4 G T+1 And fixes the global schedule model and the cache schedule model.
2. The edge network cache scheduling method of claim 1, wherein in S3 the gradient of an alternative base station is the difference between the network parameters of its cache scheduling model after completing local training and the network parameters of the current global scheduling model.
3. The edge network cache scheduling method of claim 2, wherein the cache scheduling model is composed of a current network and a target network, the current network and the target network having the same network structure as the global scheduling model; the local training of the cache scheduling model corresponding to the nth small base station B_n in S3 comprises steps S31 to S37; 1 ≤ n ≤ N, where N is the number of small base stations in the edge network;
S31, setting a cumulative reward NR_n and a sample accumulation value AR; the network parameters of the current network in the cache scheduling model corresponding to small base station B_n are denoted w_n and the network parameters of the target network are denoted w_n^#; updating the network parameters of the current network and the target network to the network parameters w_G of the current global scheduling model; initializing the state S, and initializing AR and NR_n to 0;
s32, inputting the state S into a current network, outputting action probability distribution by the current network, and selecting a decision action A by combining the action probability distribution;
s33, acquiring a state S after the state S executes the action A # A prize R; updating jackpot NR n Equal to NR n +R; the reward R is action reward for executing the action A when the small base station is in the state S;
s34, updating AR to AR+1, and constructing samples { S, A, S } # R, done is stored in a set experience playback set D; done is a task completion flag; if AR<c3, done=0; if ar=c3, done=1; c3 is a set value; judging whether the number of samples in the experience playback set reaches a set value c2 or not; if not, make S updated to S # Then returns to S32; if yes, the following step S35 is executed;
s35, sampling I samples from the experience playback set D, wherein the ith sample is named as { S } i ,A i ,S i # ,R i ,done i },S i Representing the state in the ith sample, A i Representing the action in the ith sample, S i # Representing the next state in the ith sample, R i Indicating the prize in the ith sample, done i Indicating that the task completion flag in the ith sample is 1 +.i; calculating a loss function L (Q) by combining the I samples;
s36, updating the parameter w of the current network by combining gradient back propagation of L (Q) through the neural network n The method comprises the steps of carrying out a first treatment on the surface of the If the number of parameter updating times of the current network is an integer multiple of the set value c1, updating the parameter w of the target network n # So that w n # =w n
S37, judging whether the AR reaches c3; if not, make S updated to S # AR is updated to AR+1, then the state S is input into the current network, the decision action A is selected by combining the action probability distribution output by the current network, and a sample { S, A, S ] is constructed by combining the state S and the decision action A # R, done } and store in experience playback set D; then returning to step S35; the local training of the cache scheduling model is complete.
4. The edge network cache scheduling method of claim 3, wherein in S32 an ε-greedy method is used to select the decision action according to the action probability distribution.
5. The edge network cache scheduling method of claim 3, wherein the action reward of small base station B_n on time slot t is denoted R(t, n); then:
R(t,n) = R_c(t,n) − R_d(t,n);
R_c(t, n) denotes the hit reward on time slot t+1 for the content cached by small base station B_n on time slot t, i.e., the number of contents that are cached by B_n on time slot t and requested on time slot t+1 by clients within its coverage;
R_d(t, n) denotes the negative penalty on time slot t+1 caused by the cache content deleted by B_n on time slot t, i.e., the number of contents that were cached by B_n on time slot t−1, deleted on time slot t, and requested on time slot t+1 by clients within its coverage.
6. The edge network cache scheduling method of claim 3, wherein the loss function in S35 is calculated as follows:
L(Q) = [Σ_{i=1}^{I} (y_i − Q(S_i, A_i, w_n))² + (µ/2)·‖w_G − w_n‖²] / I
Q(S_i, A_i, w_n) denotes the probability corresponding to action A_i in the action probability distribution output by the current network; Σ denotes summation over i ∈ [1, I]; µ denotes a regularization parameter; w_G denotes the network parameters of the current global scheduling model and w_n the network parameters of the cache scheduling model; ‖w_G − w_n‖ denotes the two-norm of w_G − w_n; y_i is the current target value;
when done_i = 1, y_i = R_i;
when done_i = 0,
y_i = R_i + γ·Q^#(S_i^#, a*, w_n^#), where a* = argmax_{a∈a^#} Q(S_i^#, a, w_n);
Q(S_i^#, a, w_n) denotes the probability corresponding to action a in the action probability distribution output by the current network for state S_i^#; max_{a∈a^#} Q(S_i^#, a, w_n) denotes the maximum probability in the action probability distribution output by the current network, a^# denoting the action traversal space; Q^#(S_i^#, a*, w_n^#) denotes the probability corresponding to the target action in the action probability distribution output by the target network for state S_i^#, the target action a* being the action corresponding to the maximum probability in the distribution output by the current network for input S_i^#;
γ denotes an attenuation factor; R_i denotes the reward in the ith sample, i.e., the reward for the small base station executing action A_i in state S_i.
7. The edge network cache scheduling method of claim 1, wherein in S4 the candidate base stations are clustered according to the cosine distances between their gradients using the K-means algorithm.
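One common way to realise this (an assumption; the claim does not fix the exact construction) is to L2-normalise each flattened gradient, after which Euclidean K-means is equivalent to clustering by cosine distance. A sketch using scikit-learn:

    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_base_stations(gradients, n_clusters=2, seed=0):
        # gradients: one flattened 1-D numpy array per candidate base station.
        g = np.stack(gradients)
        g = g / (np.linalg.norm(g, axis=1, keepdims=True) + 1e-12)  # unit vectors
        # On unit vectors, squared Euclidean distance = 2 * cosine distance,
        # so K-means here groups base stations by gradient direction.
        return KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(g)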
8. The edge network cache scheduling method of claim 1, wherein in the initial state the cache scheduling model and the global scheduling model are trained with historical data; during operation, the global scheduling model and the cache scheduling models are updated and retrained periodically, and when the global scheduling model is retrained, its network parameters are used as the initialization parameters of the cache scheduling models.
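In a parameter-server realisation (an assumption, not specified by the claim) this warm start is a plain parameter copy; a PyTorch sketch with illustrative names:

    def warm_start_local_models(global_net, local_nets):
        # Claim 8: after the global scheduling model is retrained, its parameters
        # initialise each base station's cache scheduling model.
        for net in local_nets:
            net.load_state_dict(global_net.state_dict())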
9. An edge network cache scheduling system, comprising a memory and a processor, wherein the memory stores a computer program, the processor is connected to the memory, and the processor is configured to execute the computer program to implement the edge network cache scheduling method according to any one of claims 1 to 8.
10. A storage medium storing a computer program which, when executed, is adapted to implement the edge network cache scheduling method of any one of claims 1 to 8.
CN202310465386.2A 2023-04-27 2023-04-27 Edge network cache scheduling method, system and storage medium Active CN116209015B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310465386.2A CN116209015B (en) 2023-04-27 2023-04-27 Edge network cache scheduling method, system and storage medium


Publications (2)

Publication Number Publication Date
CN116209015A 2023-06-02
CN116209015B CN116209015B (en) 2023-06-27

Family

ID=86514971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310465386.2A Active CN116209015B (en) 2023-04-27 2023-04-27 Edge network cache scheduling method, system and storage medium

Country Status (1)

Country Link
CN (1) CN116209015B (en)


Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109347925A (en) * 2014-12-31 2019-02-15 华为技术有限公司 Caching method, caching edges server, caching Core server and caching system
US20190014488A1 (en) * 2017-07-06 2019-01-10 Futurewei Technologies, Inc. System and method for deep learning and wireless network optimization using deep learning
CN107612987A (en) * 2017-09-08 2018-01-19 浙江大学 A kind of service provision optimization method based on caching towards edge calculations
EP3648436A1 (en) * 2018-10-29 2020-05-06 Commissariat à l'énergie atomique et aux énergies alternatives Method for clustering cache servers within a mobile edge computing network
CN110995858A (en) * 2019-12-17 2020-04-10 大连理工大学 Edge network request scheduling decision method based on deep Q network
WO2022139879A1 (en) * 2020-12-24 2022-06-30 Intel Corporation Methods, systems, articles of manufacture and apparatus to optimize resources in edge networks
US11595269B1 (en) * 2021-09-13 2023-02-28 International Business Machines Corporation Identifying upgrades to an edge network by artificial intelligence
CN113902021A (en) * 2021-10-13 2022-01-07 北京邮电大学 High-energy-efficiency clustering federal edge learning strategy generation method and device
CN114143891A (en) * 2021-11-30 2022-03-04 南京工业大学 FDQL-based multi-dimensional resource collaborative optimization method in mobile edge network
CN116010054A (en) * 2022-12-28 2023-04-25 哈尔滨工业大学 Heterogeneous edge cloud AI system task scheduling frame based on reinforcement learning
CN115809147A (en) * 2023-01-16 2023-03-17 合肥工业大学智能制造技术研究院 Multi-edge cooperative cache scheduling optimization method, system and model training method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
YIN Yufeng et al.: "Policy Gradient Method based Energy Efficience Task Scheduling in Mobile Edge Blockchain", 2020 IEEE 6th International Conference on Computer and Communications (ICCC) *
LIU Hao: "Intelligent Edge Task Offloading and Collaborative Scheduling", CNKI *
ZHANG Wenxian; DU Yongwen; ZHANG Xiquan: "Lightweight Task Offloading Optimization for Multi-user Mobile Edge Computing", Journal of Chinese Computer Systems, no. 10 *
FANG Xiaoyang: "Distributed Base Station Cache Replacement Strategy Based on Q-learning", Journal of Information Engineering University *

Also Published As

Publication number Publication date
CN116209015B (en) 2023-06-27

Similar Documents

Publication Publication Date Title
CN111031102A (en) Multi-user, multi-task mobile edge computing system cacheable task migration method
WO2023159986A1 (en) Collaborative caching method in hierarchical network architecture
CN110602653A (en) Pre-caching method based on track prediction
CN115344395B (en) Heterogeneous task generalization-oriented edge cache scheduling and task unloading method and system
CN115809147B (en) Multi-edge collaborative cache scheduling optimization method, system and model training method
CN115297170A (en) Cooperative edge caching method based on asynchronous federation and deep reinforcement learning
CN113158544B (en) Edge pre-caching strategy based on federal learning under vehicle-mounted content center network
CN113188544A (en) Unmanned aerial vehicle base station path planning method based on cache
CN108521640B (en) Content distribution method in cellular network
CN110913239B (en) Video cache updating method for refined mobile edge calculation
CN113918829A (en) Content caching and recommending method based on federal learning in fog computing network
CN114374949B (en) Information freshness optimization-based power control mechanism in Internet of vehicles
CN113141634B (en) VR content caching method based on mobile edge computing network
CN112702443B (en) Multi-satellite multi-level cache allocation method and device for satellite-ground cooperative communication system
CN116209015B (en) Edge network cache scheduling method, system and storage medium
CN111447506B (en) Streaming media content placement method based on delay and cost balance in cloud edge environment
CN115756873B (en) Mobile edge computing and unloading method and platform based on federation reinforcement learning
CN116362345A (en) Edge caching method and system based on multi-agent reinforcement learning and federal learning
CN116249162A (en) Collaborative caching method based on deep reinforcement learning in vehicle-mounted edge network
CN115587266A (en) Air-space-ground integrated internet intelligent edge caching method
Zhang et al. Cache-enabled adaptive bit rate streaming via deep self-transfer reinforcement learning
CN114980324A (en) Slice-oriented low-delay wireless resource scheduling method and system
CN113342529A (en) Mobile edge calculation unloading method based on reinforcement learning under cell-free large-scale multi-antenna architecture
CN117811846B (en) Network security detection method, system, equipment and medium based on distributed system
CN114035858B (en) Distributed computing unloading method for mobile edge computation under cell-free large-scale MIMO based on deep reinforcement learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant