CN109803292A

CN109803292A - A multi-secondary-user mobile edge computing method based on reinforcement learning

Info

Publication number: CN109803292A
Application number: CN201811597091.6A
Authority: CN
Inventors: 葛颂阳; 肖亮; 龚杰; 陈翔
Original assignee: SYSU CMU Shunde International Joint Research Institute; Research Institute of Zhongshan University Shunde District Foshan; National Sun Yat Sen University
Current assignee: Sun Yat Sen University; SYSU CMU Shunde International Joint Research Institute; Research Institute of Zhongshan University Shunde District Foshan; National Sun Yat Sen University
Priority date: 2018-12-26
Filing date: 2018-12-26
Publication date: 2019-05-24
Anticipated expiration: 2038-12-26
Also published as: CN109803292B

Abstract

The invention discloses a multi-secondary-user mobile edge computing method based on reinforcement learning. The method is based on prioritized sweeping under the Dyna architecture, is suited to mobile edge computing wireless communication environments with scarce spectrum resources and strict delay and power consumption requirements, and belongs to the field of wireless communication. The method mainly comprises four steps: first, the primary user selects an edge server; second, each secondary user submits to the control center an application to occupy an edge server; third, the control center processes the applications and allocates edge servers, and each secondary user performs either partial computation offloading or fully local computing; finally, the utility of each secondary user is calculated. The method applies reinforcement learning to mobile edge computing wireless networks and combines the model-free nature of Q-learning with the prioritized updates of prioritized sweeping, so that the delay and energy consumption requirements of each secondary user are met and resource utilization is improved while the overall utility of the system is maintained.

Description

A multi-secondary-user mobile edge computing method based on reinforcement learning

Technical field

The present invention relates to the field of wireless communication, and more specifically to a multi-secondary-user mobile edge computing method based on reinforcement learning, intended for mobile edge computing wireless communication environments with scarce spectrum resources and strict delay and power consumption requirements.

Background art

Traditional prioritized sweeping is applicable only to deterministic environments and cannot obtain good results in unknown environments. Methods based on traditional model-free Q-learning update all state-action values, computing every one of them repeatedly; this takes a long time, so the optimal policy for selecting an edge server and determining the amount of task to offload cannot be obtained quickly. To improve the spectrum utilization of the whole system, researchers have proposed spectrum sharing and spectrum resource allocation methods based on model assumptions and a known communication environment, but these methods cannot balance the utility of each secondary user against the convergence speed of the optimal edge server selection policy.

Summary of the invention

The object of the present invention is to solve the problem that traditional mobile edge computing methods cannot simultaneously satisfy users' delay and power consumption requirements, system utility, and the convergence speed of the optimal policy, by providing a multi-secondary-user mobile edge computing method based on prioritized sweeping under the Dyna architecture in reinforcement learning.

To achieve the above object, the technical solution provided by the present invention is as follows:

A multi-secondary-user mobile edge computing method based on reinforcement learning, characterized in that the method comprises the following steps:

S1. Initialize the system parameters: determine the number of primary users N_P, the number of secondary users N_S, the number of edge servers and control nodes N_M, the transmit power P of the secondary users, the task amount Task of each secondary user, and the channel capacity C of the channel between each secondary user and each edge server; the communication urgency E_m is zero. Initialize the method parameters: the state of the channel corresponding to each edge server is initially vacant, with value zero; the Q values are initialized to zero; the learning rate is α and the discount factor is δ; the priority function value is zero and the priority threshold is θ; the priority queue is empty. Start the iteration;

S2. The primary user selects an edge server M_P to occupy, and the state value of that server becomes 1;

S3. Secondary user i, following an ε-greedy policy, submits an application M_i1 to occupy edge server resources and determines the task amount x to offload;

S4. The control center processes the application of each secondary user and allocates an edge server to it;

S5. A secondary user that obtains edge server resources performs offloaded computing; a secondary user that does not obtain the right to occupy a server performs fully local computing;

S6. The current utility of each secondary user is calculated as the immediate return, and the return and the successfully connected edge server are updated into the priority model;

S7. Update the communication urgency, and update the Q value and the priority function; if the priority function value is higher than the threshold θ, add this state and selection to the priority queue, and update the corresponding Q values in priority order;

S8. Judge whether the iteration termination condition is met; if so, calculate the average utility of each secondary user after the whole method has executed; if not, go to step S2.
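The ε-greedy selection in step S3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `epsilon_greedy`, the Q-table row layout, and the joint action set `ACTIONS` (edge server index paired with an offload fraction) are assumptions, since the text does not specify how the selection and the offloaded task amount x are encoded.

```python
import random

def epsilon_greedy(q_row, epsilon, rng=random):
    """Pick an action index from one row of a Q-table.

    q_row   : list of Q values, one per candidate action (assumed layout)
    epsilon : exploration probability in [0, 1]
    """
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_row))
    # Exploit: pick the action with the largest Q value (first one on ties).
    return max(range(len(q_row)), key=lambda a: q_row[a])

# Hypothetical action set: (edge server index, fraction of the task to offload).
ACTIONS = [(m, x) for m in range(3) for x in (0.0, 0.5, 1.0)]

if __name__ == "__main__":
    q_row = [0.0] * len(ACTIONS)
    server, fraction = ACTIONS[epsilon_greedy(q_row, epsilon=0.1)]
    print("apply for server", server, "offloading fraction", fraction)
```

With all Q values at zero, as after the initialization in S1, the exploit branch simply returns the first action, so early behaviour is driven mostly by the exploration branch.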

Preferably, step S4 specifically comprises:

S41. The control center processes the application of each secondary user;

S42. If the channel state corresponding to an edge server is 1, the secondary users that applied for that server cannot obtain the right to use it; go to step S44;

S43. If an edge server receives only one application, it is occupied by the secondary user that applied for it, i.e., that user successfully occupies its originally requested server;

S44. If an edge server receives two or more applications, the applicants are ranked by communication urgency, and the user with the higher communication urgency preferentially obtains the right to use an edge server, i.e., occupies a server resource other than the one originally requested;

S45. A secondary user that has not obtained the right to use an edge server is randomly assigned an edge server that has received no application, which guarantees that either all edge servers are occupied or all secondary users obtain server resources; if the edge servers are fully occupied and some secondary users have not obtained server resources, those secondary users can only perform fully local computing.
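Steps S41 to S45 can be sketched as a single allocation routine. The wording of S44 is ambiguous in the text, so this sketch assumes one reading: on a contested server the applicant with the highest communication urgency wins, and the remaining applicants fall through to the random assignment of S45. The function name `allocate_servers` and the dictionary-based data layout are hypothetical.

```python
import random

def allocate_servers(requests, urgency, occupied, num_servers, rng=random):
    """Allocate edge servers to secondary users.

    requests    : dict user -> index of the requested edge server
    urgency     : dict user -> communication urgency
    occupied    : set of server indices held by the primary user (state 1)
    num_servers : total number of edge servers
    Returns dict user -> assigned server index, or None for fully local computing.
    """
    assignment = {user: None for user in requests}
    free = set(range(num_servers)) - set(occupied)

    # Group applicants by the server they requested.
    by_server = {}
    for user, server in requests.items():
        by_server.setdefault(server, []).append(user)

    for server, users in by_server.items():
        if server not in free:
            continue  # S42: server held by the primary user, nobody gets it
        # S43: a single applicant wins outright.
        # S44 (assumed reading): on contention the highest urgency wins.
        winner = max(users, key=lambda u: urgency[u])
        assignment[winner] = server
        free.discard(server)

    # S45: remaining users receive a random still-free server, if any is left.
    for user in requests:
        if assignment[user] is None and free:
            server = rng.choice(sorted(free))
            assignment[user] = server
            free.discard(server)
    return assignment

if __name__ == "__main__":
    print(allocate_servers({1: 0, 2: 0}, {1: 0, 2: 2}, {1}, 3))
```

Users left with `None` after the last loop are the ones that, per S45, can only compute locally.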

Preferably, step S6 specifically comprises:

S61. The utility of each secondary user comprises two parts, namely the computation delay and the computation energy consumption, to both of which the utility is inversely proportional;

S62. The utility value of a secondary user that performs local computing includes the local computing delay and the local computing energy consumption; the utility value of a secondary user that performs partial computation offloading includes the local computing delay, the local computing energy consumption, the offloading delay, and the offloading energy consumption.
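The text does not give the utility formula, so the following is only a sketch under stated assumptions: a commonly used mobile edge computing cost model (local delay = CPU cycles / clock frequency, local energy = κ·f²·cycles, offloading delay = offloaded bits / channel capacity, offloading energy = transmit power × offloading delay), with the utility taken as the negative weighted sum of the four costs named in S62. Every parameter name and default value below is hypothetical.

```python
def utility(task_bits, offload_fraction, capacity_bps, tx_power_w,
            cpu_hz=1e9, cycles_per_bit=1000.0, kappa=1e-27,
            w_delay=1.0, w_energy=1.0):
    """Utility of one secondary user for one time slot (assumed cost model).

    offload_fraction = 0.0 corresponds to fully local computing,
    values in (0, 1] to partial or full offloading.
    """
    local_bits = task_bits * (1.0 - offload_fraction)
    offload_bits = task_bits * offload_fraction

    # Local computing delay and energy consumption.
    local_cycles = local_bits * cycles_per_bit
    local_delay = local_cycles / cpu_hz
    local_energy = kappa * cpu_hz ** 2 * local_cycles

    # Offloading delay and energy consumption (zero when nothing is offloaded).
    offload_delay = offload_bits / capacity_bps if offload_bits > 0 else 0.0
    offload_energy = tx_power_w * offload_delay

    cost = (w_delay * (local_delay + offload_delay)
            + w_energy * (local_energy + offload_energy))
    # Higher delay or energy consumption means lower utility.
    return -cost

if __name__ == "__main__":
    print(utility(1e6, 0.0, 1e6, 0.1))  # fully local computing
    print(utility(1e6, 1.0, 1e6, 0.1))  # full offloading
```

A negative weighted sum is used here instead of a literal reciprocal so that the utility stays finite when a cost term is zero; that choice is an assumption of this sketch.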

Preferably, step S7 specifically comprises:

S71. The communication urgency is updated mainly according to whether the secondary user successfully obtained edge server resources: if the secondary user successfully occupies its originally requested server, its communication urgency remains unchanged; if the secondary user can use only an edge server other than the one it requested, its communication urgency E_m is increased by 1; if the secondary user obtains no server resources and can only compute fully locally, its E_m is increased by 2;

S72. Updating the Q value means that the error between the future discounted return and the current Q value, i.e., the prediction error, is scaled by the learning rate α and added to the current Q value; updating the priority function means taking the maximum of the prediction error and the current priority function value as the new priority function value.
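The updates in S71 and S72 can be sketched as two small functions. The table layout (`Q[state][action]` as nested lists), the function names, and the use of the absolute value of the prediction error for the priority are assumptions; the text only says that the maximum of the prediction error and the current priority function value is taken.

```python
def update_urgency(urgency, requested, assigned):
    """S71: raise the communication urgency when the request was not satisfied."""
    if assigned is None:
        return urgency + 2      # no server obtained, fully local computing
    if assigned != requested:
        return urgency + 1      # obtained a server other than the requested one
    return urgency              # obtained the originally requested server

def update_q_and_priority(Q, pri, state, action, reward, next_state, alpha, delta):
    """S72: one Q-learning step plus the priority function update.

    Q, pri : nested lists indexed as table[state][action] (assumed layout)
    alpha  : learning rate, delta : discount factor
    Returns the prediction error.
    """
    target = reward + delta * max(Q[next_state])
    error = target - Q[state][action]
    Q[state][action] += alpha * error
    # Assumption: the magnitude of the error is used as the priority candidate.
    pri[state][action] = max(abs(error), pri[state][action])
    return error

if __name__ == "__main__":
    Q = [[0.0, 0.0], [0.0, 1.0]]
    pri = [[0.0, 0.0], [0.0, 0.0]]
    err = update_q_and_priority(Q, pri, 0, 1, -1.0, 1, alpha=0.8, delta=0.02)
    print(err, Q[0][1], pri[0][1])
```

The returned prediction error is what would be compared against the threshold θ when deciding whether to add the state and selection to the priority queue.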

Compared with the prior art, the beneficial effects of the present invention are:

1. Compared with most traditional mobile edge computing methods, the multi-secondary-user mobile edge computing method based on reinforcement learning disclosed by the invention is suitable for time-varying environments, combines improved utility with fast convergence, and greatly improves resource utilization. It can meet the delay and energy consumption requirements of secondary users while satisfying the resource demands of the primary user, and is suited to mobile edge computing wireless environments with scarce spectrum resources and strict delay and power consumption requirements.

2. Compared with most spectrum resource allocation methods in which edge servers are selected by traditional Q-learning or at random, the method disclosed by the invention converges to the optimal policy faster.

Brief description of the drawings

The invention is further described below with reference to the accompanying drawings and examples.

Fig. 1 is a flowchart of the multi-secondary-user mobile edge computing method based on reinforcement learning proposed by the present invention;

Fig. 2 is a flowchart of the control center processing the secondary users' applications in the present invention;

Fig. 3 is a flowchart of calculating the secondary users' utility values in the present invention;

Fig. 4 is a flowchart of updating the communication urgency, the Q function, and the priority function in the present invention;

Fig. 5(a) compares the utility of secondary user 1 under plain Q-learning and under the method of the present invention;

Fig. 5(b) compares the utility of secondary user 2 under plain Q-learning and under the method of the present invention;

Fig. 5(c) compares the energy consumption of secondary user 1 under plain Q-learning and under the method of the present invention;

Fig. 5(d) compares the energy consumption of secondary user 2 under plain Q-learning and under the method of the present invention;

Fig. 5(e) compares the delay of secondary user 1 under plain Q-learning and under the method of the present invention;

Fig. 5(f) compares the delay of secondary user 2 under plain Q-learning and under the method of the present invention.

Detailed description of the embodiments

To make the objects, technical solutions, and advantages of the present invention clearer, the invention is further described below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and are not intended to limit it.

Embodiment one

This embodiment provides a multi-secondary-user mobile edge computing method based on reinforcement learning. The process flow of the method comprises:

S41. A secondary user that has not obtained the right to use an edge server is randomly assigned an edge server that has received no application, which guarantees that either all edge servers are occupied or all secondary users obtain server resources; if the edge servers are fully occupied and some secondary users have not obtained server resources, those secondary users can only perform fully local computing.

S42. A secondary user that obtains edge server resources performs offloaded computing; a secondary user that does not obtain the right to occupy a server performs fully local computing;

S6. The current utility of each secondary user is calculated as the immediate return, and the return and the successfully connected edge server are updated into the priority model. The utility value of a secondary user that performs local computing includes the local computing delay and the local computing energy consumption; the utility value of a secondary user that performs partial computation offloading includes the local computing delay, the local computing energy consumption, the offloading delay, and the offloading energy consumption.

S7. Update the communication urgency: if the secondary user successfully occupies its originally requested server, its communication urgency remains unchanged; if the secondary user can use only an edge server other than the one it requested, its communication urgency E_m is increased by 1; if the secondary user obtains no server resources and can only compute fully locally, its E_m is increased by 2. Update the Q value: the error between the future discounted return and the current Q value, i.e., the prediction error, is scaled by the learning rate α and added to the current Q value. Take the maximum of the prediction error and the current priority function value as the new priority function value; if the priority function value is higher than the threshold θ, add this state and selection to the priority queue, and update the corresponding Q values in priority order;

S8. Set 1000 iteration time slots and 500 independent experiments. Judge whether the iteration termination condition is met; if so, calculate the average utility of each secondary user after the whole method has executed; if not, go to step S2.

The key to this mobile edge computing method is to use reinforcement learning so that users can adapt to a changing environment while selecting a mobile edge server. Specifically, prioritized sweeping under the Dyna architecture in reinforcement learning is used. It combines the advantage of prioritized sweeping, which updates high-priority states first, with the advantage of Q-learning, which gains experience by interacting with the environment, to obtain the optimal mobile edge server selection policy while meeting users' delay and energy consumption requirements.
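The priority queue handling described above can be sketched with Python's `heapq` module. This is a simplified illustration rather than the patented procedure: the model is assumed to be a dictionary mapping (state, action) to the last observed (return, next state), and the predecessor-state propagation found in textbook prioritized sweeping is omitted. The names `enqueue_if_urgent` and `sweep` are hypothetical.

```python
import heapq

def enqueue_if_urgent(queue, priority, state, action, theta):
    """Add (state, action) to the queue when its priority exceeds theta."""
    if priority > theta:
        # heapq is a min-heap, so the negated priority is stored
        # to pop the highest-priority entry first.
        heapq.heappush(queue, (-priority, state, action))

def sweep(queue, Q, model, alpha, delta, max_updates=10):
    """Replay stored experience from the model in priority order.

    model : dict (state, action) -> (reward, next_state), filled from real
            interaction with the environment (assumed structure)
    Returns the number of Q updates performed.
    """
    updates = 0
    while queue and updates < max_updates:
        _, state, action = heapq.heappop(queue)
        reward, next_state = model[(state, action)]
        target = reward + delta * max(Q[next_state])
        Q[state][action] += alpha * (target - Q[state][action])
        updates += 1
    return updates

if __name__ == "__main__":
    Q = [[0.0, 0.0], [0.0, 0.0]]
    model = {(0, 1): (-1.0, 1), (1, 0): (-0.5, 0)}
    queue = []
    enqueue_if_urgent(queue, 0.98, 0, 1, theta=0.15)
    enqueue_if_urgent(queue, 0.10, 1, 0, theta=0.15)  # below threshold, skipped
    print(sweep(queue, Q, model, alpha=0.8, delta=0.02), Q)
```

Only entries above the threshold are replayed, which is how the approach avoids recomputing every state-action value in each time slot.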

Embodiment two

This embodiment describes the proposed reinforcement-learning-based mobile edge computing method in detail, with reference to Figs. 1 to 5 of the specification and a specific mobile edge computing example with two secondary users.

Consider the following system model: in the mobile edge computing process, 1 primary user and 2 secondary users choose among 3 edge servers to offload tasks. The task amount of secondary user 1 is Task₁ and that of secondary user 2 is Task₂; the channel capacities between the two secondary users and the three edge servers form the matrix C; the communication urgency is zero. The channel state corresponding to each edge server is zero, i.e., unoccupied. The Q values are zero, the learning rate is 0.8, the discount factor is 0.02, the priority function value is zero, the priority threshold is 0.15, and the priority queue is empty. Start the iteration.
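The initial conditions of this embodiment can be written down as a small configuration sketch. The numeric values come from the paragraph above; the container names (`Params`, `initialize`) and the choice of lazily zero-filled dictionaries for the Q values and priority function values are assumptions, since the text does not define the state or action spaces.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Params:
    num_primary: int = 1      # primary users
    num_secondary: int = 2    # secondary users
    num_servers: int = 3      # edge servers
    alpha: float = 0.8        # learning rate
    delta: float = 0.02       # discount factor
    theta: float = 0.15       # priority threshold

def initialize(params):
    """Build the all-zero starting state described for this embodiment."""
    n = params.num_secondary
    return {
        "channel_state": [0] * params.num_servers,   # 0 = unoccupied
        "urgency": [0] * n,                          # communication urgency
        "q": [defaultdict(float) for _ in range(n)],    # Q values default to 0
        "pri": [defaultdict(float) for _ in range(n)],  # priorities default to 0
        "queue": [[] for _ in range(n)],             # empty priority queues
    }

if __name__ == "__main__":
    state = initialize(Params())
    print(state["channel_state"], state["urgency"])
```

Keeping one Q table, priority table, and queue per secondary user reflects the assumption that each secondary user learns independently.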

In each time slot, the primary user first occupies an edge server, setting the state of its corresponding channel to 1; each secondary user then uses an ε-greedy policy to apply for an edge server and determines the amount of computation to offload. The control center processes the application of each secondary user and allocates edge servers, which guarantees that either all edge servers are occupied or all secondary users obtain server resources; if the edge servers are fully occupied and some secondary users have not obtained server resources, those secondary users can only perform fully local computing.

If local computing is performed, the local delay and energy consumption are calculated; if computation offloading is performed, two parts are calculated: the delay and energy consumption of local computing, and the delay and energy consumption of offloaded computing. The utility value, Q value, and priority function value are then updated.
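As a worked illustration of the Q value and priority function update with this embodiment's parameters (α = 0.8, δ = 0.02, θ = 0.15), suppose a secondary user in state s makes selection a while all values are still zero and receives a return r = -1. The return value, the symbols s, a, s', e, and p, and the use of the absolute error are assumptions made for illustration; the text does not give numerical returns.

```latex
\begin{aligned}
e &= r + \delta \max_{a'} Q(s', a') - Q(s, a) = -1 + 0.02 \cdot 0 - 0 = -1, \\
Q(s, a) &\leftarrow Q(s, a) + \alpha e = 0 + 0.8 \cdot (-1) = -0.8, \\
p(s, a) &\leftarrow \max\bigl(|e|,\, p(s, a)\bigr) = \max(1, 0) = 1 > \theta = 0.15 .
\end{aligned}
```

Because the new priority function value exceeds θ in this example, the state and selection would be added to the priority queue.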

In each time slot, the two secondary users affect each other through their edge server selections, and the goal of each is to maximize its utility value. The iteration terminates after 300 time slots in total, over 700 independent experiments.

The simulation results in Fig. 5(a) show that, for secondary user 1, the proposed reinforcement-learning-based mobile edge computing method converges about 50% faster in utility value than the method based on plain Q-learning. The simulation results in Fig. 5(b) show that, for secondary user 2, the proposed method converges about 30% faster in utility value than plain Q-learning. As shown in Figs. 5(c) and 5(d), the proposed method performs better than plain Q-learning in energy consumption for both secondary users. Figs. 5(e) and 5(f) show that the proposed method also performs better in computation delay.

In summary, in terms of utility value, computation energy consumption, and computation delay, the proposed method converges faster than plain Q-learning while keeping the optimal value unchanged and accounting for the performance of both secondary users.

The above embodiments are preferred embodiments of the present invention, but the embodiments of the invention are not limited to them. Any other change, modification, substitution, combination, or simplification made without departing from the spirit and principle of the present invention shall be regarded as an equivalent replacement and is included within the scope of protection of the present invention.

Claims

1. A multi-secondary-user mobile edge computing method based on reinforcement learning, characterized in that the method comprises the following steps:

S1. Initialize the system parameters: determine the number of primary users N_P, the number of secondary users N_S, the number of edge servers and control nodes N_M, the transmit power P of the secondary users, the task amount Task of each secondary user, and the channel capacity C of the channel between each secondary user and each edge server; the communication urgency E_m is zero. Initialize the method parameters: the state of the channel corresponding to each edge server is initially vacant, with value zero; the Q values are initialized to zero; the learning rate is α and the discount factor is δ; the priority function value is zero and the priority threshold is θ; the priority queue is empty. Start the iteration;

S3. Secondary user i, following an ε-greedy policy, submits an application M_i1 to occupy edge server resources and determines the task amount x to offload;

S5. A secondary user that obtains edge server resources performs offloaded computing; a secondary user that does not obtain the right to occupy a server performs fully local computing;

S6. The current utility of each secondary user is calculated as the immediate return, and the return and the successfully connected edge server are updated into the priority model;

S7. Update the communication urgency, and update the Q value and the priority function; if the priority function value is higher than the threshold θ, add this state and selection to the priority queue, and update the corresponding Q values in priority order;

S8. Judge whether the iteration termination condition is met; if so, calculate the average utility of each secondary user after the whole method has executed; if not, go to step S2.

2. The multi-secondary-user mobile edge computing method based on reinforcement learning according to claim 1, characterized in that step S4 specifically comprises:

S41. The control center processes the application of each secondary user;

S42. If the channel state corresponding to an edge server is 1, the secondary users that applied for that server cannot obtain the right to use it; go to step S44;

S43. If an edge server receives only one application, it is occupied by the secondary user that applied for it, i.e., that user successfully occupies its originally requested server;

S44. If an edge server receives two or more applications, the applicants are ranked by communication urgency, and the user with the higher communication urgency preferentially obtains the right to use an edge server, i.e., occupies a server resource other than the one originally requested;

S45. A secondary user that has not obtained the right to use an edge server is randomly assigned an edge server that has received no application, which guarantees that either all edge servers are occupied or all secondary users obtain server resources; if the edge servers are fully occupied and some secondary users have not obtained server resources, those secondary users can only perform fully local computing.

3. The multi-secondary-user mobile edge computing method based on reinforcement learning according to claim 1, characterized in that step S6 specifically comprises:

S61. The utility of each secondary user comprises two parts, namely the computation delay and the computation energy consumption, to both of which the utility is inversely proportional;

S62. The utility value of a secondary user that performs local computing includes the local computing delay and the local computing energy consumption; the utility value of a secondary user that performs partial computation offloading includes the local computing delay, the local computing energy consumption, the offloading delay, and the offloading energy consumption.

4. The multi-secondary-user mobile edge computing method based on reinforcement learning according to claim 1, characterized in that step S7 specifically comprises:

S71. The communication urgency is updated mainly according to whether the secondary user successfully obtained edge server resources: if the secondary user successfully occupies its originally requested server, its communication urgency remains unchanged; if the secondary user can use only an edge server other than the one it requested, its communication urgency E_m is increased by 1; if the secondary user obtains no server resources and can only compute fully locally, its E_m is increased by 2;

S72. Updating the Q value means that the error between the future discounted return and the current Q value, i.e., the prediction error, is scaled by the learning rate α and added to the current Q value; updating the priority function means taking the maximum of the prediction error and the current priority function value as the new priority function value.