CN115174681A - Method, equipment and storage medium for scheduling edge computing service request - Google Patents


Info

Publication number
CN115174681A
Authority
CN
China
Prior art keywords
service request
network
edge server
reward
edge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210685149.2A
Other languages
Chinese (zh)
Other versions
CN115174681B (en)
Inventor
李兵
赵玉琦
姜德纶
王健
李段腾川
Current Assignee
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202210685149.2A priority Critical patent/CN115174681B/en
Publication of CN115174681A publication Critical patent/CN115174681A/en
Application granted granted Critical
Publication of CN115174681B publication Critical patent/CN115174681B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04L — TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 — Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/14 — Network analysis or design
    • H04L 41/147 — Network analysis or design for predicting network behaviour

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a method, a device, and a storage medium for scheduling edge computing service requests, comprising the following steps: deciding, via a pointer network, the execution order of a plurality of service requests queued in an edge server; and optimizing the pointer network according to the edge server resource utilization, the service request running time, and the service request waiting time. The pointer network comprises an actor network and a critic network: the actor network decides the execution order of the service requests, while the critic network predicts subsequent decisions based on the decisions made by the actor network and assists the parameter updates of the actor network with its predicted values. The method effectively improves the resource utilization of the edge server, shortens the time needed to finish executing the service request sequence, and reduces the average waiting time of requests.

Description

Edge computing service request scheduling method, device and storage medium
Technical Field
The present invention relates to the field of edge computing application technologies, and in particular, to a method, a device, and a storage medium for scheduling an edge computing service request.
Background
In recent years, with the rapid development of the internet, cloud computing technology has been widely applied across industries. In some scenarios, however, the traditional cloud computing approach exposes its disadvantages. For example, in the rapidly developing internet of things (IoT), the traditional cloud computing approach offloads tasks by transmitting data to the cloud for processing and returning the result to the user terminal, so many services that are highly sensitive to communication delay cannot be responded to in time.
With the development of Docker and Kubernetes, micro-services can be deployed in a more flexible and convenient manner. Under the edge computing network architecture shown in fig. 2, deploying micro-services at the network edge greatly shortens task delay and effectively addresses delay sensitivity. However, the resources of an edge server are still very limited compared to the cloud, and how to fully utilize the service resources in the network edge environment and satisfy as many service requests as possible is one of the main problems faced by edge computing.
For an edge server, multiple users usually request its services at the same time, and once the query rate (queries per second, QPS) is sufficiently high, the limited resources of the edge server often make it unable to satisfy a large number of service requests simultaneously. The related art applies micro-service request scheduling strategies that queue service requests after they reach the edge server and arrange the execution order of the waiting requests so as to guarantee quality of service. These strategies, however, have the following problems:
First, only a single index is considered, such as the running time required by a task or the request waiting time, and a joint consideration of multiple indexes is lacking.
Second, heuristic algorithms are adopted for strategy optimization; since a heuristic algorithm needs long iteration to obtain a good result, it does not meet the fast-response requirement of edge computing.
Disclosure of Invention
The embodiment of the invention provides a method and a device for scheduling edge computing service requests, which are used for solving the problems in the related art.
In one aspect, an embodiment of the present invention provides a method for scheduling an edge computing service request, where the method includes:
deciding, by a pointer network, the execution order of the plurality of service requests queued in the edge server;
optimizing the pointer network according to the utilization condition of the edge server resources, the service request running time and the service request waiting time;
the pointer network comprises an actor network and a critic network, wherein the actor network is used for deciding the execution sequence of service requests, and the critic network is used for predicting subsequent decisions according to the decisions made by the actor network and assisting parameter updating of the actor network based on predicted values.
In some embodiments, the optimizing the actor network based on edge server resource utilization, service request runtime, and service request latency includes:
defining a reward function for reinforcement learning according to the resource utilization condition of the edge server, the service request running time and the service request waiting time, and training the actor network based on the reward function;
and taking the predicted value of the critic network as the value of the baseline function in the training process of the actor network so as to optimize the parameters of the actor network.
In some embodiments, the defining of the reinforcement learning reward function according to the edge server resource utilization, the service request running time, and the service request waiting time comprises the following steps: determining the reward function reward based on reward = α·reward_1 + β·reward_2 + γ·reward_3, wherein

reward_1 = (1/m) Σ_{j=1}^{m} [use(C_j) + use(O_j) + use(B_j) + use(M_j)] / 4

reward_2 = (1/m) Σ_{j=1}^{m} T_map_j

reward_3 = (1/N) Σ_{i=1}^{N} W_i

α, β, γ are weighting coefficients, use(·) denotes average resource utilization, C_j is the CPU capacity of the edge server, O_j is the I/O capacity of the edge server, B_j is the bandwidth capacity of the edge server, M_j is the memory capacity of the edge server, m is the total number of edge servers, N is the total number of service requests, W_i is the waiting time of service request i, and T_map_j is the total time required for edge server j to run all of its service requests.
In some embodiments, said training of the actor network based on the reward function comprises the following steps:

defining the policy gradient used for training, the policy gradient being computed as

∇_θ J(θ|Q) = E_{C~p_θ(·|Q)}[(reward(C_Q|Q) − b(Q)) · ∇_θ log p_θ(C_Q|Q)]

where θ is the parameter set of the actor network, ∇_θ denotes the gradient with respect to θ, J(θ|Q) is the optimization objective, E_{C~p_θ(·|Q)} denotes the mathematical expectation over all policies given the known service request set Q, p_θ denotes the policy distribution, reward(C_Q|Q) is the reward function value obtained when policy C_Q is taken for the known service request set Q, and b(Q) is a baseline function independent of policy C_Q, used to estimate the value of the reward so as to reduce the variance of the gradient.
In some embodiments, the method comprises the steps of: training the critic network by stochastic gradient descent, the stochastic gradient descent objective being:

l(θ_v) = (1/B) Σ_{k=1}^{B} ‖b_{θ_v}(Q_k) − reward(C_{Q_k}|Q_k)‖²

where b_{θ_v}(Q_k) is the predicted reward value, reward(C_{Q_k}|Q_k) is the reward value actually obtained by the actor network's decision, l(θ_v) is the stochastic gradient descent loss, θ_v are the critic network parameters, and B is the batch size.
In some embodiments, the actor network comprises an encoder and a decoder, each comprising a recurrent neural network composed of a plurality of long short-term memory networks; the actor network's decision on the execution order of the service requests comprises the following steps:
taking the queued service request sequence as an input sequence, converting it into a first intermediate vector, and inputting the vector into the encoder of the actor network to obtain the state of each hidden layer of the encoder;

inputting the states of the hidden layers of the encoder into the decoder to obtain the state of each hidden layer of the decoder, and obtaining a second intermediate vector from the decoder hidden layer states through the attention mechanism of the pointer network;

acquiring, based on the second intermediate vector, the probability that the decoder selects each service request at a given hidden layer as the output of that layer;

and selecting, at each hidden layer, the service request with the highest probability as the output of that layer, the order of executing the service requests defined by the outputs of all hidden layers being the output sequence of the edge server.
In some embodiments, before said deciding an execution order of the plurality of service requests queued in the edge server, further comprising:
for each service request, acquiring a set of edge servers capable of receiving the service request, and randomly selecting one edge server from the set of edge servers as an edge server for processing the service request.
In some embodiments, for each service request, obtaining a set of edge servers capable of receiving the service request includes:
according to π_i = {s_j | r_j ≥ ‖p_i − p_j‖_2, s_j ∈ S}, obtaining the set of edge servers that can receive the service request, where π_i is the set of all edge servers that can receive the service request, p_i is the coordinates of the service request, p_j is the coordinates of the edge server, r_j is the coverage radius of the edge server, s_j is the j-th edge server, and S is the set of all edge servers.
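A minimal sketch of this coverage test (the function and variable names are illustrative, not from the patent):

```python
import math

def coverage_set(request_pos, servers):
    # servers: iterable of (server_id, (x, y), coverage_radius);
    # server s_j can receive the request when its radius r_j is at
    # least the Euclidean distance ||p_i - p_j||_2 to the request
    return [sid for sid, pos, radius in servers
            if radius >= math.dist(request_pos, pos)]
```

The returned list is the set π_i from which a target server is later chosen at random.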
In a second aspect, an embodiment of the present invention further provides a computer-readable storage medium, where at least one instruction is stored in the storage medium, and the instruction is loaded and executed by a processor to implement the method according to any one of claims 1 to 8.
In a third aspect, an embodiment of the present invention further provides an apparatus, where the apparatus includes: at least one processor; and a memory coupled to the at least one processor, the memory containing instructions stored therein that, when loaded and executed by the processor, perform the method of any of claims 1-8.

The technical solution provided by the invention has the following beneficial effects:
the embodiment of the invention provides a method and a device for scheduling edge computing service requests, which are used for remarkably improving the service quality, effectively improving the resource utilization rate of an edge server, shortening the time required by the completion of the execution of a service request sequence and reducing the average waiting time of requests by extracting the main characteristics of task operation and combining and considering three indexes of resource utilization rate, operation time and waiting time in the service request process.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic flowchart of a method for scheduling an edge computing service request according to an embodiment of the present invention;
FIG. 2 is a block diagram of an edge computing network architecture according to an embodiment of the present invention;
fig. 3 is a flowchart illustrating an overall implementation of a scheduling policy according to an embodiment of the present invention;
fig. 4 is a flowchart illustrating a method for scheduling an edge computing service request according to an embodiment of the present invention;
FIG. 5 is a data comparison chart of experiment result 1 provided in the embodiment of the present invention;
fig. 6 is a data comparison graph of experimental result 2 provided in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
As shown in fig. 1, an embodiment of the present invention provides a method for scheduling an edge computing service request, including the steps of:

S100, deciding, by a pointer network, the execution order of the plurality of service requests queued in the edge server;

S200, optimizing the pointer network according to the edge server resource utilization, the service request running time, and the service request waiting time;

wherein the pointer network comprises an actor network and a critic network: the actor network is used for deciding the execution order of the service requests, and the critic network is used for predicting subsequent decisions according to the decisions made by the actor network and assisting the parameter updates of the actor network based on the predicted values.
It should be noted that, the edge server resource utilization condition may include CPU capacity, I/O capacity, bandwidth capacity, memory capacity, and the like of the edge server; the service request runtime may refer to the total time required for the edge server to perform all queued service requests; the service request latency may refer to the time interval from the arrival of the service request at the edge server to the completion of the processing.
It can be understood that the embodiment of the invention is different from the traditional heuristic algorithm in long-time iteration, and the pointer network based on artificial intelligence can make a quick decision and meet the requirement of time delay sensitivity in the edge computing environment. Meanwhile, in the optimization of the pointer network, a plurality of optimization targets including the resource utilization rate of the edge server, the service request running time and the service request waiting time are fully considered, optimization is carried out from a plurality of dimensions, and the service quality is improved.
In some embodiments, S200 includes the steps of:
s210, defining a reward function of reinforcement learning according to the utilization condition of the edge server resources, the service request running time and the service request waiting time, and training the actor network based on the reward function;
and S220, taking the predicted value of the critic network as the value of a baseline function in the training process of the actor network so as to optimize the parameters of the actor network.
It can be appreciated that the actor network can be trained continuously while making intelligent decisions, thereby avoiding the problem of reduced model effect due to fluctuations in service request data over time.
In the embodiment of the invention, the actor network is trained according to a plurality of optimization targets, the critic network is introduced to carry out auxiliary optimization on the actor network, and the predicted value of the critic network is used as a baseline function value to influence the parameter update of the actor network.
Further, S210 includes the steps of: determining the reward function reward based on reward = α·reward_1 + β·reward_2 + γ·reward_3, wherein

reward_1 = (1/m) Σ_{j=1}^{m} [use(C_j) + use(O_j) + use(B_j) + use(M_j)] / 4

reward_2 = (1/m) Σ_{j=1}^{m} T_map_j

reward_3 = (1/N) Σ_{i=1}^{N} W_i

α, β, γ are weighting coefficients, use(·) denotes average resource utilization, C_j is the CPU capacity of the edge server, O_j is the I/O capacity of the edge server, B_j is the bandwidth capacity of the edge server, M_j is the memory capacity of the edge server, m is the total number of edge servers, N is the total number of service requests, W_i is the waiting time of service request i, and T_map_j is the total time required for edge server j to run all of its service requests.
Further, the training of the actor network based on the reward function in S210 includes the steps of:

defining the policy gradient of the training, the policy gradient being computed as

∇_θ J(θ|Q) = E_{C~p_θ(·|Q)}[(reward(C_Q|Q) − b(Q)) · ∇_θ log p_θ(C_Q|Q)]

where θ is the parameter set of the actor network, ∇_θ denotes the gradient with respect to θ, J(θ|Q) is the optimization objective, E_{C~p_θ(·|Q)} denotes the mathematical expectation over all policies given the known service request set Q, p_θ denotes the policy distribution, reward(C_Q|Q) is the reward function value obtained when policy C_Q is taken given the service request set Q, and b(Q) is a baseline function independent of policy C_Q, used to estimate the value of the reward so as to reduce the variance of the gradient.
Further, when predicting subsequent decisions according to the decisions made by the actor network in S220, the critic network is trained by stochastic gradient descent, the objective being:

l(θ_v) = (1/B) Σ_{k=1}^{B} ‖b_{θ_v}(Q_k) − reward(C_{Q_k}|Q_k)‖²

where b_{θ_v}(Q_k) is the predicted reward value, reward(C_{Q_k}|Q_k) is the reward value actually decided by the actor network, l(θ_v) is the stochastic gradient descent loss, θ_v are the critic network parameters, and B is the batch size.
As shown in fig. 3, in some embodiments, the actor network includes an encoder and a decoder, and the encoder and the decoder each include a recurrent neural network composed of a plurality of long-short term memory networks, and the actor network makes a decision on the execution order of a plurality of service requests queued in the edge server, including the steps of:
s110, taking the service request sequence which is queued as an input sequence and converting the service request sequence into a first intermediate vector to be input into an encoder of the actor network to obtain the state of each hidden layer corresponding to the encoder;
s120, inputting the state of each hidden layer of the encoder into a decoder to obtain the state of each hidden layer of the decoder, and obtaining a second intermediate vector from the state of each hidden layer of the decoder through the attention mechanism of the actor network;
s130, acquiring the probability that the decoder selects each service request in a certain hidden layer as the output of the hidden layer based on the second intermediate vector;
and S140, each hidden layer selects a service request with the highest probability as the output of the layer, and defines the sequence of executing the service requests according to the output of all the hidden layers as the output sequence of the edge server.
It should be noted that in S110, an embedding operation may be performed on the input sequence to convert it into the first intermediate vector; the embedding compresses the data dimension so that it matches the network model. Furthermore, the number of encoder and decoder hidden layers depends on the input sequence length.
Preferably, the second intermediate vector may be represented as:

u_i^j = v^T · tanh(W_1·e_j + W_2·d_i), j = 1, ..., n

where e_j denotes the state of the j-th hidden layer of the encoder, d_i denotes the state of the i-th hidden layer of the decoder, and v^T, W_1, W_2 are all parameters to be trained by the pointer network.

Preferably, in S120, a softmax operation may be performed on u_i to obtain the probability that the decoder selects each service request at the i-th hidden layer as the output of that layer: p(C_i|C_1,...,C_{i−1},Q) = softmax(u_i), where p(C_i|C_1,...,C_{i−1},Q) denotes the probability vector of the input sequence at the i-th hidden layer of the decoder, with dimension equal to the input sequence length.
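The attention-and-softmax computation can be sketched as a single decoding step with NumPy (illustrative only; the dimensions, parameter values, and function name are assumptions):

```python
import numpy as np

def pointer_probs(enc_states, dec_state, v, W1, W2):
    # u_i^j = v^T tanh(W1 @ e_j + W2 @ d_i) for each encoder hidden
    # state e_j, then softmax over j: the probability that the decoder
    # picks input position j as the output of this hidden layer
    scores = np.array([v @ np.tanh(W1 @ e + W2 @ dec_state)
                       for e in enc_states])
    exp = np.exp(scores - scores.max())   # numerically stable softmax
    return exp / exp.sum()
```

The resulting vector has one entry per queued service request and sums to one, matching the dimension claim above.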
In some embodiments, before the decision, for each service request, a set of edge servers capable of receiving the service request is obtained, and one edge server is randomly selected from the set of edge servers as an edge server for processing the service request.
Preferably, obtaining the set of edge servers capable of receiving the service request comprises the following steps: according to π_i = {s_j | r_j ≥ ‖p_i − p_j‖_2, s_j ∈ S}, obtaining the set of edge servers that can receive the service request, where π_i is the set of all edge servers that can receive the service request, p_i is the coordinates of the service request, p_j is the coordinates of edge server s_j, r_j is the coverage radius of edge server s_j, s_j is the j-th edge server, and S is the set of all edge servers.
As shown in fig. 4, in a specific embodiment, an edge computing service request scheduling method includes the steps of:
step S1, edge server information and service request data under a real environment are obtained, data preprocessing is respectively carried out on the characteristics of the edge server information and the service request data, and preprocessed edge server information and service request data representation are obtained.
And S2, on the basis of the step S1, distributing the service request to different edge servers for processing according to the coverage area of the edge servers and the service request initiating position.
And S3, on the basis of the step S2, when a plurality of service requests are queued inside the edge server, making a decision on the execution sequence of the service requests by using a pointer network model.
And S4, on the basis of the step S3, training the pointer network model by using a reinforcement learning mode while making a decision on the pointer network model, so that a better decision effect is achieved.
Further, step S1 includes:
s1.1, analyzing edge server information in a real scene, and enabling a single edge server S j Represented as a quadruple: s j =(C j ,O j ,B j ,M j ) Wherein, C j Representing CPU capacity of edge servers, O j Indicating the I/O capacity of the edge server, B j Representing the bandwidth capacity of the edge server, M j The memory capacity of the edge server is represented, and the edge server set S is defined as: s = { S = 1 ,s 2 ,...,s m }。
S1.2, analyzing the service request data in the real scene and representing a single service request q_i as a seven-tuple: q_i = (c_i, o_i, b_i, m_i, T_i, t_i, π_i), where the first four dimensions (c_i, o_i, b_i, m_i) respectively denote the CPU, I/O, bandwidth, and memory required to execute request q_i, T_i denotes the timestamp at which q_i is initiated, t_i denotes the time required to run q_i, and π_i denotes the set of all edge servers that can receive the service request; the service request set Q is defined as: Q = {q_1, q_2, ..., q_N}.
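For illustration only (class and field names are assumptions, not the patent's data model), the quadruple and seven-tuple representations could be modeled as:

```python
from dataclasses import dataclass, field

@dataclass
class EdgeServer:
    # s_j = (C_j, O_j, B_j, M_j)
    cpu: float
    io: float
    bandwidth: float
    memory: float

@dataclass
class ServiceRequest:
    # q_i = (c_i, o_i, b_i, m_i, T_i, t_i, pi_i)
    cpu: float
    io: float
    bandwidth: float
    memory: float
    arrival_ts: float              # T_i: timestamp of request initiation
    run_time: float                # t_i: time needed to run the request
    candidates: list = field(default_factory=list)  # pi_i: reachable servers
```

The candidates field is filled in step S2.1 from the coverage relation.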
Further, step S2 includes:
s2.1, on the basis of step S1.2, pi i The statistical result of the coverage relation of the edge server to the service request is pi i Expressed as: pi i ={s j |r j ≥||p i -p j || 2 ,s j E.g., S }, wherein p i Coordinates, p, representing the service request j Representing edge servers s j Coordinate of (a), r j Representing edge servers s j The radius of coverage of.
S2.2, for each service request q i From it pi i One edge server is randomly selected from the set to serve as a target edge server for processing the service request, and a plurality of micro service requests distributed to the same edge server are queued inside the edge server.
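The random dispatch and per-server queuing of S2.2 can be sketched as follows (the names and the seeded generator are illustrative assumptions):

```python
import random
from collections import defaultdict

def assign_requests(requests, seed=None):
    # requests: list of (request_id, candidate_server_ids), where the
    # candidates are the coverage set pi_i of the request; each request
    # is sent to a uniformly random server from its set, and requests
    # dispatched to the same server queue up together in arrival order
    rng = random.Random(seed)
    queues = defaultdict(list)
    for rid, candidates in requests:
        queues[rng.choice(candidates)].append(rid)
    return dict(queues)
```

Each resulting queue is the input sequence that the pointer network later reorders in step S3.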
Further, as shown in fig. 3, the pointer network model is divided into two parts, an actor network and a critic network, wherein the actor network is used for deciding the execution sequence of the service request and consists of an encoder and a decoder, wherein the encoder and the decoder are respectively a Recurrent Neural Network (RNN), and the critic network is used for assisting the actor network to train and consists of a recurrent neural network and a Deep Neural Network (DNN). The step S3 comprises the following steps:
step S3.1, the sequence of service requests being queued is input into the pointer network model as input data.
Step S3.2, based on step S3.1, the input data is finally decided by the actor network to obtain an execution sequence of the service request, and the specific decision process is as follows:
s3.2_1, an Embedding operation is carried out on the input sequence to convert the input sequence into an intermediate vector representation form.
S3.2_2, the intermediate vector passes through a Recurrent Neural Network (RNN) consisting of a plurality of Long Short Term Memory (LSTM) networks to obtain the state of each hidden layer of the corresponding encoder, and the number of the hidden layers of the encoder and the decoder depends on the length of an input sequence.
S3.2_3, the encoder states are taken as the input of the decoder, and the attention mechanism of the pointer network model yields:

u_i^j = v^T · tanh(W_1·e_j + W_2·d_i), j = 1, ..., n

where e_j denotes the state of the j-th hidden layer of the encoder and d_i denotes the state of the i-th hidden layer of the decoder; a softmax operation is then performed on these scores to obtain: p(C_i|C_1,...,C_{i−1},Q) = softmax(u_i), where p(C_i|C_1,...,C_{i−1},Q) denotes the probability vector of the input sequence at the i-th hidden layer of the decoder, whose dimension equals the input sequence length, i.e., the probability that the decoder picks each service request as the output of this layer at the i-th hidden layer.
And S3.2_4, each hidden layer selects a service request with the highest probability as the output of the layer, and as the number of the hidden layers of the decoder is equal to the length of the input sequence, a new sequence with the length equal to that of the input sequence can be finally formed as an output sequence.
S3.2_5 the edge server processes the service requests in the order of execution defined by this output sequence.
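The greedy selection in S3.2_4 can be sketched as below. Note that the mask over already-chosen requests is an assumption: the patent text only states that each hidden layer picks the highest-probability request, and masking is the standard way to guarantee the output is a permutation of the input sequence:

```python
def greedy_decode(score_fn, n):
    # score_fn(i) returns one score per input position for decoder
    # hidden layer i; each layer greedily picks the best not-yet-chosen
    # request, so after n layers the output is a permutation of 0..n-1
    order, remaining = [], set(range(n))
    for i in range(n):
        scores = score_fn(i)
        pick = max(remaining, key=lambda j: scores[j])
        order.append(pick)
        remaining.remove(pick)
    return order
```

The returned index order is the execution order in which the edge server processes its queued requests.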
Further, in step S4, the pointer network model is trained in a reinforcement learning manner while making decisions, specifically including the following substeps:

S4.1, the optimization target is first expressed functionally so that it covers the three aspects of resource utilization, running time, and average waiting time:

S4.1_1: the resource utilization optimization objective is expressed as:

reward_1 = (1/m) Σ_{j=1}^{m} [use(C_j) + use(O_j) + use(B_j) + use(M_j)] / 4

where the resource utilization characterizes the average usage efficiency of the edge servers while processing service requests, and use(C_j), use(O_j), use(B_j), use(M_j) respectively denote the average CPU, I/O, bandwidth, and memory utilization of the j-th edge server when processing the whole input sequence.

S4.1_2: the running time optimization objective is expressed as:

reward_2 = (1/m) Σ_{j=1}^{m} T_map_j

where the running time T_map_j denotes the total time required for edge server j to execute all of its queued service requests.

S4.1_3: the waiting time optimization objective is expressed as:

reward_3 = (1/N) Σ_{i=1}^{N} W_i

where the waiting time denotes the time from the arrival of a service request at the edge server to the completion of its processing, with W_i denoting the waiting time of service request i.
S4.2, on the basis of step S4.1, the reinforcement learning reward function is obtained by weighted averaging: reward = α·reward_1 + β·reward_2 + γ·reward_3.
S4.3, the actor network is trained on the basis of step S4.2, and a policy-gradient-based reinforcement learning method is selected to optimize the pointer network parameters. The parameters of the pointer network are denoted θ, and reward(C_Q|Q) denotes the reward function value obtained when policy C_Q is taken for the known service request set Q. The expectation of the reward function value is defined as follows:

J(θ|Q) = E_{C~p_θ(·|Q)}[reward(C_Q|Q)]

where J(θ|Q) represents the optimization target of the pointer network: minimizing the expectation of the reward. The policy gradient is implemented as:

∇_θ J(θ|Q) = E_{C~p_θ(·|Q)}[(reward(C_Q|Q) − b(Q)) · ∇_θ log p_θ(C_Q|Q)]

where b(Q) is a baseline function independent of the policy C_Q taken; its effect is to reduce the variance of the gradient by estimating the value of the reward. In the embodiment of the invention, the predicted value of the critic network is selected as the value of b(Q) and used to assist the training of the actor network.
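As a numerical illustration of the baselined estimator (a minimal Monte-Carlo sketch over a batch of sampled decisions; the function name and the batched form are assumptions, not the patent's implementation):

```python
import numpy as np

def pg_estimate(logp_grads, rewards, baseline):
    # Monte-Carlo estimate of the policy gradient with a baseline:
    # (1/B) * sum_k (reward_k - b(Q)) * grad_theta log p_theta(C_k | Q).
    # Subtracting the baseline shifts each sample's weight without
    # changing the expectation, which reduces the estimator's variance.
    adv = np.asarray(rewards, dtype=float) - baseline
    grads = np.asarray(logp_grads, dtype=float)
    return (adv[:, None] * grads).mean(axis=0)
```

Samples whose reward beats the baseline push the parameters one way; worse-than-baseline samples push the other way.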
And S4.4, training the critic network on the basis of the step S4.3, predicting the result of the critic network on the basis of the decision made by the actor network, and taking the predicted value as the value of the baseline function in the actor network training process so as to assist the actor network in updating the parameters of the network. Preferably, the training is performed in a random gradient descent mode:
l(θ_v) = ( b_{θ_v}(Q) − reward(C_Q | Q) )²

wherein b_{θ_v}(Q) is the reward value predicted by the critic network, and reward(C_Q | Q) is the actual reward value of the decision made by the actor network.
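A minimal sketch of this critic update, with a simple linear model standing in for the critic network (the real critic is a neural network; all names and data below are illustrative assumptions):

```python
# Sketch of the critic update from step S4.4: the critic regresses toward
# the actual reward by minimizing l = (prediction - actual_reward)^2 with
# stochastic gradient descent.  A linear model stands in for the network.
def critic_sgd_step(theta_v, features, actual_reward, lr=0.1):
    """One SGD step on l(theta_v) = (prediction - actual_reward)^2."""
    prediction = sum(w * x for w, x in zip(theta_v, features))
    error = prediction - actual_reward
    # d l / d theta_v[k] = 2 * error * features[k]
    return [w - lr * 2.0 * error * x for w, x in zip(theta_v, features)]

theta_v = [0.0, 0.0]
for _ in range(100):
    theta_v = critic_sgd_step(theta_v, features=[1.0, 0.5], actual_reward=0.9)
prediction = theta_v[0] + 0.5 * theta_v[1]
print(round(prediction, 3))  # converges toward the observed reward 0.9
```

Because the loss is the squared error between predicted and actual reward, repeated SGD steps drive the critic's prediction toward the observed rewards, which is what makes it a useful baseline b(Q) for the actor.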
The embodiment of the invention has the following advantages:
1) Data characteristics of edge servers and service requests in real environments are extracted, making the model general and closer to actual conditions.
2) Multiple optimization objectives are fully considered, including resource utilization, running time, and waiting time; optimizing across multiple dimensions improves service quality.
3) Unlike traditional heuristic algorithms that require lengthy iteration, the artificial-intelligence-based pointer network model makes fast decisions, meeting the delay-sensitivity requirements of edge computing environments.
4) The model can continue to be trained while making intelligent decisions, mitigating the degradation in model performance caused by fluctuation of service request data over time.
In a specific embodiment, the edge-computing-oriented service request scheduling method is used to make decisions and its effect is verified. First, an edge computing simulation environment is built, and edge server data and service request data are generated from the real geographical position information provided by the EUA dataset. Then, the Google cluster trace dataset is used as the simulation data for service requests, 400,000 records in total. The pointer network model is trained with 300,000 records as the training set and 100,000 records as the test set.
Applying the pointer network model to the actual scheduling process, two groups of experiments are designed. The first group fixes 5 edge servers and varies the number of service requests over 300-350, 350-400, 400-450, and 450-500; the second group fixes the number of service requests at 500 and varies the number of edge servers over 5, 7, 9, 11, 13, and 15.
According to the coverage range of each edge server, the service requests within the coverage range are evenly and randomly distributed among the edge servers capable of processing them.
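The coverage-based dispatch described above (an edge server s_j is eligible for a request at p_i when r_j ≥ ||p_i − p_j||_2, and one eligible server is then chosen at random) can be sketched as follows; server names and coordinates are illustrative:

```python
import math
import random

# Sketch of coverage-based dispatch: a request at point p_i may be served by
# edge server s_j if the Euclidean distance ||p_i - p_j||_2 is within the
# server's coverage radius r_j; one eligible server is picked at random.
def eligible_servers(p_i, servers):
    """servers: list of (name, (x, y), radius); returns names covering p_i."""
    return [name for name, p_j, r_j in servers
            if r_j >= math.dist(p_i, p_j)]

def dispatch(p_i, servers, rng=random):
    candidates = eligible_servers(p_i, servers)
    return rng.choice(candidates) if candidates else None

servers = [("s1", (0.0, 0.0), 5.0), ("s2", (10.0, 0.0), 3.0)]
print(eligible_servers((2.0, 1.0), servers))  # only s1 covers this point
```

A request outside every coverage radius yields no candidate, so `dispatch` returns `None`; in a full system such requests would be queued or forwarded elsewhere.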
Within each edge server, the scheduling policy is applied to schedule the queued service request sequence; three indexes are calculated, namely the resource utilization of the edge server, the service request running time, and the average service request waiting time, and finally the effectiveness of the experimental scheme is verified.
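For a single server with all requests arriving at time zero, two of these indexes can be computed directly from the execution order: the waiting time W_i of each request is its completion time, and the server's running time is the makespan. A minimal sketch with illustrative processing times:

```python
# Sketch of the running-time and average-waiting-time metrics for one server:
# requests are processed back-to-back in the given order; all arrive at t=0,
# so W_i equals the completion time of request i.  Data are illustrative.
def schedule_metrics(processing_times):
    """Returns (total running time, average waiting time) for the given order."""
    clock = 0.0
    completions = []
    for p in processing_times:
        clock += p
        completions.append(clock)  # W_i = completion time when arrival is t=0
    return clock, sum(completions) / len(completions)

makespan, avg_wait = schedule_metrics([3.0, 1.0, 2.0])
print(makespan, avg_wait)  # makespan 6.0, average wait (3+4+6)/3
print(schedule_metrics([1.0, 2.0, 3.0])[1])  # (1+3+6)/3, a lower average
```

This also shows why the execution order matters: reordering the same three requests changes the average waiting time while the total running time stays fixed.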
When verifying the effectiveness of the experimental scheme, the proposed scheduling policy (named RLPNet) is compared with several baseline scheduling policies. The baselines include: the first-come-first-served scheduling algorithm (FCFS), the highest-response-ratio-next scheduling algorithm (HRRN), an online Q-learning-based reinforcement learning scheduling algorithm (OnPQ), and an online delay-sensitive task scheduling algorithm (OnDisc).
As shown in fig. 4 and 5, the experimental results show that the scheduling policy RLPNet proposed by the present invention outperforms the other four comparison methods in resource utilization, running time, and average waiting time.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, where at least one instruction is stored, and the instruction is loaded and executed by a processor to implement all the method steps in the method embodiment of the present invention.
In addition, an embodiment of the present invention further provides an apparatus, where the apparatus includes: at least one processor; and a memory coupled to the at least one processor, the memory containing instructions stored therein which when loaded and executed by the processor implement all of the method steps in a method embodiment of the invention.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable storage media, which may include computer readable storage media (or non-transitory media) and communication media (or transitory media).
The above embodiments are only specific embodiments of the present invention, but the scope of the embodiments of the present invention is not limited thereto, and any person skilled in the art can easily think of various equivalent modifications or substitutions within the technical scope of the embodiments of the present invention, and these modifications or substitutions should be covered by the scope of the embodiments of the present invention. Therefore, the protection scope of the embodiments of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. An edge computing service request scheduling method, comprising the steps of:
deciding, by a pointer network, the execution order of a plurality of service requests queued in an edge server;
optimizing the pointer network according to the utilization condition of the edge server resources, the service request running time and the service request waiting time;
the pointer network comprises an actor network and a critic network, wherein the actor network is used for deciding the execution order of service requests, and the critic network is used for making predictions according to the decisions made by the actor network and for assisting parameter updating of the actor network based on the predicted values.
2. The method as claimed in claim 1, wherein the optimizing the pointer network according to the utilization condition of the edge server resources, the service request running time and the service request waiting time comprises:
defining a reward function of reinforcement learning according to the resource utilization condition of the edge server, the service request running time and the service request waiting time, and training the actor network based on the reward function;
and taking the predicted value of the critic network as the value of a baseline function in the training process of the actor network so as to optimize the parameters of the actor network.
3. The method for scheduling edge computing service request according to claim 2, wherein the step of defining the reinforcement learning reward function according to the edge server resource utilization, the service request running time and the service request waiting time comprises the steps of:
based on reward = α·reward_1 + β·reward_2 + γ·reward_3, determining the reward function reward, wherein reward_1, reward_2 and reward_3 are respectively the resource-utilization, running-time and waiting-time terms [their formulas appear as images in the original publication]; α, β, γ are weighting coefficients, C_j is the CPU capacity of the edge server, O_j is the I/O capacity of the edge server, B_j is the bandwidth capacity of the edge server, M_j is the memory capacity of the edge server, m is the total number of edge servers, W_i is the waiting time of service request i, and T_map_j is the total time required for edge server j to run all service requests.
4. The method of claim 2, wherein the training the actor network based on the reward function comprises:
defining a policy gradient for training, the policy gradient being executed as:

J(θ | Q) = E_{C ~ p_θ(·|Q)} [ reward(C_Q | Q) ]

∇_θ J(θ | Q) = E_{C ~ p_θ(·|Q)} [ (reward(C_Q | Q) − b(Q)) · ∇_θ log p_θ(C_Q | Q) ]

wherein θ is a parameter of the actor network, ∇_θ denotes the gradient with respect to θ, J(θ | Q) is the optimization objective, E_{C ~ p_θ(·|Q)} denotes the mathematical expectation over all policies given the known service request set Q, p_θ denotes the policy distribution, reward(C_Q | Q) is the value of the reward function when policy C_Q is taken for the known service request set Q, and b(Q) is a baseline function independent of policy C_Q, used to estimate the value of reward so as to reduce the variance of the gradient.
5. The method for scheduling edge computing service requests according to claim 2, comprising the step of: training the critic network by stochastic gradient descent, the stochastic gradient descent being as follows:

l(θ_v) = ( b_{θ_v}(Q) − reward(C_Q | Q) )²

wherein b_{θ_v}(Q) is the predicted reward value, reward(C_Q | Q) is the reward value actually decided by the actor network, l(θ_v) is the stochastic gradient loss, and θ_v is the critic network parameter.
6. The method as claimed in claim 1, wherein the actor network comprises an encoder and a decoder, and the encoder and the decoder each comprise a recurrent neural network composed of a plurality of long-short term memory networks, and the actor network, when deciding the execution sequence of the service requests, comprises the steps of:
taking the queued service request sequence as an input sequence, converting it into a first intermediate vector, and inputting it into the encoder of the actor network to obtain the state of each hidden layer of the encoder;
inputting the state of each hidden layer of the encoder into a decoder to obtain the state of each hidden layer of the decoder, and obtaining a second intermediate vector from the state of each hidden layer of the decoder through the attention mechanism of the pointer network;
based on the second intermediate vector, acquiring the probability that a decoder selects each service request in a certain hidden layer as the output of the layer;
each hidden layer selects the service request with the highest probability as the output of that layer, and the sequence of service request executions defined by the outputs of all hidden layers is taken as the output sequence of the edge server.
7. The method as claimed in claim 1, wherein before said deciding the execution order of the plurality of service requests queued in the edge server, further comprising:
for each service request, acquiring a set of edge servers capable of receiving the service request, and randomly selecting one edge server from the set of edge servers as an edge server for processing the service request.
8. The method for scheduling edge computing service requests according to claim 7, wherein the step of obtaining, for each service request, a set of edge servers capable of receiving the service request comprises the steps of:
according to π_i = { s_j | r_j ≥ ||p_i − p_j||_2 , s_j ∈ S }, obtaining the set of edge servers capable of receiving the service request, wherein π_i is the set of all edge servers capable of receiving service request i, p_i is the coordinate of the service request, p_j is the coordinate of the edge server, r_j is the coverage radius of the edge server, s_j is the j-th edge server, and S is the set of all edge servers.
9. A computer-readable storage medium having stored therein at least one instruction which is loaded and executed by a processor to implement the method of any one of claims 1-8.
10. An apparatus, characterized in that the apparatus comprises: at least one processor; and a memory coupled to the at least one processor, the memory containing instructions stored therein that when loaded and executed by the processor, perform the method of any of claims 1-8.
CN202210685149.2A 2022-06-14 2022-06-14 Method, equipment and storage medium for scheduling edge computing service request Active CN115174681B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210685149.2A CN115174681B (en) 2022-06-14 2022-06-14 Method, equipment and storage medium for scheduling edge computing service request


Publications (2)

Publication Number Publication Date
CN115174681A true CN115174681A (en) 2022-10-11
CN115174681B CN115174681B (en) 2023-12-15

Family

ID=83486015

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210685149.2A Active CN115174681B (en) 2022-06-14 2022-06-14 Method, equipment and storage medium for scheduling edge computing service request

Country Status (1)

Country Link
CN (1) CN115174681B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109976909A (en) * 2019-03-18 2019-07-05 中南大学 Low delay method for scheduling task in edge calculations network based on study
CN112637806A (en) * 2020-12-15 2021-04-09 合肥工业大学 Transformer substation monitoring system based on deep reinforcement learning and resource scheduling method thereof
CN113778648A (en) * 2021-08-31 2021-12-10 重庆理工大学 Task scheduling method based on deep reinforcement learning in hierarchical edge computing environment
CN113822456A (en) * 2020-06-18 2021-12-21 复旦大学 Service combination optimization deployment method based on deep reinforcement learning in cloud and mist mixed environment
US20210400277A1 (en) * 2021-09-01 2021-12-23 Intel Corporation Method and system of video coding with reinforcement learning render-aware bitrate control
CN114328291A (en) * 2021-12-18 2022-04-12 中国科学院深圳先进技术研究院 Industrial Internet edge service cache decision method and system
CN114500405A (en) * 2021-12-27 2022-05-13 天翼云科技有限公司 Resource allocation and acquisition method and device for multi-type service application


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIALE DENG: "Microservice Pre-Deployment Based on Mobility Prediction and Service Composition in Edge", IEEE *
QI WANG: "Deep reinforcement learning for transportation network combinatorial optimization: A survey", ELSEVIER *

Also Published As

Publication number Publication date
CN115174681B (en) 2023-12-15

Similar Documents

Publication Publication Date Title
KR102076257B1 (en) Calculation Graphs Processing
CN113950066A (en) Single server part calculation unloading method, system and equipment under mobile edge environment
WO2018068421A1 (en) Method and device for optimizing neural network
CN109983480A (en) Use cluster loss training neural network
WO2022063247A1 (en) Neural architecture search method and apparatus
CN113141317B (en) Streaming media server load balancing method, system, computer equipment and terminal
CN113434212A (en) Cache auxiliary task cooperative unloading and resource allocation method based on meta reinforcement learning
CN111176820A (en) Deep neural network-based edge computing task allocation method and device
CN112436992B (en) Virtual network mapping method and device based on graph convolution network
CN112764936A (en) Edge calculation server information processing method and device based on deep reinforcement learning
CN111049903A (en) Edge network load distribution algorithm based on application perception prediction
CN116263701A (en) Computing power network task scheduling method and device, computer equipment and storage medium
CN112860402A (en) Dynamic batch processing task scheduling method and system for deep learning inference service
CN113867843A (en) Mobile edge computing task unloading method based on deep reinforcement learning
CN116579418A (en) Privacy data protection method for model segmentation optimization under federal edge learning environment
CN114650321A (en) Task scheduling method for edge computing and edge computing terminal
CN116915869A (en) Cloud edge cooperation-based time delay sensitive intelligent service quick response method
CN117436485A (en) Multi-exit point end-edge-cloud cooperative system and method based on trade-off time delay and precision
CN116954866A (en) Edge cloud task scheduling method and system based on deep reinforcement learning
CN116700931A (en) Multi-target edge task scheduling method, device, equipment, medium and product
CN115174681B (en) Method, equipment and storage medium for scheduling edge computing service request
CN115345306A (en) Deep neural network scheduling method and scheduler
CN115220818A (en) Real-time dependency task unloading method based on deep reinforcement learning
CN115309521A (en) Marine unmanned equipment-oriented deep reinforcement learning task scheduling method and device
CN113762972A (en) Data storage control method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant