CN112667394B

CN112667394B - Computer resource utilization rate optimization method

Info

Publication number: CN112667394B
Application number: CN202011539370.4A
Authority: CN
Inventors: 蒋锴; 赵宇; 张政伟; 戴大伟; 徐瑞
Original assignee: CETC 28 Research Institute
Current assignee: CETC 28 Research Institute
Priority date: 2020-12-23
Filing date: 2020-12-23
Publication date: 2022-09-30
Anticipated expiration: 2040-12-23
Also published as: CN112667394A

Abstract

The invention provides a method for optimizing the utilization rate of computer resources, comprising: step 1, preprocessing the data; step 2, encoding the relation between task feature samples and the resource utilization rate with an LSTM neural network; step 3, learning the resource utilization rate sequence of real cluster operation data to generate a resource utilization rate prediction function; step 4, inputting the task running state from the test data, predicting with the resource utilization rate prediction function, comparing the predicted resource utilization rate with the resource utilization rate in the test data, and recording the computational overhead and the prediction errors, wherein the prediction errors comprise the mean square error, the mean absolute error and the standard deviation of the mean absolute error; and, after prediction is finished, optimizing the utilization rate of the computer resources. The method can be applied, for example, to optimizing the utilization rate of cloud resources. It offers high prediction accuracy and good optimization stability.

Description

Computer resource utilization rate optimization method

Technical Field

The invention relates to a method for optimizing the utilization rate of computer resources.

Background

Resource scheduling is an important research problem in cloud computing; its difficulty lies in predicting cluster resource usage accurately and in real time.

Many resource scheduling algorithms have been developed in recent years, but they have shortcomings: some predict resource usage too coarsely, and others rely on rule-based scheduling and therefore cannot predict resource usage in a real cluster. A resource utilization rate prediction method based on inverse reinforcement learning and an LSTM neural network is therefore provided. It uses real cluster data to predict how machine resource utilization changes with the task running condition, which makes the resource utilization rate easier to optimize.

Disclosure of Invention

Purpose of the invention: the technical problem to be solved is to provide, in view of the shortcomings of the prior art, a method for optimizing the utilization rate of computer resources, comprising the following steps:

step 1, extracting real data of cluster operation in a period of time, and performing data preprocessing;

step 2, encoding the relation between the task feature sample and the resource utilization rate by using an LSTM neural network;

step 3, learning the resource utilization rate sequence of the real cluster operation data by using an inverse reinforcement learning algorithm to generate a resource utilization rate prediction function, wherein the inverse reinforcement learning algorithm comprises a policy network and a discriminator network, the policy network is used for predicting the resource utilization rate, and the discriminator network is used for evaluating the quality of the policy network's predictions;

step 4, inputting the task running state in the test data, predicting by using the resource utilization rate prediction function, comparing the predicted resource utilization rate with the resource utilization rate in the test data, and recording the computational overhead and the prediction error, wherein the prediction error comprises the mean square error, the mean absolute error and the standard deviation of the mean absolute error. After training is finished, tasks are scheduled to computers with lower predicted resource utilization rates by a peak clipping and valley filling method, thereby optimizing the computer resource utilization rate.
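The peak clipping and valley filling idea in step 4 can be illustrated with a short greedy sketch. Everything in it is an assumption made for illustration: the function names `pick_machine` and `schedule`, the representation of utilization and task demand as fractions between 0 and 1, and the largest-task-first order. The fifty percent default threshold follows the wording of claim 1.

```python
from typing import Dict, Optional


def pick_machine(predicted_util: Dict[str, float], task_demand: float,
                 threshold: float = 0.5) -> Optional[str]:
    """Return the machine with the lowest predicted utilization that is below
    `threshold` and can still absorb `task_demand`; None if none qualifies."""
    candidates = {
        machine: util for machine, util in predicted_util.items()
        if util < threshold and util + task_demand <= 1.0
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


def schedule(tasks: Dict[str, float], predicted_util: Dict[str, float],
             threshold: float = 0.5) -> Dict[str, Optional[str]]:
    """Greedily place each task (largest demand first) on a 'valley' machine."""
    util = dict(predicted_util)  # work on a copy so the caller's data is untouched
    placement: Dict[str, Optional[str]] = {}
    for name, demand in sorted(tasks.items(), key=lambda kv: -kv[1]):
        target = pick_machine(util, demand, threshold)
        placement[name] = target
        if target is not None:
            util[target] += demand  # the valley fills up as tasks land on it
    return placement
```

A task that fits on no machine below the threshold is left unplaced (`None`); a real scheduler would need a policy for that case, which the text does not specify.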

The step 1 comprises the following steps:

step 1-1, extracting the original data in blocks: dividing cluster running data longer than one hour into two or more short segments covering consecutive time periods, following the timestamp order of the real cluster running data and using a set time length; the set time length is determined by the requirements of cluster resource utilization rate prediction;

step 1-2, packing the two or more short segments of running data into one complete running track, specifically, packing the task feature samples and the resource utilization rate sequences in the running data into state variables and action variables respectively;

step 1-3, setting the running track as a prediction environment that the algorithm can interact with: the algorithm receives a state from the prediction environment, acts on that state, and then obtains feedback from the prediction environment.
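Step 1-1 can be sketched as a simple windowing routine. This is a minimal illustration, not the patent's implementation: the record layout `(timestamp, utilization)`, the function name `split_into_segments`, and the choice to anchor windows at the first timestamp are all assumptions.

```python
from typing import List, Sequence, Tuple

Record = Tuple[int, float]  # assumed layout: (timestamp in seconds, utilization)


def split_into_segments(records: Sequence[Record],
                        segment_seconds: int) -> List[List[Record]]:
    """Sort records by timestamp and cut them into consecutive fixed-length windows."""
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")
    ordered = sorted(records, key=lambda r: r[0])
    if not ordered:
        return []
    start = ordered[0][0]
    segments: List[List[Record]] = []
    for rec in ordered:
        index = (rec[0] - start) // segment_seconds
        while len(segments) <= index:  # a gap in the data leaves an empty window
            segments.append([])
        segments[index].append(rec)
    return segments
```

For example, two hours of data sampled every 10 seconds and cut with `segment_seconds=1800` yields four segments of 180 records each.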

In step 1-2, the state variable $s_t^i$ is defined as:

$$s_t^i = \left(X_{t-m:t-1}^i,\; U_{t-m:t-1}^i\right)$$

wherein i is the machine number, t is the current time, m is the length of the historical data contained in the state variable, $X_{t-m:t-1}^i$ is the task running state on the i-th machine from t-m to t-1, and $U_{t-m:t-1}^i$ is the real resource utilization rate of the i-th machine from t-m to t-1;

the action variable $a_t^i$ is defined as:

$$a_t^i = \hat{U}_t^i$$

wherein $\hat{U}_t^i$ is the predicted resource utilization rate of machine i at time t.
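The state and action definitions, together with the prediction environment of step 1-3, can be sketched as follows. This is an illustrative sketch under several assumptions: the task running state is reduced to one number per time step, the recorded utilization at time t plays the role of the demonstrated action, and the feedback is the negative absolute error. The names `build_trajectory` and `PredictionEnv` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed state layout: (task states, real utilization) over the window t-m .. t-1
State = Tuple[Tuple[float, ...], Tuple[float, ...]]


def build_trajectory(task_state: Sequence[float], utilization: Sequence[float],
                     m: int) -> List[Tuple[State, float]]:
    """Pair each history window [t-m, t-1] with the utilization observed at time t."""
    if len(task_state) != len(utilization):
        raise ValueError("inputs must have equal length")
    pairs = []
    for t in range(m, len(utilization)):
        state = (tuple(task_state[t - m:t]), tuple(utilization[t - m:t]))
        pairs.append((state, utilization[t]))
    return pairs


class PredictionEnv:
    """Replays a trajectory: hands out a state, takes a prediction, returns feedback."""

    def __init__(self, trajectory: List[Tuple[State, float]]):
        self.trajectory = trajectory
        self.pos = 0

    def reset(self) -> State:
        self.pos = 0
        return self.trajectory[0][0]

    def step(self, predicted_util: float) -> Tuple[Optional[State], float, bool]:
        _, real_util = self.trajectory[self.pos]
        feedback = -abs(predicted_util - real_util)  # assumed feedback signal
        self.pos += 1
        done = self.pos >= len(self.trajectory)
        next_state = None if done else self.trajectory[self.pos][0]
        return next_state, feedback, done
```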

The step 2 comprises the following steps:

step 2-1, inputting task feature samples into the LSTM neural network, the task feature sample $X_{t-m:t-1}$ being defined as:

$$X_{t-m:t-1} = \left(J_{t-m}, J_{t-m+1}, \dots, J_{t-1}\right)$$

wherein $J_t$ is the set of tasks running at time t:

$$J_t = \{\mathrm{task}_1, \mathrm{task}_2, \dots\}$$

wherein $\mathrm{task}_1$ and $\mathrm{task}_2$ represent two different tasks;

the training target of the LSTM neural network is the resource utilization rate sequence over the past m moments, $U_{t-m:t-1}$;

step 2-2, training the LSTM neural network by back propagation with $U_{t-m:t-1}$ as the fitting target, iterating two or more times until the relative error between the output of the LSTM neural network and the true value is less than ten percent, and taking the output of the last hidden layer of the LSTM neural network as the encoding of the task feature sample.
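To make "taking out the last hidden layer output as the code" concrete, the following is a minimal plain-Python sketch of an LSTM forward pass that returns the final hidden state. It is not the network used in the patent (which reports Keras and TensorFlow in its experiments): the class `TinyLSTM`, its random untrained weights, and the helper `relative_error` (one possible reading of the ten percent stopping criterion, assuming non-zero targets) are illustrative assumptions, and training by back propagation is omitted.

```python
import math
import random
from typing import List, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Single-layer LSTM forward pass in plain Python (untrained, illustration only)."""

    GATES = ("i", "f", "o", "g")  # input, forget, output gates and candidate cell

    def __init__(self, input_size: int, hidden_size: int, seed: int = 0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        width = input_size + hidden_size
        # One weight matrix (hidden_size x width) and one bias vector per gate.
        self.w = {g: [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                      for _ in range(hidden_size)] for g in self.GATES}
        self.b = {g: [0.0] * hidden_size for g in self.GATES}

    def _affine(self, gate: str, z: Sequence[float]) -> List[float]:
        return [sum(wi * zi for wi, zi in zip(row, z)) + bias
                for row, bias in zip(self.w[gate], self.b[gate])]

    def encode(self, sequence: Sequence[Sequence[float]]) -> List[float]:
        """Run the sequence through the cell and return the last hidden state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in sequence:
            z = list(x) + h  # concatenate the input with the previous hidden state
            i = [sigmoid(v) for v in self._affine("i", z)]
            f = [sigmoid(v) for v in self._affine("f", z)]
            o = [sigmoid(v) for v in self._affine("o", z)]
            g = [math.tanh(v) for v in self._affine("g", z)]
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h


def relative_error(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Mean relative error; assumes every target value is non-zero."""
    return sum(abs(p - t) / abs(t) for p, t in zip(predicted, target)) / len(target)
```

In a training loop, `relative_error` would be evaluated after each iteration and training stopped once it falls below 0.1; the vector returned by `encode` is then used as the encoding of the task feature sample.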

The step 3 comprises the following steps:

step 3-1, initializing the parameters of a policy network and a discriminator network, wherein the policy network is the model that predicts the resource utilization rate, and the discriminator network is the model that judges how well the policy network fits; preparing real cluster data as the learning target of the policy network;

step 3-2, sampling predicted resource utilization rates from the policy network and real resource utilization rates from the real cluster data, and updating the parameters of the discriminator network by gradient descent according to the error between the predicted and real resource utilization rates; the discriminator network then serves as the cost function of the policy network;

step 3-3, training the policy network by a reinforcement learning method until the policy network converges, thereby generating the resource utilization rate prediction function (reference: Schulman J, Wolski F, Dhariwal P, et al. Proximal Policy Optimization Algorithms. 2017).
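The discriminator update in step 3-2 can be illustrated with a deliberately small sketch. It assumes a logistic discriminator over a single scalar (the utilization value) trained with binary cross-entropy, which is one common adversarial formulation and not necessarily the one used in the patent; the state input and the policy update of step 3-3 are omitted, and all function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def discriminator_loss(w: float, b: float, real: Sequence[float],
                       fake: Sequence[float]) -> float:
    """Binary cross-entropy: real samples should score near 1, predicted ones near 0."""
    eps = 1e-12
    loss = -sum(math.log(sigmoid(w * u + b) + eps) for u in real)
    loss -= sum(math.log(1.0 - sigmoid(w * u + b) + eps) for u in fake)
    return loss / (len(real) + len(fake))


def discriminator_step(w: float, b: float, real: Sequence[float],
                       fake: Sequence[float], lr: float = 0.5) -> Tuple[float, float]:
    """One gradient-descent step on the cross-entropy loss."""
    n = len(real) + len(fake)
    grad_w = grad_b = 0.0
    for u in real:  # derivative of -log(sigmoid(z)) w.r.t. z is sigmoid(z) - 1
        d = sigmoid(w * u + b) - 1.0
        grad_w += d * u
        grad_b += d
    for u in fake:  # derivative of -log(1 - sigmoid(z)) w.r.t. z is sigmoid(z)
        d = sigmoid(w * u + b)
        grad_w += d * u
        grad_b += d
    return w - lr * grad_w / n, b - lr * grad_b / n


def policy_cost(w: float, b: float, predicted: float) -> float:
    """Cost handed to the policy: low when the discriminator rates the prediction as real."""
    return -math.log(sigmoid(w * predicted + b) + 1e-12)
```

Repeated calls to `discriminator_step` lower `discriminator_loss`, after which `policy_cost` is smaller for predictions that resemble the real samples; a reinforcement learning method such as the cited Proximal Policy Optimization would then minimize this cost.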

The errors in step 4 are calculated as follows:

the mean square error MSE is calculated as:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(U_s^{(i)} - U_R^{(i)}\right)^2$$

the mean absolute error MAE is calculated as:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|U_s^{(i)} - U_R^{(i)}\right|$$

where n is the total number of predicted samples, $U_s^{(i)}$ is the predicted resource utilization rate of the i-th sample, and $U_R^{(i)}$ is the true resource utilization rate of the i-th sample.

The standard deviation of the mean absolute error is calculated in the same way as a population standard deviation. The computational overhead is obtained by timing the prediction algorithm over multiple predictions and taking the average.
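The three error measures can be computed as in the sketch below. The function name `prediction_errors` is hypothetical, and the reading of "standard deviation of the mean absolute error" as the population standard deviation of the per-sample absolute errors is an assumption.

```python
import statistics
from typing import Dict, Sequence


def prediction_errors(predicted: Sequence[float],
                      real: Sequence[float]) -> Dict[str, float]:
    """Return MSE, MAE and the population standard deviation of the absolute errors."""
    if len(predicted) != len(real) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    abs_err = [abs(p - r) for p, r in zip(predicted, real)]
    n = len(abs_err)
    return {
        "mse": sum(e * e for e in abs_err) / n,
        "mae": sum(abs_err) / n,
        "mae_std": statistics.pstdev(abs_err),  # population standard deviation
    }
```

For predictions `[0.5, 0.7]` against true values `[0.4, 0.9]`, the absolute errors are about 0.1 and 0.2, giving an MSE of about 0.025, an MAE of about 0.15 and a standard deviation of about 0.05.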

Compared with the prior art, the invention has the following notable advantage: unlike other prediction algorithms, the cloud resource utilization rate prediction algorithm based on inverse reinforcement learning learns from real cluster data, so its predictions better match the resource utilization rate sequences seen in a production environment. Accurate real-time prediction thus helps to optimize computer resource utilization and makes it easier to adjust the resource scheduling strategy in real time.

Drawings

The foregoing and/or other advantages of the invention will become further apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings.

FIG. 1 is a schematic workflow of the method of the present invention;

FIG. 2 is a plot of the mean square error of predictions made by the method of the present invention on the Alibaba and Google datasets;

FIG. 3 is a plot of the mean absolute error of predictions made by the method of the present invention on the Alibaba and Google datasets;

FIG. 4 is a plot of the standard deviation of the mean absolute error of predictions made by the method of the present invention on the Alibaba and Google datasets;

FIG. 5 is a plot of the computational overhead of predictions made by the method of the present invention on the Alibaba and Google datasets.

Detailed Description

The invention discloses a computer resource utilization rate optimization method, applied to predicting resource utilization in a real cloud environment.

As shown in FIG. 1, the present invention provides a method for optimizing the utilization rate of computer resources, which comprises the following steps:

step 1, extracting real data of cluster operation within a period of time, and performing data preprocessing. The data preprocessing comprises three parts: data blocking, running-track packaging and prediction-environment setup. Data blocking divides a long stretch of cluster running data into short segments covering several consecutive time periods, following the timestamp order of the data and using a set time length, which is determined by the requirements of cluster resource utilization rate prediction. Running-track packaging packs the task running information and the resource utilization rate information into state and action variables. Prediction-environment setup turns the state-action sequence into an environment that the algorithm can interact with: the algorithm receives a state from the prediction environment, acts on that state, and then obtains feedback from the prediction environment.

Step 2, encoding the relation between the task running state and the resource utilization rate by using an LSTM neural network. The task running state comprises the resources requested by the task, its start time and its end time. The resource utilization rate sequence is used as the target output of the LSTM neural network, and the network is optimized by back propagation. Training is iterated until the network's fit to the resource utilization rate has converged. The output of the last hidden layer of the LSTM neural network is then taken as the encoding of the current task running state.

Step 3, learning the real resource utilization rate sequence by using an algorithm based on inverse reinforcement learning. The algorithm first learns a resource utilization rate cost function for the real environment through the inverse reinforcement learning process, and then performs reinforcement learning against that cost function to generate a resource utilization rate prediction function. The prediction function maps the cluster state to the cluster resource utilization rate, so that the algorithm can predict the resource utilization rate of a real cluster. The inverse reinforcement learning comprises a policy network and a discriminator network: the policy network predicts the resource utilization rate, and the discriminator network evaluates the policy network's predictions.

Step 4, inputting the task running state in the test data, predicting by using the resource utilization rate prediction function generated by the policy network, comparing the predicted resource utilization rate with the resource utilization rate in the test data, and recording the computational overhead and the prediction error, wherein the prediction error comprises the mean square error, the mean absolute error and the standard deviation of the mean absolute error. After training is finished, the resource utilization rate is optimized by a peak clipping and valley filling method, namely by scheduling tasks to computers with lower predicted resource utilization rates.

The step 1 comprises the following steps:

step 1-1, extracting the original data in blocks: dividing cluster operation data longer than one hour into two or more short segments covering consecutive time periods, following the timestamp order of the real cluster operation data and using a set time length; the set time length is determined by the requirements of cluster resource utilization rate prediction;

The step 2 comprises the following steps:

step 2-1, inputting task feature samples into the LSTM neural network, the task feature sample $X_{t-m:t-1}$ being defined as:

$$X_{t-m:t-1} = \left(J_{t-m}, J_{t-m+1}, \dots, J_{t-1}\right)$$

wherein $J_t$ is the set of tasks running at time t.

The training target of the LSTM neural network is the resource utilization rate sequence over the past m moments, $U_{t-m:t-1}$.

Step 2-2, training the LSTM neural network by back propagation with $U_{t-m:t-1}$ as the fitting target, iterating multiple times until the relative error between the output of the LSTM neural network and the true value is less than ten percent, and taking the output of the last hidden layer of the LSTM neural network as the encoding of the task feature sample.

The step 3 comprises the following steps:

step 3-1, initializing the parameters of the policy network and the discriminator network, wherein the policy network is the model that predicts the resource utilization rate, and the discriminator network is the model that judges how well the policy network fits. Real cluster data is prepared as the learning target of the policy network.

Step 3-2, sampling predicted resource utilization rates from the policy network and real resource utilization rates from the real cluster data. The parameters of the discriminator network are updated according to the difference between the predicted and real resource utilization rates, and the discriminator network then serves as the loss function of the policy network.

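The errors in step 4 are calculated as follows:

the mean square error is calculated as:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(U_s^{(i)} - U_R^{(i)}\right)^2$$

the mean absolute error is calculated as:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|U_s^{(i)} - U_R^{(i)}\right|$$

where n is the total number of predicted samples, $U_s^{(i)}$ is the predicted resource utilization rate of the i-th sample, and $U_R^{(i)}$ is the true resource utilization rate of the i-th sample.

The standard deviation of the mean absolute error is calculated in the same way as a population standard deviation. The computational overhead is the time the prediction algorithm needs to complete one prediction. The mean absolute error, its standard deviation and the computational overhead are shown in FIGS. 3, 4 and 5 respectively.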
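Measuring the time per prediction can be sketched with the standard library timer. The helper `mean_prediction_time` and its `predict` callback are hypothetical stand-ins for whatever prediction function is being measured.

```python
import time
from typing import Any, Callable, Sequence


def mean_prediction_time(predict: Callable[[Any], Any], states: Sequence[Any],
                         repeats: int = 1) -> float:
    """Average wall-clock seconds per call of `predict` over all states and repeats."""
    if not states or repeats < 1:
        raise ValueError("need at least one state and one repeat")
    start = time.perf_counter()
    for _ in range(repeats):
        for state in states:
            predict(state)
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(states))
```

Averaging over several repeats reduces the influence of timer resolution and one-off delays on the reported overhead.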

Example:

The mean square error results are shown in FIG. 2. The experiment was performed on a local cluster of five physical servers, each equipped with two 4-core Intel(R) Xeon(R) E5-2650 processors, four GeForce RTX 2080 Ti GPUs, 16 GB of memory and 1 TB of disk space, and running the software environment PyCharm 2018.1.4, Python 3.6, Keras 2.3.1 and TensorFlow 2.2.0.

To evaluate the resource utilization rate predictor based on inverse reinforcement learning and the LSTM neural network, three leading algorithms were used for comparison: a resource prediction algorithm based on a sparse autoencoder and a recurrent neural network (L-PAW), a resource prediction algorithm based on a recurrent neural network (RNN), and a resource prediction algorithm based on the Bayesian information criterion (BIC-E). Each is briefly described below.

The L-PAW algorithm encodes the resource utilization rate with a sparse autoencoder and then feeds it into a recurrent neural network for learning.

The RNN algorithm feeds the resource utilization rate into a recurrent neural network for learning.

The BIC-E algorithm selects the optimal statistical learning method according to the Bayesian information criterion and uses it to predict the resource utilization rate.

The data sets used in the experiment were the Alibaba cluster operation data set from 2018 and the Google cluster operation data set from 2011. Each data set contains the resource allocation information, task running states and resource usage of every machine, sampled at 10-second intervals. The experiment builds its data set mainly from the task running states and resource allocation information of each machine at each time point, and uses the invention to predict the resource utilization rate.

Step 1, extracting real data of each cluster's operation within a period of time, and performing data preprocessing. The data preprocessing comprises three parts: data blocking, running-track packaging and prediction-environment setup. Data blocking divides a long stretch of cluster running data into short segments covering several consecutive time periods, following the timestamp order of the data and using a set time length, which is determined by the requirements of cluster resource utilization rate prediction. Running-track packaging packs the task running information and the resource utilization rate information into state and action variables. Prediction-environment setup turns the state-action sequence into an environment that the algorithm can interact with: the algorithm receives a state from the prediction environment, acts on that state, and then obtains feedback from the prediction environment.

Step 2, encoding the relation between the task running state and the resource utilization rate by using an LSTM neural network. The task running state comprises the resources requested by the task, its start time and its end time. The resource utilization rate sequence is used as the target output of the LSTM neural network, and the network is optimized by back propagation. Training is iterated until the network's fit to the resource utilization rate has converged. The output of the last hidden layer of the LSTM neural network is then taken as the encoding of the current task running state.

Step 3, learning the real resource utilization rate sequence by using an inverse reinforcement learning algorithm. The algorithm first learns a resource utilization rate cost function for the real environment through the inverse reinforcement learning process, and then performs reinforcement learning against that cost function to generate a resource utilization rate prediction function. The prediction function maps the cluster state to the cluster resource utilization rate, so that the algorithm can predict the resource utilization rate of a real cluster. The inverse reinforcement learning comprises a policy network and a discriminator network. The policy network predicts the resource utilization rate: it takes the task running state as input and outputs the predicted resource utilization rate. The discriminator network evaluates the quality of that prediction: it takes the task running state and the predicted resource utilization rate as input and outputs a score, which is larger the better the prediction and smaller the worse the prediction.

Step 4, inputting the task running state in the test data, predicting by using the resource utilization rate prediction function generated by the policy network, comparing the predicted resource utilization rate with the resource utilization rate in the test data, and recording the computational overhead and the prediction error, wherein the prediction error comprises the mean square error, the mean absolute error and the standard deviation of the mean absolute error. After training is finished, the resource utilization rate is optimized by a peak clipping and valley filling method, namely by scheduling tasks to computers with lower predicted resource utilization rates.

The step 1 comprises the following steps:

step 1-1, extracting the original Alibaba and Google data in blocks: dividing cluster operation data longer than one hour into two or more short segments covering consecutive time periods, following the timestamp order of the real cluster operation data and using a set time length; the set time length is determined by the requirements of cluster resource utilization rate prediction;

The step 2 comprises the following steps:

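step 2-1, inputting task feature samples into the LSTM neural network, the task feature sample $X_{t-m:t-1}$ being defined as:

$$X_{t-m:t-1} = \left(J_{t-m}, J_{t-m+1}, \dots, J_{t-1}\right)$$

wherein $J_t$ is the set of tasks running at time t.

Step 2-2, training the LSTM neural network by back propagation with the resource utilization rate sequence over the past m moments, $U_{t-m:t-1}$, as the fitting target, in the manner described above.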

The step 3 comprises the following steps:

Step 3-2, sampling predicted resource utilization rates from the policy network and real resource utilization rates from the real cluster data. The parameters of the discriminator network are updated according to the difference between the predicted and real resource utilization rates, and the discriminator network then serves as the loss function of the policy network.

The errors in step 4 are calculated as follows:

the mean square error is calculated as:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(U_s^{(i)} - U_R^{(i)}\right)^2$$

the mean absolute error is calculated as:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|U_s^{(i)} - U_R^{(i)}\right|$$

where n is the total number of predicted samples, $U_s^{(i)}$ is the predicted resource utilization rate of the i-th sample, and $U_R^{(i)}$ is the true resource utilization rate of the i-th sample.

The standard deviation of the mean absolute error is calculated in the same way as a population standard deviation. The computational overhead is the time the prediction algorithm needs to complete one prediction.

The experimental results show that for short prediction lengths the method performs about the same as the prior art, but as the prediction length increases, its prediction accuracy and stability become clearly superior to those of traditional resource utilization rate prediction methods.

Compared with the prior art, the invention has the following notable advantage: compared with existing resource utilization rate prediction methods, the method based on inverse reinforcement learning and the LSTM neural network improves prediction accuracy and stability, and can predict the resource utilization rate even when no input is provided.

In a specific implementation, the present invention further provides a computer storage medium, wherein the computer storage medium may store a program which, when executed, may perform some or all of the steps of the embodiments of the method of the present invention. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM) or a random access memory (RAM).

The present invention provides a method for optimizing computer resource utilization, and there are many methods and approaches for implementing this technical solution. The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make a number of modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention. All components not specified in the present embodiment can be realized by the prior art.

Claims

1. A method for optimizing the utilization rate of computer resources is characterized by comprising the following steps:

step 1, extracting real data of cluster operation within a period of time, and performing data preprocessing;

step 4, inputting the task running state in the test data, predicting by using the resource utilization rate prediction function, comparing the predicted resource utilization rate with the resource utilization rate in the test data, and recording the computational overhead and the prediction errors, wherein the prediction errors comprise the mean square error, the mean absolute error and the standard deviation of the mean absolute error; after prediction is finished, scheduling tasks to computers whose predicted resource utilization rate is lower than fifty percent by a peak clipping and valley filling method, so as to optimize the computer resource utilization rate;

the step 1 comprises the following steps:

step 1-1, extracting the original data in blocks: dividing cluster running data longer than one hour into two or more short segments covering consecutive time periods, following the timestamp order of the real cluster running data and using a set time length; the set time length is determined by the requirements of cluster resource utilization rate prediction;

step 1-3, setting the running track as a prediction environment capable of interacting with an algorithm, wherein the prediction environment refers to an environment interacting with the algorithm, the algorithm receives a given state of the prediction environment, acts on the current state, and then obtains feedback of the prediction environment;

in step 1-2, the state variable $s_t^i$ is defined as:

$$s_t^i = \left(X_{t-m:t-1}^i,\; U_{t-m:t-1}^i\right)$$

wherein i is the machine number, t is the current time, m is the length of the historical data contained in the state variable, $X_{t-m:t-1}^i$ is the task running state on the i-th machine from t-m to t-1, and $U_{t-m:t-1}^i$ is the real resource utilization rate of the i-th machine from t-m to t-1;

the action variable $a_t^i$ is defined as:

$$a_t^i = \hat{U}_t^i$$

wherein $\hat{U}_t^i$ is the predicted resource utilization rate of machine i at time t.

2. The method of claim 1, wherein the step 2 comprises:

the task feature sample $X_{t-m:t-1}$ is defined as follows:

$$X_{t-m:t-1} = \left(J_{t-m}, J_{t-m+1}, \dots, J_{t-1}\right)$$

wherein $J_t$ is the set of tasks running at time t:

$$J_t = \{\mathrm{task}_1, \mathrm{task}_2, \dots\}$$

wherein $\mathrm{task}_1$ and $\mathrm{task}_2$ represent two different tasks;

step 2-2, training the LSTM neural network by back propagation with the resource utilization rate sequence over the past m moments, $U_{t-m:t-1}$, as the fitting target, iterating two or more times until the relative error between the output of the LSTM neural network and the true value is less than ten percent, and taking the output of the last hidden layer of the LSTM neural network as the encoding of the task feature sample.

3. The method of claim 2, wherein step 3 comprises:

step 3-3, training the policy network by a reinforcement learning method until the policy network converges, thereby generating the resource utilization rate prediction function.

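4. The method of claim 3, wherein the errors in step 4 are calculated as follows:

the mean square error MSE is calculated as:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(U_s^{(i)} - U_R^{(i)}\right)^2$$

the mean absolute error MAE is calculated as:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|U_s^{(i)} - U_R^{(i)}\right|$$

where n is the total number of predicted samples, $U_s^{(i)}$ is the predicted resource utilization rate of the i-th sample, and $U_R^{(i)}$ is the true resource utilization rate of the i-th sample.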