CN113220450B - Load prediction method, resource scheduling method and device for cloud-side multi-data center - Google Patents
- Publication number: CN113220450B (application CN202110473131.1A)
- Authority: CN (China)
- Prior art keywords: load, input, data, vector, prediction
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F9/505 — Allocation of resources (e.g. of the CPU) to service a request, the resource being a machine (e.g. CPUs, servers, terminals), considering the load
- G06F9/5077 — Logical partitioning of resources; management or configuration of virtualized resources
- G06N3/044 — Recurrent networks, e.g. Hopfield networks
- G06N3/047 — Probabilistic or stochastic networks
- G06N3/048 — Activation functions
- G06N3/08 — Learning methods
- Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a load prediction method for cloud multi-data centers, comprising the following steps: obtaining a log file that records the resource usage of a virtual machine at each time point, and extracting the required feature-quantity data and historical load data from it; converting the feature-quantity data and historical load data into the corresponding input feature sequences and historical load vector; computing the nonlinear component of the load prediction with a pre-built neural network model; computing the linear component of the load prediction with a pre-built autoregressive model; and integrating the nonlinear and linear components of the load prediction to obtain the final load prediction result. By jointly considering the time-varying linear trend and the nonlinear characteristics of the load sequence in a multi-data-center environment, and by combining a neural network model with the statistical learning method of an autoregressive model, the invention can effectively improve the accuracy of future load prediction.
Description
Technical Field
The invention relates to a load prediction method for cloud multi-data centers, and further to a resource scheduling method for cloud multi-data centers, belonging to the technical fields of cloud computing and data mining.
Background
The growing computing demand driven by cloud computing has fueled the continuous expansion of cloud data centers: by the end of 2022, the domestic data center business market is expected to grow to 320 billion yuan, and its structure is shifting from single data centers to multiple data centers in the cloud. To achieve green, energy-saving development, a data center must be able to dynamically adjust its internal resource allocation and consolidate computing resources to provide elastic services. Effective load prediction is a prerequisite for elastic resource allocation and provides a decision-making reference for high-quality scaling schemes. It is therefore important to build a load prediction model suited to the current cloud multi-data-center environment that reflects the characteristics of the load.
A workload is a time-series problem closely tied to its context, yet most load prediction research remains confined to a single data center. Compared with a traditional single data center, multiple data centers offer stronger resource scheduling and scaling capabilities and more flexible service deployment, and their load changes are driven more strongly by user behavior. As a result, the computing resources each center requires change dynamically, and the load fluctuation trends of different data centers differ considerably. Load prediction for cloud multi-data centers therefore has to be modeled and analyzed according to the specific situation of each data center, and cannot rely solely on existing single-data-center algorithms.
In view of this, it is necessary to propose a load prediction method and a resource allocator system for cloud multi-data centers to solve the above problems.
Summary of the Invention
The purpose of the present invention is to overcome the deficiencies of the prior art by providing a load prediction method for cloud multi-data centers that combines a neural network model with an autoregressive model to predict the nonlinear and linear components of future load changes on a server, thereby improving the accuracy of the load prediction results.
To solve the above technical problems, the technical solutions of the present invention are as follows.
In a first aspect, the present invention provides a load prediction method for cloud multi-data centers, comprising the following steps:
obtaining a log file that records the resource usage of a virtual machine at each time point, extracting the required feature-quantity data and historical load data from it, and converting the feature-quantity data and historical load data into the corresponding input feature sequences and historical load vector;
computing the nonlinear component of the load prediction with a pre-built neural network model, based on the obtained input feature sequences and historical load vector;
computing the linear component of the load prediction with a pre-built autoregressive model, based on the obtained historical load vector; and
integrating the nonlinear and linear components of the load prediction to obtain the final load prediction result.
Optionally, the feature quantities include: sampling/recording time, the virtual machine's configured number of cores, CPU capacity, the virtual machine's configured memory capacity, the virtual machine's effective memory usage, disk read throughput, disk write throughput, network receive throughput, and network transmit throughput. The load refers to the effective CPU usage.
Optionally, converting the feature-quantity data into the corresponding input feature sequences comprises:

segmenting the feature-quantity data with a sliding window to form time series with a fixed time step, which serve as the input feature sequences.
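The sliding-window segmentation above can be sketched as follows. This is a minimal illustration; the window length, array layout, and function name are choices made here, not taken from the patent:

```python
import numpy as np

def sliding_windows(features, window=12):
    """Split a (T, n) feature-quantity array into overlapping windows of
    fixed time step `window`, giving an array of shape (T-window+1, window, n)."""
    T = features.shape[0]
    return np.stack([features[i:i + window] for i in range(T - window + 1)])

# toy example: 20 time points, 3 feature quantities
data = np.arange(60, dtype=float).reshape(20, 3)
seqs = sliding_windows(data, window=12)
print(seqs.shape)  # (9, 12, 3)
```

Each window is one fixed-length input feature sequence; successive windows overlap by `window - 1` time points, which is what lets the model see every transition in the series.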
Optionally, the neural network model comprises an encoder, a decoder, and a multilayer perceptron network, wherein the input of the encoder is the collected input feature sequences, the input of the decoder is the adaptively extracted input feature sequence output by the encoder above it, and the input of the multilayer perceptron network is the context vector output by the decoder.
Optionally, computing the nonlinear component of the load prediction with the pre-built neural network model, based on the obtained input feature sequences and historical load vector, comprises the following.
The encoder module contains an input attention layer, a softmax layer, and an LSTM neural network layer, where the output of each layer is the input of the next. In the encoder's recurrent update, the hidden state $h_{t-1} \in \mathbb{R}^m$ and cell state $s_{t-1}$ output by the previous LSTM unit are taken as input parameters, and one update step is written as

$$h_t = f_1(h_{t-1}, X_t)$$

where $f_1$ denotes the LSTM unit, $\mathbb{R}^m$ is the $m$-dimensional real vector space, and $X_t$ is the input feature sequence at time $t$.
Within the LSTM unit, the computation at each time step is:

$$f_t = \mathrm{sigmoid}(W_f r_t + U_f h_{t-1} + b_f)$$
$$i_t = \mathrm{sigmoid}(W_i r_t + U_i h_{t-1} + b_i)$$
$$o_t = \mathrm{sigmoid}(W_o r_t + U_o h_{t-1} + b_o)$$
$$\tilde{c}_t = \tanh(W_c r_t + U_c h_{t-1} + b_c)$$
$$s_t = f_t \odot s_{t-1} + i_t \odot \tilde{c}_t$$
$$h_t = o_t \odot \tanh(s_t)$$

where $f_t$, $i_t$, $o_t$ denote the forget gate, input gate, and output gate, respectively; the weight matrices applied to $r_t$ and to $h_{t-1}$, stacked over the four gates, lie in $\mathbb{R}^{4m \times d_r}$ and $\mathbb{R}^{4m \times m}$, respectively; $m$ is the hidden layer dimension and $d_r$ is the vector dimension of the input $r_t$; $r_t$ is the input at time $t$; $h_{t-1}$ is the hidden state output at time $t-1$; $\tilde{c}_t$ is the candidate cell state output by the neural network model at the current time; and sigmoid and tanh denote the respective activation functions.
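The LSTM gate update described above can be exercised with a small NumPy sketch. The stacked-weight layout (rows ordered forget, input, output, candidate) and all variable names are assumptions made for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(r_t, h_prev, s_prev, W_r, W_h, b):
    """One LSTM update. W_r: (4m, d_r) weights on the input r_t;
    W_h: (4m, m) weights on h_{t-1}; rows are stacked in the order
    [forget; input; output; candidate]."""
    m = h_prev.shape[0]
    z = W_r @ r_t + W_h @ h_prev + b
    f_t = sigmoid(z[0*m:1*m])           # forget gate
    i_t = sigmoid(z[1*m:2*m])           # input gate
    o_t = sigmoid(z[2*m:3*m])           # output gate
    c_tilde = np.tanh(z[3*m:4*m])       # candidate cell state
    s_t = f_t * s_prev + i_t * c_tilde  # new cell state
    h_t = o_t * np.tanh(s_t)            # new hidden state
    return h_t, s_t

rng = np.random.default_rng(0)
m, d_r = 4, 3
h, s = np.zeros(m), np.zeros(m)
h, s = lstm_step(rng.normal(size=d_r), h, s,
                 rng.normal(size=(4*m, d_r)), rng.normal(size=(4*m, m)),
                 np.zeros(4*m))
print(h.shape, s.shape)  # (4,) (4,)
```

Because the hidden state is the output gate times a tanh, every component of `h` stays inside (-1, 1), which keeps the recurrence numerically stable over long sequences.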
The input attention mechanism then yields the weight corresponding to each input feature sequence:

$$e_t^k = v_e^{\top}\tanh\big(W_e[h_{t-1}; s_{t-1}] + U_e X^k\big)$$
$$\alpha_t^k = \frac{\exp(e_t^k)}{\sum_{i=1}^{n}\exp(e_t^i)}$$

where $e_t^k$ is an intermediate variable with no specific physical meaning; $v_e \in \mathbb{R}^T$, $W_e \in \mathbb{R}^{T \times 2m}$, and $U_e \in \mathbb{R}^{T \times T}$ are the parameters to be learned in the attention mechanism; tanh denotes the hyperbolic tangent activation function and exp the exponential function. The computation of $\alpha_t^k$ is carried out by the softmax layer.
From the attention weights, the adaptively extracted input feature sequence is obtained:

$$\tilde{x}_t = (\alpha_t^1 x_t^1, \alpha_t^2 x_t^2, \ldots, \alpha_t^n x_t^n)^{\top}$$

so that the hidden state of the LSTM unit is updated as

$$h_t = f_1(h_{t-1}, \tilde{x}_t)$$

where $\tilde{x}_t$ denotes the adaptive input feature sequence at time $t$. This lets the encoder selectively focus on the more important, relevant input features instead of treating all input feature sequences equally, thereby uncovering the interdependencies among the feature sequences.
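The input attention computation above (scores, softmax, weighted input) can be sketched as follows; the parameter names and the toy dimensions are illustrative assumptions, not the patent's:

```python
import numpy as np

def input_attention(x_seqs, h_prev, s_prev, v_e, W_e, U_e):
    """Input attention over n driving feature series.
    x_seqs: (n, T) feature sequences; h_prev, s_prev: (m,) encoder states."""
    hs = np.concatenate([h_prev, s_prev])          # [h_{t-1}; s_{t-1}], shape (2m,)
    scores = np.array([v_e @ np.tanh(W_e @ hs + U_e @ x_k) for x_k in x_seqs])
    alpha = np.exp(scores) / np.exp(scores).sum()  # softmax over the n features
    return alpha

rng = np.random.default_rng(1)
n, T, m = 5, 8, 4
alpha = input_attention(rng.normal(size=(n, T)), np.zeros(m), np.zeros(m),
                        rng.normal(size=T), rng.normal(size=(T, 2*m)),
                        rng.normal(size=(T, T)))
x_t = rng.normal(size=n)
x_tilde = alpha * x_t   # adaptively weighted input features at time t
print(round(alpha.sum(), 6))  # 1.0
```

The softmax guarantees the weights are positive and sum to one, so the adaptive input is a convex reweighting of the raw features rather than an unconstrained transform.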
In the decoder module, a temporal attention mechanism is used in the decoding layer to adaptively select the relevant hidden states, because in the traditional encoder-decoder model the representational capacity drops and the model's performance deteriorates rapidly when the input sequence becomes too long.
Similar to the method in the encoder module, the attention mechanism in the decoder also takes the hidden state $d_{t-1}$ and cell state $s'_{t-1} \in \mathbb{R}^z$ output by the previous LSTM unit as input parameters, where $z$ denotes the dimension of the hidden layer in the encoder and $\mathbb{R}^z$ the $z$-dimensional real vector space. The importance weights $\beta_t^k$ are derived by the same formula as the attention weights $\alpha_t^k$ of the input feature sequences; $\beta_t^k$ represents the importance of the $k$-th encoder hidden state $h_k$ for the final prediction. The decoder then sums all encoder hidden states according to these weights to obtain the context vector:

$$c_t = \sum_{k=1}^{T} \beta_t^k h_k$$
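The temporal attention step can be sketched in the same style; the parameter names `v_d`, `W_d`, `U_d` and the toy dimensions are illustrative, not taken from the patent:

```python
import numpy as np

def temporal_attention(enc_hidden, d_prev, s_prev, v_d, W_d, U_d):
    """Temporal attention over the T encoder hidden states.
    enc_hidden: (T, z); d_prev, s_prev: previous decoder hidden/cell states."""
    ds = np.concatenate([d_prev, s_prev])                  # [d_{t-1}; s'_{t-1}]
    scores = np.array([v_d @ np.tanh(W_d @ ds + U_d @ h_k) for h_k in enc_hidden])
    beta = np.exp(scores) / np.exp(scores).sum()           # softmax over T steps
    c_t = beta @ enc_hidden                                # context vector, (z,)
    return beta, c_t

rng = np.random.default_rng(3)
T, z, p = 8, 4, 3
beta, c_t = temporal_attention(rng.normal(size=(T, z)),
                               np.zeros(p), np.zeros(p),
                               rng.normal(size=z), rng.normal(size=(z, 2*p)),
                               rng.normal(size=(z, z)))
print(beta.shape, c_t.shape)  # (8,) (4,)
```

The context vector is simply the beta-weighted average of the encoder's hidden states, so the decoder can revisit any earlier time step rather than relying on the final state alone.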
Combining the historical load data $\{L_{T-P}, L_{T-P+1}, \ldots, L_{T-1}\}$ with the obtained context vector, vector concatenation followed by a linear transformation yields the adaptively extracted decoder-layer input:

$$\tilde{y}_{t-1} = \tilde{w}^{\top}[L_{t-1}; c_{t-1}] + \tilde{b}$$

where $[L_{t-1}; c_{t-1}] \in \mathbb{R}^{m+1}$ is the concatenation of the load at time $t-1$ with the computed context vector, $m$ is the decoder hidden-layer dimension mentioned above, $\mathbb{R}^{m+1}$ is the $(m+1)$-dimensional real vector space, and $\tilde{w}$ and $\tilde{b}$ are the parameters to be learned in the linear transformation. The computed $\tilde{y}_{t-1}$ is then used to update the decoder hidden state:

$$d_t = f_2(d_{t-1}, \tilde{y}_{t-1})$$
where $f_2$ is the nonlinear activation function of an LSTM unit whose update computation is identical to that of $f_1$.
The output layer consists of a multilayer perceptron. It takes the final hidden states $\{d_{T-P}, d_{T-P+1}, \ldots, d_{T-1}\}$ output by the encoder-decoder model as input and produces the final model prediction through a three-layer perceptron network, where PReLU is used as the activation function in the first two layers:

$$f_3 = \max(\mu d_t, d_t)$$

where $f_3$ denotes the PReLU activation function and $\mu$ is a parameter that is updated only during training. The PReLU activation function avoids the problem of some parameters never being updated. The activation function of the last perceptron layer is the sigmoid function, which ensures that the prediction results are confined to a reasonable range.
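A minimal sketch of the three-layer output head described above; the hidden widths, weights, and the value of $\mu$ are illustrative assumptions:

```python
import numpy as np

def prelu(x, mu=0.25):
    # f3 = max(mu*x, x): identity for positive inputs, slope mu for negative ones
    return np.maximum(mu * x, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlp_head(d, W1, b1, W2, b2, w3, b3, mu=0.25):
    """Three-layer perceptron head: PReLU on the first two layers,
    sigmoid on the last so the prediction stays inside (0, 1)."""
    h1 = prelu(W1 @ d + b1, mu)
    h2 = prelu(W2 @ h1 + b2, mu)
    return sigmoid(w3 @ h2 + b3)

rng = np.random.default_rng(2)
m = 6
y = mlp_head(rng.normal(size=m),
             rng.normal(size=(8, m)), np.zeros(8),
             rng.normal(size=(8, 8)), np.zeros(8),
             rng.normal(size=8), 0.0)
print(0.0 < y < 1.0)  # True
```

Unlike ReLU, PReLU keeps a small gradient ($\mu$) on the negative side, which is why parameters fed only negative pre-activations can still be updated.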
Finally, after $T$ time steps, the nonlinear part of the load prediction result is obtained:

$$\hat{L}_T^{non} = F_{non}(L_{T-P}, \ldots, L_{T-1}, X_{T-P}, \ldots, X_T) = v_y^{\top}\big(W_y[d_T; c_T] + b_w\big) + u_w$$

where $\hat{L}_T^{non}$ denotes the nonlinear part of the load prediction after $T$ time steps; $T$ is the time-step size and the superscript $non$ marks the nonlinear component; $\{L_{T-P}, \ldots, L_{T-1}\}$ denotes the loads of the $P-1$ time steps before time step $T$, and $\{X_{T-P}, \ldots, X_T\}$ denotes the input feature sequences of the $P$ time steps before time step $T$. $[d_T; c_T] \in \mathbb{R}^{z+m}$ is the concatenation of the hidden state and the context vector, $\mathbb{R}^{z+m}$ being the $(z+m)$-dimensional real vector space. The parameters $W_y$ and $b_w$ map the concatenated vector to the size of the decoder hidden state, where $W_y$ is the nonlinear mapping weight applied to the concatenated vector in the decoder computation and $b_w$ the bias value, this computation being implemented by a computer program; likewise $v_y$ and $u_w$ denote the nonlinear mapping weight and bias value of the output layer. $F_{non}$ denotes the neural network prediction function of the above process, and its result is the nonlinear component of the final prediction.
Optionally, the specific calculation formula of the autoregressive model is:

$$\hat{L}_T^{linear} = F_{linear}(L) = \sum_{t=T-P}^{T-1}\lambda_t L_t + \varepsilon_T$$

where $\{L_{T-P}, L_{T-P+1}, \ldots, L_{T-1}\}$ is the historical data, $\varepsilon_T$ is a random disturbance variable, and $\lambda_t$ is the weight corresponding to each time step; both kinds of variables are initialized and automatically updated in the design of the autoregressive model. $F_{linear}$ denotes the autoregressive prediction function of the above process, and its result is the linear component of the final prediction.
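The linear component reduces to a weighted dot product over the load history. In the sketch below, the weights, history values, the stand-in nonlinear component, and the additive integration of the two components are all illustrative assumptions:

```python
import numpy as np

def ar_linear(history, lam, eps=0.0):
    """Linear component: weighted sum of the last P loads plus a disturbance term."""
    return float(np.dot(lam, history) + eps)

history = np.array([0.42, 0.45, 0.44, 0.47, 0.50])   # {L_{T-P}, ..., L_{T-1}}
lam = np.array([0.05, 0.10, 0.15, 0.25, 0.45])       # per-step weights lambda_t
l_linear = ar_linear(history, lam)
l_non = 0.03                 # stand-in for the neural network's nonlinear component
l_pred = l_linear + l_non    # integration of the two components (sum assumed here)
print(round(l_linear, 4))  # 0.4745
```

In practice the weights would be fitted to the data rather than fixed; the sketch only shows how the two components combine into one prediction.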
In a second aspect, the present invention also provides a load prediction device for cloud multi-data centers, comprising:

a data processing module for obtaining a log file that records the resource usage of a virtual machine at each time point, extracting the required feature-quantity data and historical load data from it, and converting them into the corresponding input feature sequences and historical load vector;

a nonlinear component prediction module for computing the nonlinear component of the load prediction with a pre-built neural network model, based on the obtained input feature sequences and historical load vector;

a linear component prediction module for computing the linear component of the load prediction with a pre-built autoregressive model, based on the obtained historical load vector; and

a prediction result calculation module for integrating the nonlinear and linear components of the load prediction to obtain the final load prediction result.
In a third aspect, the present invention also provides a resource scheduling method for cloud multi-data centers, comprising the following steps:

computing, with the method described above, the load prediction results for the virtual machines on each server of the cluster in a cloud multi-data-center environment; and

generating the corresponding resource scheduling policy based on the load prediction results of the virtual machines on each server.
In a fourth aspect, the present invention also provides a resource allocator, comprising:

a load prediction module for computing, with the method described above, the load prediction results for the virtual machines on each server of the cluster in a cloud multi-data-center environment; and

a resource scheduling module for generating the corresponding resource scheduling policy based on the load prediction results of the virtual machines on each server.
Compared with the prior art, the beneficial effect of the present invention is that it jointly considers the time-varying linear trend and the nonlinear characteristics of the load sequence in a multi-data-center environment and combines an LSTM-based neural network method with the statistical learning method of an autoregressive model, which can effectively improve the accuracy of future load prediction.
Description of the Drawings
Fig. 1 is a schematic structural diagram of the system of the present invention;

Fig. 2 is a flow chart of the method of the present invention;

Fig. 3 is a schematic diagram of the encoder-decoder module in the neural network model;

Fig. 4 is a schematic diagram of the multilayer perceptron output network in the neural network model;

Fig. 5 is a detailed structural block diagram of the system of the present invention;

Fig. 6 shows the actual load variation trend of a data center server over a period of time.
Detailed Description of the Embodiments
The present invention is further described below with reference to the accompanying drawings. The following embodiments are only intended to illustrate the technical solutions of the present invention more clearly and do not limit its scope of protection.
Embodiment 1
The present invention designs a load prediction method and a resource scheduling method for cloud multi-data centers. The system takes the log files of each virtual machine, obtained from the cloud data centers, as input and preprocesses the data to predict the linear and nonlinear changes of the load, ultimately enabling the system to generate a corresponding resource allocation and scheduling policy according to the load prediction results.
The load prediction method for cloud multi-data centers of the present invention, as shown in Fig. 2, comprises the following steps.
Step 1: First obtain the log files of the virtual machines on the servers of the cluster from the cloud data centers; these log files record the usage of the various resources occupied by each virtual machine at every time point. Extract the required feature quantities from the log files and convert them into a time-series data format that the system can recognize and process.
Data collection and processing cover all virtual machines, and a large number of virtual machine log files are collected in this process; since they are all processed and analyzed in the same way, this embodiment takes a single virtual machine as an example to process its log file and predict its load.
To simplify the complex information in the massive log files, the non-time-series information that has no effect on load prediction is discarded. The present invention first specifies, in the preprocessing stage, the types of feature quantities required by the model and standardizes the time-series data format, making it easier to learn richer structural information and contextual relationships from the various kinds of data that influence the load.
The specific procedure for preprocessing the log files is as follows:
1) First, clean the collected log files: extract the required feature-quantity data, discard the remaining records irrelevant to load prediction, and retain only the virtual machine resource usage at each time point. If a local data anomaly occurs, replace the abnormal value with the mean of the adjacent values before and after it. When two or more consecutive values are missing in a segment of the sequence, discard that segment: given the possibly complex nonlinear relationships among the data center's features, a simplistic imputation would harm training accuracy.
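The imputation rule of step 1) — neighbour-mean for an isolated gap, discard for two or more consecutive gaps — can be sketched as follows. Using NaN to mark missing records and discarding segments whose gap sits at an edge are assumptions made here:

```python
import numpy as np

def clean_series(values):
    """Impute an isolated missing value with the mean of its neighbours;
    return None (discard the segment) when two or more consecutive values
    are missing. NaN marks a missing or abnormal record."""
    v = np.asarray(values, dtype=float)
    missing = np.isnan(v)
    # two or more consecutive gaps: discard the whole segment
    if np.any(missing[:-1] & missing[1:]):
        return None
    out = v.copy()
    for i in np.flatnonzero(missing):
        if i == 0 or i == len(v) - 1:
            return None  # an edge gap has no two neighbours to average (assumption)
        out[i] = (v[i - 1] + v[i + 1]) / 2.0
    return out

print(clean_series([1.0, np.nan, 3.0]))          # [1. 2. 3.]
print(clean_series([1.0, np.nan, np.nan, 4.0]))  # None
```

This keeps the cheap repair for single dropouts while refusing to fabricate longer stretches of data, matching the rationale given above.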
2) The data finally collected should include, but is not limited to, the following feature quantities: sampling/recording time, the virtual machine's configured number of cores, CPU capacity, effective CPU usage, the virtual machine's configured memory capacity, the virtual machine's effective memory usage, disk read throughput, disk write throughput, network receive throughput, and network transmit throughput. The effective CPU usage serves as the target of the load prediction; the remaining feature quantities serve as the model's input data.
Regarding the units of the feature quantities: the recording time is measured from 1970-01-01, the computer's default epoch; the unit MHz (megahertz) is a frequency unit; and the unit KB/s denotes the number of kilobytes that can be processed per second.
3) Then segment the collected feature-quantity data with a sliding window to form time-series curves with a fixed time step, which subsequently serve as the model's input feature sequences.
Let $X = \{X_1, X_2, \ldots, X_n\}^P \in \mathbb{R}^{n \times P}$ denote the input feature sequences obtained after organizing the log file of one virtual machine, where $n$ is the feature dimension of the load prediction input, $P$ is the time step of the input feature sequences, $T$ is the predicted target time, and $\mathbb{R}^{n \times P}$ is the $n \times P$-dimensional real space. Each $X_k$ represents a feature sequence of time step $P$ serving as one of the input features. In addition, $x_t \in \mathbb{R}^n$ is introduced to denote the input feature vector containing the $n$ input features of the load prediction at a given time $t$.
In addition, L = {L_{T-P}, L_{T-P+1}, ..., L_{T-1}} is used to denote the historical load data vector, and L_T to denote the effective CPU usage of the cloud data center at time T, i.e. the current instant, which is also our prediction target.
In summary, after preprocessing the log files, the required input feature sequence X = {X_1, X_2, ..., X_n} with time step P and the historical load data vector L = {L_{T-P}, L_{T-P+1}, ..., L_{T-1}} are obtained.
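The sliding-window segmentation of steps 1-3 can be sketched as follows (a minimal illustration; the list-of-lists layout and the helper name are assumptions):

```python
def make_windows(features, load, P):
    """Slice per-timestep records into fixed-length windows.

    features: list of feature vectors (length n) per time step
    load:     list of load values (effective CPU usage) per time step
    P:        window length in time steps

    Returns a list of (X, L, target) samples, where X is the n x P
    window of input features, L the P previous load values, and
    target the load value to predict at the next instant.
    """
    n = len(features[0])
    samples = []
    for T in range(P, len(load)):
        window = features[T - P:T]                          # P rows of n features
        X = [[row[k] for row in window] for k in range(n)]  # transpose -> n x P
        L = load[T - P:T]                                   # historical load vector
        samples.append((X, L, load[T]))
    return samples
```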
Step 2: The input feature sequence and the historical load data vector obtained in Step 1 are fed into the pre-designed neural network model and the autoregressive model, respectively, and the load prediction result for the next time instant is output.
The neural network model consists of an encoder, a decoder, and a multilayer perceptron network. The encoder's input is the collected input feature sequence; the decoder's input is the adaptively extracted input feature sequence output by the upstream encoder; and the multilayer perceptron's input is the context (text) vector output by the decoder. The neural network module extracts the interdependencies among the feature sequences and analyzes the nonlinear trends among the feature quantities, finally outputting the nonlinear component of the load prediction result. An attention mechanism is embedded in the encoder and decoder to discover the weight of each feature's influence on the load, and to analyze the influence of the previous load sequence and of each feature quantity on the load.
The structure and processing flow of the neural network are shown in Figures 3 and 4. First, all variables and symbols appearing in Figures 3 and 4 are explained. {X_1, X_2, ..., X_n} with time step P denotes the collected input feature sequence; because the data are collected at successive time nodes, it may also be called a time series. The time series can be split into n feature vectors each built from a single feature quantity, from {x^1_{T-P}, x^1_{T-P+1}, ..., x^1_T} to {x^n_{T-P}, x^n_{T-P+1}, ..., x^n_T}, where n is the feature dimension of the load prediction input, P is the time step of the input feature sequence, and T is the prediction target time.
h_t denotes the hidden layer tensor produced at time t in the encoder of the neural network model. The softmax function, also called the normalized exponential function, is used so that all hidden layer weights produced by the encoder sum to 1. x^k_t denotes the value of the k-th input feature sequence at time t, and α^k_t denotes the weight obtained in the encoder for the value of the k-th input feature sequence at time t. x̃^k_t denotes the k-th adaptively extracted input at time t. LSTM denotes the long short-term memory neural network, an important component for updating model parameters in the encoder-decoder model. d_t denotes the hidden layer tensor produced at time t in the decoder. β^i_t denotes the weight obtained in the decoder for the hidden state of the i-th encoder step at time t. c_t is the context (text) vector obtained by the decoder as the weighted sum of all encoder hidden states, and Σ is a summation symbol denoting the addition of all the weighted values. L_t denotes the load at time t, i.e. the effective CPU usage, and L̂^non_T denotes the nonlinear component of the load prediction result obtained by the neural network model at time T.
In the encoder module, an input attention mechanism is introduced so that weights are assigned to the input feature sequences adaptively. As shown in Figure 3, the encoder module comprises an input attention layer, a softmax layer, and an LSTM neural network layer; the output of each layer is the input of the next. In the cyclic data update of the encoder module, the hidden state h_{t-1} and the cell state s_{t-1} output by the previous LSTM unit are taken as input parameters, using:
h_t = f_1(h_{t-1}, X_t)
to denote this update computation, where f_1 denotes the LSTM unit, h_t ∈ R^m with R^m the m-dimensional real vector space, and X_t is the input feature vector at time t.
In the LSTM unit, the following is computed at each time step:

g_t = W_r r_t + W_h h_{t-1} + b, split into four m-dimensional blocks g_t = [g^f_t; g^i_t; g^o_t; g^s_t]

f_t = sigmoid(g^f_t), i_t = sigmoid(g^i_t), o_t = sigmoid(g^o_t), s̃_t = tanh(g^s_t)

s_t = f_t ⊙ s_{t-1} + i_t ⊙ s̃_t

h_t = o_t ⊙ tanh(s_t)
where f_t, i_t, and o_t denote the forget gate, input gate, and output gate, respectively; W_r ∈ R^{4m×d_r} and W_h ∈ R^{4m×m} are the weight matrices applied to r_t and h_{t-1}, respectively, with R^{4m×d_r} and R^{4m×m} the real vector spaces of the corresponding dimensions, m the hidden layer dimension, and d_r the vector dimension of the input r_t; r_t is the input at time t; h_{t-1} is the hidden state vector output at time t-1; s̃_t is the candidate cell state vector output by the neural network model at the current instant; and sigmoid and tanh denote the respective activation functions.
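A minimal numeric sketch of the per-step LSTM computation above, using the stacked 4m-row weight matrices; the [forget; input; output; candidate] block ordering is an assumption, since the patent does not fix it:

```python
import numpy as np

def lstm_cell(r_t, h_prev, s_prev, W, U, b):
    """One LSTM step in the stacked form: W (4m x d_r) acts on the
    input r_t, U (4m x m) on the previous hidden state h_prev."""
    m = h_prev.shape[0]
    z = W @ r_t + U @ h_prev + b
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    f = sigmoid(z[:m])            # forget gate
    i = sigmoid(z[m:2 * m])       # input gate
    o = sigmoid(z[2 * m:3 * m])   # output gate
    s_tilde = np.tanh(z[3 * m:])  # candidate cell state
    s = f * s_prev + i * s_tilde  # new cell state
    h = o * np.tanh(s)            # new hidden state
    return h, s
```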
Next, the input attention mechanism yields the weight α^k_t corresponding to each input feature sequence:

e^k_t = v_e^T tanh(W_e [h_{t-1}; s_{t-1}] + U_e x^k)

α^k_t = exp(e^k_t) / Σ_{j=1}^{n} exp(e^j_t)
where e^k_t is an intermediate variable with no specific physical meaning, and v_e ∈ R^T, W_e ∈ R^{T×2m}, and U_e ∈ R^{T×T} are parameters to be learned in the attention mechanism model, with R^T, R^{T×2m}, and R^{T×T} the real vector spaces of the corresponding dimensions. tanh denotes the hyperbolic tangent activation function and exp the exponential function. The computation of α^k_t is carried out by the softmax layer. From the attention weights, the adaptively extracted input feature vector is obtained:

x̃_t = (α^1_t x^1_t, α^2_t x^2_t, ..., α^n_t x^n_t)^T
The hidden state of the LSTM unit can then be updated as:

h_t = f_1(h_{t-1}, x̃_t)
Here x̃_t denotes the adaptively extracted input feature vector at time t. This lets the encoder selectively focus on the more important, relevant input features instead of treating all input feature sequences equally, thereby discovering the interdependencies among the feature sequences.
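The input-attention computation above can be sketched as follows (the parameter names v_e, W_e, U_e are assumed labels for the learned matrices; subtracting the maximum score before exponentiating is a standard numerically stable softmax):

```python
import numpy as np

def input_attention(h_prev, s_prev, X, v_e, W_e, U_e):
    """Score each of the n driving series against the encoder state,
    softmax the scores, and return the attention weights.

    X: n x T matrix, one row per input feature series.
    """
    n = X.shape[0]
    hs = np.concatenate([h_prev, s_prev])   # [h_{t-1}; s_{t-1}]
    e = np.array([v_e @ np.tanh(W_e @ hs + U_e @ X[k]) for k in range(n)])
    alpha = np.exp(e - e.max())
    alpha /= alpha.sum()                    # softmax: weights sum to 1
    return alpha
```

The adaptively extracted input at time t is then `alpha * X[:, t]`, elementwise over the n features.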
In the decoder module, because a traditional encoder-decoder model loses representational power when the input sequence grows too long and its performance deteriorates rapidly, a temporal attention mechanism is used in the decoding layer to adaptively select the relevant hidden states.
Similar to the method in the encoder module, the attention mechanism in the decoder module also takes the hidden state d_{t-1} and the cell state s'_{t-1} output by the previous LSTM unit as input parameters, where z denotes the dimension of the hidden layer in the encoder and R^z the z-dimensional real vector space. The formula derivation of the importance weight β^k_t is the same as the computation of the attention α^k_t over the input feature sequences; β^k_t represents the importance of the k-th encoder hidden state h_k to the final prediction. The decoder then sums all the encoder hidden states according to their weights to obtain the context (text) vector:

c_t = Σ_{k=1}^{P} β^k_t h_k
Combining the historical load data {L_{T-P}, L_{T-P+1}, ..., L_{T-1}} with the obtained context vector, after vector concatenation and a linear transformation the adaptively extracted decoding layer input is obtained:

L̃_{t-1} = w̃^T [L_{t-1}; c_{t-1}] + b̃
where [L_{t-1}; c_{t-1}] ∈ R^{m+1} is the concatenation of the load at time t-1 with the computed context vector, m is the decoder hidden layer dimension mentioned above, and R^{m+1} is the (m+1)-dimensional real vector space. w̃ and b̃ are parameters to be learned in the linear transformation. The computed L̃_{t-1} is then used to update the decoder hidden state:

d_t = f_2(d_{t-1}, L̃_{t-1})
where f_2 is the nonlinear activation function of an LSTM unit, whose update computation is identical to that of f_1.
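A sketch of the decoder-side computations described above, i.e. the weighted context vector and the concatenated, linearly mapped decoder input (helper names are assumptions for illustration):

```python
import numpy as np

def context_vector(beta, H):
    """Weighted sum of encoder hidden states (rows of H, one per
    time step) into the context/text vector c_t."""
    return beta @ H

def decoder_input(L_prev, c_prev, w_tilde, b_tilde):
    """Concatenate the previous load value with the previous context
    vector ([L_{t-1}; c_{t-1}], dimension m+1) and map it linearly
    to a scalar decoder input."""
    y = np.concatenate([[L_prev], c_prev])
    return w_tilde @ y + b_tilde
```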
The output layer consists of a multilayer perceptron. The final hidden states output by the encoder-decoder model, namely {d_{T-P}, d_{T-P+1}, ..., d_{T-1}}, are taken as input, and the final model prediction is output through a three-layer perceptron network, where PReLU is used as the activation function in the first two layers:
f_3 = max(μ d_t, d_t)
where f_3 denotes the PReLU activation function and μ is a parameter updated only during training. The PReLU activation function avoids the problem of some parameters never being updated. The activation function of the last perceptron layer is the sigmoid function, ensuring the prediction result is confined to a reasonable range.
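The output head can be sketched as follows (layer widths and the initial value of μ are illustrative assumptions; in the patent μ is learned during training):

```python
import numpy as np

def prelu(x, mu=0.25):
    """PReLU as in f_3 = max(mu*d, d); for mu in (0,1) this keeps a
    small negative slope so gradients never vanish entirely."""
    return np.maximum(mu * x, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlp_head(d, W1, b1, W2, b2, w3, b3):
    """Three-layer perceptron head: PReLU on the first two layers,
    sigmoid on the output so the prediction stays in (0, 1)."""
    h1 = prelu(W1 @ d + b1)
    h2 = prelu(W2 @ h1 + b2)
    return sigmoid(w3 @ h2 + b3)
```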
Finally, after T time steps, the nonlinear part of the final load prediction result is obtained:

L̂^non_T = F_non({L_{T-P}, ..., L_{T-1}}, {X_{T-P}, ..., X_T}) = v_y^T (W_y [d_T; c_T] + b_w) + u_w
where L̂^non_T denotes the nonlinear part of the load prediction after T time steps; T is the time step size; the superscript non marks the nonlinear part; F_non denotes the nonlinear computation process of the whole load prediction above; {L_{T-P}, ..., L_{T-1}} denotes the loads of the P-1 time steps before time T, and {X_{T-P}, ..., X_T} the input feature sequences of the P time steps before time T. [d_T; c_T] ∈ R^{z+m} is the concatenation of the hidden state and the context vector, with R^{z+m} the (z+m)-dimensional real vector space. The parameters W_y and b_w map the concatenated vector to the size of the decoding layer hidden state, where W_y is the nonlinear mapping weight applied to the concatenated vector in the decoder computation and b_w the bias value; this computation is carried out by a computer program. Likewise, v_y and u_w denote the nonlinear mapping weight and the bias value in the output layer, respectively. F_non denotes the neural network prediction function of the above process, and L̂^non_T the computed result, i.e. the nonlinear component of the final prediction.
In predicting the linear variation of the load, based on the collected historical load data, the load data of multiple past time steps serve as the input of the autoregressive model to predict the load value at the next time step. This uncovers the long-term linear trend of load variation and avoids the scale-insensitivity problem of the neural network model's inputs and outputs.
The specific calculation formula of the autoregressive model is:
L̂^linear_T = F_linear({L_{T-P}, ..., L_{T-1}}) = Σ_{t=T-P}^{T-1} λ_t L_t + ε_T

where {L_{T-P}, L_{T-P+1}, ..., L_{T-1}} are the historical data, ε_T is a random disturbance variable, and λ_t is the weight corresponding to each instant; both variables are initialized and automatically updated in the autoregressive model design. F_linear denotes the autoregressive prediction function of the above process, and L̂^linear_T the computed result, i.e. the linear component of the final prediction.
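The autoregressive component reduces to a weighted sum over the last P load values, as sketched below (in practice the weights λ_t are fitted during training; here they are supplied directly):

```python
def ar_predict(history, weights, eps=0.0):
    """Autoregressive linear component: weighted sum of the last P
    load values plus a disturbance term eps, matching the formula
    above."""
    assert len(history) == len(weights)
    return sum(w * l for w, l in zip(weights, history)) + eps
```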
Step 3: The obtained load prediction result is returned to the data center resource allocator, which sends the generated resource allocation policy to the cloud data center to carry out resource allocation for the services.
As shown in Figures 1 and 5, the load prediction result is applied in the resource allocator, specifically as follows.
In the above analysis and modeling process, as the real load trend shown in Figure 6 illustrates, the load variation process is usually a coexistence of an overall linear trend and a nonlinear trend. In load prediction analysis, jointly analyzing the nonlinear and linear components of the load therefore helps improve the accuracy of the prediction. The proposed method thus combines the nonlinear and linear prediction components of the load, i.e. the outputs L̂^non_T and L̂^linear_T of the neural network model and the autoregressive model, as the final prediction result.
The final prediction result can be expressed as:

L̂_T = L̂^non_T + L̂^linear_T
where L̂^linear_T denotes the prediction component of the autoregressive model at time T and L̂^non_T the prediction component of the neural network model at time T. [d_T; c_T] ∈ R^{p+m} is the concatenation of the hidden state and the context vector; the parameters W_y and b_w map the concatenated vector to the size of the decoding layer hidden state; v_y and b_v denote the nonlinear mapping weight and the bias value in the output layer, respectively; {L_{T-P}, L_{T-P+1}, ..., L_{T-1}} denotes the historical load data, and {X_{T-P}, X_{T-P+1}, ..., X_T} the input sequence vectors.
Next, according to the prediction result, the resource allocator generates a corresponding resource scheduling policy and elastically adjusts the resources occupied by each server's virtual machines.
The system feeds the virtual machine load prediction result L̂_T for each server at time T back to the resource allocator; the resource allocator generates a corresponding resource allocation policy from the load prediction result, and the cloud data center dynamically adjusts the resource allocation of each virtual machine according to that policy.
Step 4: The load prediction results of multiple past instants, together with the current actual load data collected by the cloud data center, are fed back to the prediction model, so that the model continually obtains new experimental data for training, further improving prediction performance and reducing error.
Through the system, the load prediction results and the actual load errors are continuously fed back to the model, reducing the prediction deviation of the load model caused by scarcity of data.
The proposed resource allocator produces the corresponding resource allocation results after using the load prediction method to predict the future load of the server cluster in the cloud multi-data-center environment. The specific implementation of the resource allocator can be realized in computer code; its scheduling method is chosen as follows. According to the load prediction result, more computing resources, such as CPU cores and memory, are allocated to servers expected to bear more computing tasks in the coming period, so that those servers can continue to support their services. When the resources allocated to a single server reach the upper limit, the resource allocator should consider redistributing incoming computing tasks, to prevent a single server node from being blocked by an excessive task volume, or even crashing the system. If the load is balanced across servers, a counter can be set to distribute computing tasks evenly. In the multi-data-center environment, when the remaining computing resources are limited, computing tasks should be returned to the cloud for reallocation, avoiding the computation delay caused by task accumulation.
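The scheduling rules described here might be sketched as follows (the headroom factor, the shedding order, and the return format are all assumptions for illustration; the patent leaves these implementation details to computer code):

```python
def allocate(predicted_loads, capacity, headroom=1.2):
    """Toy allocation rule: give each VM its predicted load times a
    headroom factor; if the total exceeds the server's capacity,
    flag the most demanding VMs for redistribution (back to the
    cloud in the multi-data-center case).

    predicted_loads: dict vm_name -> predicted load
    Returns (kept_allocations, overflow_vm_names).
    """
    demand = {vm: load * headroom for vm, load in predicted_loads.items()}
    total = sum(demand.values())
    if total <= capacity:
        return demand, []
    # shed the most demanding VMs until the remainder fits
    overflow = []
    for vm in sorted(demand, key=demand.get, reverse=True):
        if total <= capacity:
            break
        total -= demand[vm]
        overflow.append(vm)
    kept = {vm: d for vm, d in demand.items() if vm not in overflow}
    return kept, overflow
```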
In summary, the present invention predicts the future load variation of each virtual machine on the servers from the collected log files, and uses the resource allocator to dynamically adjust and configure data center resources. Taking into account both the long-term trend of the load sequence and the interdependence between the load feature and the other feature sequences in the data center environment, a fusion model based on a neural network and an autoregressive method is designed for multi-data-center load prediction. This improves the robustness of the prediction results, avoids the scale-insensitivity problem of the neural network model, and preserves the flexibility of the model, allowing it to adapt to data center load prediction with different variation trends.
Embodiment 2

Based on the same inventive concept as Embodiment 1, an embodiment of the present invention provides a load prediction apparatus for cloud multi-data centers, comprising:

a data processing module, configured to obtain the log files recording the virtual machine resource usage at each time point, extract from them the required feature data and historical load data, and convert the feature data and historical load data into the corresponding input feature sequence and historical load vector;

a nonlinear component prediction module, configured to compute the nonlinear component of the load prediction using a pre-built neural network model, based on the obtained input feature sequence and historical load vector;

a linear component prediction module, configured to compute the linear component of the load prediction using a pre-built autoregressive model, based on the obtained historical load vector; and

a prediction result calculation module, which integrates the nonlinear and linear components of the load prediction to obtain the final load prediction result.

For the specific implementation of each module of the apparatus of the present invention, refer to the implementation of each step of the method in Embodiment 1.
Embodiment 3

Based on the same inventive concept as Embodiment 1, a resource allocator according to an embodiment of the present invention comprises:

a load prediction module, configured to compute, based on the above method, the load prediction results of the virtual machines on each server of the cluster in the cloud multi-data-center environment; and

a resource scheduling module, configured to generate the corresponding resource scheduling policies based on the load prediction results of the virtual machines on each server.

For the specific implementation of each module of the device of the present invention, refer to the implementation of each step of the method in Embodiment 1.
Those skilled in the art will appreciate that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

The above are merely preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and variations without departing from the technical principles of the present invention, and these improvements and variations should also be regarded as falling within the protection scope of the present invention.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110473131.1A CN113220450B (en) | 2021-04-29 | 2021-04-29 | Load prediction method, resource scheduling method and device for cloud-side multi-data center |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110473131.1A CN113220450B (en) | 2021-04-29 | 2021-04-29 | Load prediction method, resource scheduling method and device for cloud-side multi-data center |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113220450A CN113220450A (en) | 2021-08-06 |
CN113220450B true CN113220450B (en) | 2022-10-21 |
Family
ID=77089970
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110473131.1A Active CN113220450B (en) | 2021-04-29 | 2021-04-29 | Load prediction method, resource scheduling method and device for cloud-side multi-data center |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113220450B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114064203B (en) * | 2021-10-28 | 2024-07-23 | 西安理工大学 | Cloud virtual machine load prediction method based on multi-scale analysis and depth network model |
CN114124554B (en) * | 2021-11-29 | 2022-08-30 | 燕山大学 | Virtual network service chain throughput prediction method |
CN114338694B (en) * | 2022-03-04 | 2022-05-31 | 广州鹏捷科技股份有限公司 | One-stop cloud data center server scheduling method and system |
CN115460061B (en) * | 2022-08-03 | 2024-04-30 | 中国科学院信息工程研究所 | Health evaluation method and device based on intelligent operation and maintenance scene |
CN115509752A (en) * | 2022-09-29 | 2022-12-23 | 福州大学 | Edge Prediction Method Based on Deep Autoregressive Recurrent Neural Network |
CN118394592B (en) * | 2024-04-16 | 2025-02-11 | 广州视声智能股份有限公司 | A Paas platform based on cloud computing |
CN118227742A (en) * | 2024-05-24 | 2024-06-21 | 浙江口碑网络技术有限公司 | Data trend analysis method, device, equipment, storage medium and program product |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106502799A (en) * | 2016-12-30 | 2017-03-15 | 南京大学 | A kind of host load prediction method based on long memory network in short-term |
CN108170529A (en) * | 2017-12-26 | 2018-06-15 | 北京工业大学 | A kind of cloud data center load predicting method based on shot and long term memory network |
CN108196957A (en) * | 2017-12-28 | 2018-06-22 | 福州大学 | A kind of host load prediction method under cloud environment |
CN111638958A (en) * | 2020-06-02 | 2020-09-08 | 中国联合网络通信集团有限公司 | Cloud host load processing method and device, control equipment and storage medium |
- 2021-04-29: application CN202110473131.1A filed; patent CN113220450B (status: Active)
Non-Patent Citations (1)
Title |
---|
Research on an Online Host Load Prediction Model Based on Deep Learning; Qian Shengpan et al.; Computer Engineering; 2020-09-09; pp. 84-89 *
Also Published As
Publication number | Publication date |
---|---|
CN113220450A (en) | 2021-08-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113220450B (en) | Load prediction method, resource scheduling method and device for cloud-side multi-data center | |
CN104951425B (en) | A kind of cloud service performance self-adapting type of action system of selection based on deep learning | |
CN109714395A (en) | Cloud platform resource uses prediction technique and terminal device | |
CN117389824A (en) | Cloud server load prediction method based on signal decomposition and mixing model | |
Cheng et al. | GRU-ES: Resource usage prediction of cloud workloads using a novel hybrid method | |
CN113051130A (en) | Mobile cloud load prediction method and system of LSTM network combined with attention mechanism | |
CN116663842A (en) | Digital management system and method based on artificial intelligence | |
CN116883065A (en) | Merchant risk prediction method and device | |
CN112506663A (en) | Cloud server CPU load prediction method, system and medium based on denoising and error correction | |
CN115766125A (en) | Network flow prediction method based on LSTM and generation countermeasure network | |
Qiu et al. | FLASH: Fast model adaptation in ML-centric cloud platforms | |
CN117236571B (en) | Planning method and system based on Internet of things | |
Li et al. | Learning scheduling policies for co-located workloads in cloud datacenters | |
CN117973610A (en) | Logistics aging prediction method and system based on interpretable model | |
CN116266128A (en) | Method and system for scheduling ecological platform resources | |
CN113298120B (en) | Fusion model-based user risk prediction method, system and computer equipment | |
CN114969148A (en) | System access amount prediction method, medium and equipment based on deep learning | |
CN112667394B (en) | Computer resource utilization rate optimization method | |
Thakkar et al. | Mvms: Rnn based pro-active resource scaling in cloud environment | |
Huang et al. | Accurate prediction of required virtual resources via deep reinforcement learning | |
CN113762972A (en) | Data storage control method and device, electronic equipment and storage medium | |
CN118656685B (en) | A derivative feature extraction method, device, computer equipment and storage medium | |
CN119151597B (en) | Agricultural product early warning management system based on big data statistics | |
Xiao et al. | Resource prediction based on program granularity combined with data purification | |
Rawat | Workload prediction for cloud services by using a hybrid neural network model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |