CN117786566A - Load prediction model training method, server load prediction method and device - Google Patents


Info

Publication number
CN117786566A
CN117786566A
Authority
CN
China
Prior art keywords
network
neural network
convolution calculation
load prediction
data information
Prior art date
Legal status
Pending
Application number
CN202311638529.1A
Other languages
Chinese (zh)
Inventor
韩天宇
段其甲
曾荣飞
张成奇
Current Assignee
Northeastern University China
Original Assignee
Northeastern University China
Priority date
Filing date
Publication date
Application filed by Northeastern University China
Priority to CN202311638529.1A
Publication of CN117786566A
Legal status: Pending


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04: INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00: Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

This application discloses a training method for a load prediction model, a server load prediction method, and corresponding apparatus. The model comprises an encoder and a decoder. The encoder comprises a first graph convolutional neural network, a first feedforward neural network, a first multi-head attention network, and a second feedforward neural network; the decoder comprises a second graph convolutional neural network, a third feedforward neural network, a second multi-head attention network, and a fourth feedforward neural network. The method includes: feeding each graph structure in each training sample through the encoder networks in turn to obtain the convolved first road network spatio-temporal features of each training sample; feeding each graph structure in each ground-truth sample through the second graph convolutional neural network and the third feedforward neural network in turn to obtain the convolved second road network spatial features; and feeding the first road network spatio-temporal features and the second road network spatial features through the second multi-head attention network and the fourth feedforward neural network in turn, after which the final model is obtained through iterative training and parameter adjustment.

Description

Load prediction model training method, server load prediction method and device

Technical Field

The present application relates to the field of cloud computing, and in particular to a training method for a load prediction model, a server load prediction method, and corresponding apparatus.

Background Art

Cloud computing platforms are one of the major development directions in today's information technology field. By providing elastic resources and flexible service models, they meet the demand of enterprises and individuals for efficient, scalable, and reliable computing capacity. Within a cloud computing platform, server load prediction is a key task: it helps cloud service providers plan and manage resources effectively and guarantee quality of service for users, while also reducing costs and increasing profits.

In related art, many load prediction models based on statistical methods have been proposed and applied, such as time-series models (e.g., the ARIMA (Autoregressive Integrated Moving Average) and SARIMA (Seasonal Autoregressive Integrated Moving Average) models), regression models (e.g., linear regression, support vector machines), and some deep learning models (e.g., recurrent neural networks, long short-term memory networks, and convolutional neural networks). However, these traditional methods usually struggle to handle complex nonlinear relationships and spatio-temporal dependencies, and therefore have limited prediction accuracy.

Summary of the Invention

This application provides a training method for a load prediction model, a server load prediction method, and corresponding apparatus, which can improve the accuracy of load prediction.

The specific technical solutions are as follows:

In a first aspect, embodiments of this application provide a training method for a load prediction model. The load prediction model comprises an encoder and a decoder. The encoder comprises a first graph convolutional neural network, a first feedforward neural network, a first multi-head attention network, and a second feedforward neural network; the decoder comprises a second graph convolutional neural network, a third feedforward neural network, a second multi-head attention network, and a fourth feedforward neural network. The method includes:

obtaining a training set, where the training set includes multiple training samples and a ground-truth sample corresponding to each training sample; each training sample includes graph structures for multiple consecutive historical time steps, and the corresponding ground-truth sample includes graph structures for multiple future time steps adjacent to those historical time steps; the graph structure at each time step includes, for each of multiple servers at that time step, the server's target data information, which includes at least load information;
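As an illustration of how such training and ground-truth pairs might be assembled, the sliding-window slicing below is a minimal sketch; the array layout and window lengths are assumptions, not the patent's implementation:

```python
import numpy as np

def make_samples(series, hist, fut):
    """Slice a (T, N, F) per-time-step server-feature tensor into
    (history window, future window) pairs: each training sample covers
    `hist` consecutive historical steps, and its ground-truth sample
    covers the `fut` steps that immediately follow."""
    samples, truths = [], []
    for t in range(series.shape[0] - hist - fut + 1):
        samples.append(series[t:t + hist])
        truths.append(series[t + hist:t + hist + fut])
    return np.stack(samples), np.stack(truths)

# toy data: 10 time steps, 4 servers, 2 features (e.g. load + traffic)
series = np.arange(10 * 4 * 2, dtype=float).reshape(10, 4, 2)
X, Y = make_samples(series, hist=6, fut=2)
```

Each `X[i]` would then be turned into per-time-step graph structures over the server topology before entering the encoder.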

inputting each graph structure in each training sample into the first graph convolutional neural network to extract first road network spatial features of each graph structure in each training sample, and inputting each graph structure in each ground-truth sample into the second graph convolutional neural network to extract second road network spatial features of each graph structure in each ground-truth sample;
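A single propagation step of the kind such a graph convolutional network performs can be sketched as follows; this uses the common symmetrically normalized variant, since the patent does not specify which GCN formulation is used:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W), where A is
    the server adjacency matrix, H the per-server feature matrix, and W a
    learned weight matrix. Self-loops let each server keep its own features."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(0.0, A_norm @ H @ W)

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # 3 servers in a chain
H = np.ones((3, 2))          # 2 input features per server
W = np.ones((2, 4)) * 0.5    # project to 4 hidden features
out = gcn_layer(A, H, W)
```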

passing each first road network spatial feature through the convolution computation of the first feedforward neural network to obtain convolved first road network spatial features, and passing each second road network spatial feature through the convolution computation of the third feedforward neural network to obtain convolved second road network spatial features;

adding corresponding time information to the convolved first road network spatial features at each time step, and inputting the time-augmented first road network spatial features of each training sample into the first multi-head attention network to obtain first road network spatio-temporal features of each training sample;
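The "time information" added to each time step's features is commonly realized as a sinusoidal positional encoding, as in the Transformer; the sketch below is a hedged illustration, since the patent does not state which encoding it uses:

```python
import numpy as np

def time_encoding(T, d_model):
    """Sinusoidal time encoding: even feature channels get sin, odd get cos,
    at geometrically spaced frequencies, so each time step receives a
    distinct, smoothly varying code that attention can exploit."""
    pos = np.arange(T)[:, None]       # (T, 1) time-step indices
    i = np.arange(d_model)[None, :]   # (1, d) channel indices
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

enc = time_encoding(T=6, d_model=8)
# features of shape (T, N, d) would be augmented as: features + enc[:, None, :]
```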

passing the first road network spatio-temporal features of each training sample through the convolution computation of the second feedforward neural network to obtain convolved first road network spatio-temporal features of each training sample;

inputting the convolved first road network spatio-temporal features, together with the multiple time-augmented second road network spatial features of the corresponding ground-truth sample, into the second multi-head attention network to obtain third road network spatio-temporal features for the multiple future time steps corresponding to each training sample, as well as second road network spatio-temporal features of each ground-truth sample;
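The second multi-head attention step is a cross-attention in which decoder-side features query the encoder output; a single-head version can be sketched as follows (a simplification of the multi-head case, for illustration only):

```python
import numpy as np

def cross_attention(Q, K, V):
    """Scaled dot-product attention: decoder queries Q attend over encoder
    keys K and mix the encoder values V according to the attention weights."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
enc_out = rng.normal(size=(6, 8))  # encoder spatio-temporal features, 6 historical steps
dec_in = rng.normal(size=(2, 8))   # decoder-side features, 2 future steps
out, w = cross_attention(dec_in, enc_out, enc_out)
```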

inputting the third road network spatio-temporal features for the multiple future time steps corresponding to each training sample and the second road network spatio-temporal features of each ground-truth sample into the fourth feedforward neural network to obtain convolved third road network spatio-temporal features and convolved second road network spatio-temporal features;

computing a current loss value from the convolved third road network spatio-temporal features and the convolved second road network spatio-temporal features; when the current loss value does not satisfy a convergence condition, adjusting the model parameters of the load prediction model and continuing to train the adjusted model, until the current loss value satisfies the convergence condition and the final load prediction model is obtained.
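The loss-and-adjust loop in this final step follows the standard pattern of gradient-based training until a convergence criterion is met. The stand-in below uses a toy linear model with an MSE loss purely to illustrate that pattern; the patent's actual model, loss, and optimizer are not specified beyond "loss value" and "convergence condition":

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))             # stand-in for model inputs
w_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ w_true                           # stand-in for ground-truth features

w = np.zeros(4)                          # model parameters to be adjusted
loss = np.inf
for step in range(2000):
    pred = X @ w
    loss = float(((pred - y) ** 2).mean())  # current loss value (MSE)
    if loss < 1e-8:                          # convergence condition
        break
    grad = 2.0 * X.T @ (pred - y) / len(X)   # adjust model parameters
    w -= 0.1 * grad
```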

In a possible implementation, obtaining the target data information of each server includes:

performing outlier detection on the raw data information of each server;

when an outlier is detected in the raw data information, repairing the outlier;

performing a normalization or standardization operation on the outlier-repaired raw data information to obtain the target data information.
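This final preprocessing step maps the repaired raw values to a common scale; the two usual choices, min-max normalization and z-score standardization, can be sketched as follows (generic preprocessing, not taken verbatim from the patent):

```python
import numpy as np

def min_max_normalize(x):
    """Rescale each feature column to the range [0, 1]."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / np.where(hi > lo, hi - lo, 1.0)

def z_score_standardize(x):
    """Shift each feature column to zero mean and unit variance."""
    mu, sigma = x.mean(axis=0), x.std(axis=0)
    return (x - mu) / np.where(sigma > 0, sigma, 1.0)

raw = np.array([[10.0, 200.0], [20.0, 400.0], [30.0, 600.0]])  # e.g. CPU load, traffic
norm = min_max_normalize(raw)
std = z_score_standardize(raw)
```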

In a possible implementation, performing outlier detection on the raw data information of each server includes:

reducing the dimensionality of the raw data information with the principal component analysis (PCA) algorithm, and performing outlier detection on the dimensionality-reduced raw data information with the local outlier factor (LOF) algorithm.
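The PCA-then-LOF combination can be sketched end to end in NumPy; this is a self-contained re-implementation for illustration, whereas in practice one would likely use scikit-learn's `PCA` and `LocalOutlierFactor`:

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project centered data onto its top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def lof_scores(X, k=3):
    """Local outlier factor: ratio of each point's neighbours' local
    reachability density to its own; scores well above 1 mark outliers."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                      # exclude self-distance
    nn = np.argsort(d, axis=1)[:, :k]                # k nearest neighbours
    k_dist = np.sort(d, axis=1)[:, k - 1]            # k-distance of each point
    reach = np.maximum(k_dist[nn], np.take_along_axis(d, nn, axis=1))
    lrd = 1.0 / reach.mean(axis=1)                   # local reachability density
    return lrd[nn].mean(axis=1) / lrd

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6))     # 30 server readings, 6 raw metrics
X[7] += 12.0                     # plant one obvious outlier
scores = lof_scores(pca_reduce(X, n_components=2), k=5)
```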

In a possible implementation, repairing the outlier includes:

repairing the outlier with a generative adversarial network to obtain repaired first data, and repairing the outlier with a support vector regression algorithm to obtain repaired second data;

performing a weighted computation on the first data and the second data to obtain the final repaired value for the outlier.
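The weighted combination of the two repair estimates can be sketched as follows. Here the GAN and SVR outputs are stand-in values, and weighting by inverse validation error is an assumed scheme; the patent does not specify how the weights are chosen:

```python
import numpy as np

def fuse_repairs(gan_estimate, svr_estimate, gan_val_error, svr_val_error):
    """Weighted fusion of two repair estimates, with each weight inversely
    proportional to that method's validation error (an assumed scheme)."""
    w_gan = 1.0 / gan_val_error
    w_svr = 1.0 / svr_val_error
    total = w_gan + w_svr
    return (w_gan * np.asarray(gan_estimate)
            + w_svr * np.asarray(svr_estimate)) / total

# hypothetical repaired values for one corrupted, normalized load reading
fused = fuse_repairs(gan_estimate=0.72, svr_estimate=0.68,
                     gan_val_error=0.05, svr_val_error=0.10)
```

With these numbers the fused value sits between the two estimates, pulled toward the lower-error (GAN) estimate.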

In a possible implementation, the target data information further includes traffic and/or network performance.

In a second aspect, embodiments of this application provide a server load prediction method, including:

obtaining target data information of each server in a cloud computing platform at multiple recent historical time steps, the target data information including at least load information;

for each historical time step, generating the graph structure for that time step based on the target data information of each server in the cloud computing platform;

inputting the graph structures of the multiple recent historical time steps into a load prediction model to predict the load information of each server at multiple future time steps, where the load prediction model is trained by the method of any implementation of the first aspect.

In a third aspect, embodiments of this application provide a training apparatus for a load prediction model. The load prediction model comprises an encoder and a decoder; the encoder comprises a first graph convolutional neural network, a first feedforward neural network, a first multi-head attention network, and a second feedforward neural network, and the decoder comprises a second graph convolutional neural network, a third feedforward neural network, a second multi-head attention network, and a fourth feedforward neural network. The apparatus includes:

an acquisition unit, configured to obtain a training set, where the training set includes multiple training samples and a ground-truth sample corresponding to each training sample; each training sample includes graph structures for multiple consecutive historical time steps, and the corresponding ground-truth sample includes graph structures for multiple future time steps adjacent to those historical time steps; the graph structure at each time step includes, for each of multiple servers at that time step, the server's target data information, which includes at least load information;

a first feature extraction unit, configured to input each graph structure in each training sample into the first graph convolutional neural network and extract first road network spatial features of each graph structure in each training sample;

a second feature extraction unit, configured to input each graph structure in each ground-truth sample into the second graph convolutional neural network and extract second road network spatial features of each graph structure in each ground-truth sample;

a first convolution computation unit, configured to pass each first road network spatial feature through the convolution computation of the first feedforward neural network to obtain convolved first road network spatial features;

a second convolution computation unit, configured to pass each second road network spatial feature through the convolution computation of the third feedforward neural network to obtain convolved second road network spatial features;

an augmentation unit, configured to add corresponding time information to the convolved first road network spatial features at each time step;

a first attention processing unit, configured to input the time-augmented first road network spatial features of each training sample into the first multi-head attention network to obtain first road network spatio-temporal features of each training sample;

a third convolution computation unit, configured to pass the first road network spatio-temporal features of each training sample through the convolution computation of the second feedforward neural network to obtain convolved first road network spatio-temporal features of each training sample;

a second attention processing unit, configured to input the convolved first road network spatio-temporal features, together with the multiple time-augmented second road network spatial features of the corresponding ground-truth sample, into the second multi-head attention network to obtain third road network spatio-temporal features for the multiple future time steps corresponding to each training sample, as well as second road network spatio-temporal features of each ground-truth sample;

a fourth convolution computation unit, configured to input the third road network spatio-temporal features for the multiple future time steps corresponding to each training sample and the second road network spatio-temporal features of each ground-truth sample into the fourth feedforward neural network to obtain convolved third road network spatio-temporal features and convolved second road network spatio-temporal features;

an adjustment and training unit, configured to compute a current loss value from the convolved third road network spatio-temporal features and the convolved second road network spatio-temporal features; when the current loss value does not satisfy a convergence condition, to adjust the model parameters of the load prediction model and continue training the adjusted model, until the current loss value satisfies the convergence condition and the final load prediction model is obtained.

In a possible implementation, the acquisition unit includes:

an outlier detection module, configured to perform outlier detection on the raw data information of each server;

a repair module, configured to repair an outlier when the outlier is detected in the raw data information;

a processing module, configured to perform a normalization or standardization operation on the outlier-repaired raw data information to obtain the target data information.

In a possible implementation, the outlier detection module is configured to reduce the dimensionality of the raw data information with the principal component analysis (PCA) algorithm and perform outlier detection on the dimensionality-reduced raw data information with the local outlier factor (LOF) algorithm.

In a possible implementation, the repair module is configured to repair the outlier with a generative adversarial network to obtain repaired first data, repair the outlier with a support vector regression algorithm to obtain repaired second data, and perform a weighted computation on the first data and the second data to obtain the final repaired value for the outlier.

In a possible implementation, the target data information further includes traffic and/or network performance.

In a fourth aspect, embodiments of this application provide a server load prediction apparatus, including:

an acquisition unit, configured to obtain target data information of each server in a cloud computing platform at multiple recent historical time steps, the target data information including at least load information;

a generation unit, configured to generate, for each historical time step, the graph structure for that time step based on the target data information of each server in the cloud computing platform;

a prediction unit, configured to input the graph structures of the multiple recent historical time steps into a load prediction model to predict the load information of each server at multiple future time steps, where the load prediction model is trained by the method of any implementation of the first aspect.

In a fifth aspect, embodiments of this application provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any possible implementation of the first aspect and the method of any possible implementation of the second aspect.

In a sixth aspect, embodiments of this application provide an electronic device, including:

one or more processors;

the processor being coupled to a storage apparatus, the storage apparatus being configured to store one or more programs;

when the one or more programs are executed by the one or more processors, the electronic device implements the method of any possible implementation of the first aspect and the method of any possible implementation of the second aspect.

In a seventh aspect, embodiments of this application provide a computer program product containing instructions which, when run on a computer or processor, cause the computer or processor to execute the method of any possible implementation of the first aspect and the method of any possible implementation of the second aspect.

With the load prediction model training method, server load prediction method, and apparatus provided by embodiments of this application, the load prediction model comprises an encoder and a decoder, each containing a graph convolutional neural network, feedforward neural networks, and a multi-head attention network. Each graph structure in each training sample passes in turn through the first graph convolutional neural network, the first feedforward neural network, the first multi-head attention network, and the second feedforward neural network of the encoder, finally yielding the convolved first road network spatio-temporal features of each training sample. Each graph structure in each ground-truth sample is fed in turn into the second graph convolutional neural network and the third feedforward neural network to obtain the convolved second road network spatial features. The third road network spatio-temporal features for the multiple future time steps corresponding to each training sample and the second road network spatio-temporal features of each ground-truth sample are input into the fourth feedforward neural network to obtain the convolved third and second road network spatio-temporal features, from which the current loss value is computed. When the current loss value does not satisfy the convergence condition, the model parameters of the load prediction model are adjusted and training continues, until the loss satisfies the convergence condition and the final load prediction model is obtained for use in load prediction. Thus, embodiments of this application analyze the load information of each server in the cloud computing platform at both the spatial and temporal levels and can capture long-term dependencies, so that the trained load prediction model predicts server load information more accurately.

The innovations of the embodiments of this application include at least the following:

1. The load prediction model provided by embodiments of this application comprises an encoder and a decoder, each containing a graph convolutional network (GCN), feedforward neural networks, and a multi-head attention network. Embodiments of this application first convert the target data information into a graph structure that reflects the server topology, then extract road network spatial features along the spatial dimension with the graph convolutional neural network, and then analyze the dependencies among the road network spatial features along the temporal dimension with the multi-head attention network to extract road network spatio-temporal features. After each feature extraction step, the extracted features are refined by a feedforward neural network, which speeds up convergence of the load prediction model and improves its quality. The trained load prediction model can therefore analyze the load information of each server in the cloud computing platform at both the spatial and temporal levels and capture long-term dependencies, yielding more accurate predictions of server load information.

2. By performing outlier detection, repair, and normalization (or standardization) on the raw data information, embodiments of this application make the target data information more accurate, allowing the model to adapt better to different data distributions and changes, and thereby improving the model's performance and robustness.

3. By combining the PCA (Principal Component Analysis) algorithm with the LOF (Local Outlier Factor) algorithm, embodiments of this application can detect outliers more accurately and with greater robustness.

4. Embodiments of this application repair outliers by weighting a GAN (Generative Adversarial Network)-based repair method against a support vector regression repair method, which improves the accuracy of outlier repair.

5. During model training and prediction, the model input includes traffic and/or network performance in addition to load information. Predicting future load in combination with historical traffic and/or network performance makes the prediction more objective and closer to the future ground truth, further improving the accuracy of load prediction.

Brief Description of the Drawings

To explain the embodiments of this application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application; those of ordinary skill in the art can derive other drawings from them without creative effort.

Figure 1 is a schematic flowchart of a load prediction model training method provided by an embodiment of the present application;

Figure 2 is a schematic structural diagram of a load prediction model provided by an embodiment of the present application;

Figure 3 is a schematic flowchart of a server load prediction method provided by an embodiment of the present application;

Figure 4 is a block diagram of a load prediction model training apparatus provided by an embodiment of the present application;

Figure 5 is a block diagram of a server load prediction apparatus provided by an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of this application.

It should be noted that, provided there is no conflict, the embodiments of this application and the features in the embodiments may be combined with one another. The terms "including" and "having", and any variations thereof, in the embodiments and drawings of this application are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to such a process, method, product, or device.

Figure 1 is a schematic flowchart of a load prediction model training method. The method can be applied to an electronic device or a computer device and includes the following steps S110-S180. As shown in Figure 2, the load prediction model includes an encoder 210 and a decoder 220. The encoder includes a first graph convolutional neural network 211, a first feedforward neural network 212, a first multi-head attention mechanism network 213, and a second feedforward neural network 214; the decoder includes a second graph convolutional neural network 221, a third feedforward neural network 222, a second multi-head attention mechanism network 223, and a fourth feedforward neural network 224.

The model training method is described in detail below:

S110: Obtain a training set.

The training set includes multiple training samples and the ground-truth sample corresponding to each training sample. Each training sample includes the graph structures of multiple consecutive historical moments, and the corresponding ground-truth sample includes the graph structures of the multiple future moments adjacent to those consecutive historical moments. The graph structure at each moment includes the target data information of each of multiple servers at that moment, and the target data information includes at least load information. The graph structure at each moment may be the topology of the multiple servers, where each node in the graph represents a server and carries that server's target data information.

The total duration of the multiple consecutive historical moments and that of the adjacent multiple future moments may be the same or different; when they differ, the historical duration is usually longer than the future duration. For example, if the consecutive historical moments are moments 1-5, the future moments may be moments 6-10 or moments 6-8.
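
The sliding-window construction of training samples and ground-truth samples described above can be sketched as follows. The window lengths, server count, and feature values are illustrative assumptions, not taken from the application:

```python
import numpy as np

# Toy series of graph snapshots: 20 time steps, 4 servers, 3 features each
# (e.g. load, traffic, network performance) -- the values are synthetic.
num_steps, num_servers, num_features = 20, 4, 3
series = np.arange(num_steps * num_servers * num_features, dtype=float).reshape(
    num_steps, num_servers, num_features)

hist_len, fut_len = 5, 3  # history window longer than the future window

samples, truths = [], []
for t in range(num_steps - hist_len - fut_len + 1):
    samples.append(series[t:t + hist_len])                      # consecutive historical moments
    truths.append(series[t + hist_len:t + hist_len + fut_len])  # adjacent future moments

samples = np.stack(samples)  # (num_samples, hist_len, servers, features)
truths = np.stack(truths)    # (num_samples, fut_len, servers, features)
```

Each sample thus pairs a block of consecutive historical graph snapshots with the immediately following future snapshots as its ground truth.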

Besides load information, the target data information may also include traffic and/or network performance. Traffic includes incoming traffic, outgoing traffic, total traffic, and the like; load information may include CPU (Central Processing Unit) usage, memory usage, and the like; network performance includes latency, bandwidth utilization, and the like.

To improve the stability, generalization, and robustness of the load prediction model, after the raw data information of each server (including at least the raw load information) is obtained, outlier detection may first be performed on each server's raw data information. When outliers are detected in the raw data information, they are repaired, and finally the repaired raw data information may be normalized or standardized to obtain the target data information.

The outlier detection method for each server's raw data information includes: reducing the dimensionality of the raw data information with the PCA algorithm, and then applying LOF to the reduced data, i.e., computing the local outlier factor score of each reduced data point and determining outliers from the scores. Combining the two methods, PCA and LOF, detects outliers more accurately and improves robustness.
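
A minimal sketch of this two-stage detection, using scikit-learn's PCA and LocalOutlierFactor on synthetic server metrics; the data, component count, and neighbour count are illustrative choices, not values specified by the application:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
# Synthetic per-server metrics: 200 normal readings plus 5 injected anomalies.
normal = rng.normal(loc=[50.0, 30.0, 10.0, 5.0], scale=1.0, size=(200, 4))
anomalies = rng.normal(loc=[95.0, 90.0, 60.0, 40.0], scale=1.0, size=(5, 4))
raw = np.vstack([normal, anomalies])

# Step 1: PCA reduces dimensionality before neighbour-based scoring.
reduced = PCA(n_components=2).fit_transform(raw)

# Step 2: LOF scores each point by its local density; -1 marks an outlier.
labels = LocalOutlierFactor(n_neighbors=20).fit_predict(reduced)
outlier_idx = np.flatnonzero(labels == -1)
```

The injected anomalies (indices 200-204) receive high local outlier factor scores and are flagged.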

The outlier repair method includes: repairing the outliers with a generative adversarial network to obtain repaired first data, repairing the outliers with a support vector regression algorithm to obtain repaired second data, and computing a weighted combination of the first data and the second data to obtain the final repaired result.

Specifically, a GAN model may first be trained with the raw data information as input, the goal being to generate synthetic data similar to the raw data. After training, the generator is used to produce repaired outlier data, while support vector regression is used to produce repaired outlier data in parallel. Finally, the repaired data obtained by the two methods are weighted and summed, and the final repaired result is generated from the weighted result.
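
The weighted combination of the two repaired values can be sketched as follows. Training a full GAN is beyond a short example, so a local interpolation stands in for the generator's reconstruction here; that stand-in and the 0.5 weight are illustrative assumptions, not the application's method:

```python
import numpy as np
from sklearn.svm import SVR

# A load series with one detected outlier at index 10 (true shape is a sine curve).
t = np.arange(30, dtype=float)
load = np.sin(t / 5.0) * 10 + 50
load[10] = 200.0  # corrupted reading

outlier = 10
mask = np.ones_like(t, dtype=bool)
mask[outlier] = False

# Repair 1: support vector regression fitted on the clean points.
svr = SVR(kernel="rbf", C=100.0, gamma=0.1)
svr.fit(t[mask].reshape(-1, 1), load[mask])
svr_value = svr.predict([[float(outlier)]])[0]

# Repair 2: stand-in for the GAN generator's reconstruction
# (here faked with local interpolation of the neighbouring readings).
gan_value = (load[outlier - 1] + load[outlier + 1]) / 2.0

# Weighted sum of the two repaired values (the weight is a design choice).
w_gan = 0.5
repaired = w_gan * gan_value + (1 - w_gan) * svr_value
```

Both estimates land near the underlying curve, so their weighted sum replaces the corrupted reading with a plausible value.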

By detecting and repairing outliers and normalizing or standardizing the data, the embodiments of the present application allow the model to better adapt to different data distributions and variations, thereby improving the model's performance and robustness.

After the raw data information is preprocessed to obtain the target data information, the length of each data item can be determined from the number of servers to be predicted, and the data size adjusted accordingly. Each input data item here refers to all the information at one moment, including traffic, load, network performance, and so on, and each data item is a graph structure. The tensor sizes of the inputs and outputs of the encoder and decoder in the load prediction model must be kept consistent. The data can then be divided into a training set and a test set. Because the embodiments of this application deal with time-series data, the split differs slightly from that of general data sets: one part of the data serves as historical data, and the part of the data in the future of that historical data serves as target data. The training set is the data from which the load prediction model learns, and the test set is the data used to evaluate the model's performance after training. The model's performance metrics include precision, recall, and mean squared error. Precision is the proportion of correctly predicted samples; recall is the proportion of actual positives that are correctly predicted as positive; mean squared error is the mean of the squared differences between predicted and true values.

In addition, to enrich the model input, positional encodings are computed before the training set enters the load prediction model for training. The positional encoding treats each time step (each moment) as an absolute position and uses trigonometric functions (such as sine and cosine) to generate periodic encodings at different frequencies. Because using only a local timestamp as the positional encoding may cause mismatches between the queries and keys of the encoder and decoder, a global timestamp is also added as a time encoding. This encoding can account for date factors such as day of week, month, and holidays, improving the accuracy of server load prediction.
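
The sinusoidal position encoding and a global time encoding described above can be sketched as follows. The dimensions, normalisation, and the specific calendar fields chosen are illustrative assumptions:

```python
import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal encoding: each time step is an absolute position, and
    sine/cosine pairs at different frequencies encode it periodically."""
    pos = np.arange(seq_len)[:, None]             # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]          # (1, d_model/2)
    angles = pos / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                  # even dimensions: sine
    pe[:, 1::2] = np.cos(angles)                  # odd dimensions: cosine
    return pe

pe = positional_encoding(seq_len=12, d_model=8)

def global_time_features(weekday: int, month: int, is_holiday: bool) -> np.ndarray:
    """Global timestamp features (day of week, month, holiday flag),
    normalised to [-0.5, 0.5]; the exact fields are an illustrative choice."""
    return np.array([weekday / 6.0 - 0.5,
                     (month - 1) / 11.0 - 0.5,
                     float(is_holiday) - 0.5])
```

The positional encoding gives each moment a unique periodic signature, while the global time features let the model distinguish, for example, a Monday from a holiday.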

S120: Input each graph structure of each training sample into the first graph convolutional neural network to extract the first road-network spatial features of each graph structure in the training sample, and input each graph structure of each ground-truth sample into the second graph convolutional neural network to extract the second road-network spatial features of each graph structure in the ground-truth sample.

To enable the load prediction model to capture spatial dependencies, the embodiments of the present application add graph convolutional neural networks to the encoder and decoder and use them to perform convolution operations on the graph structures. A convolutional layer filters the input data and extracts features through a set of learnable filters. Its core idea is weight sharing: all filters compute the same convolution over the input data, reducing the number of parameters to train. The convolution operation can be viewed as a special linear transformation that turns local relationships in the input data into global information and is translation invariant, i.e., the result of feature extraction does not depend on the position of the input data. This makes convolutional layers well suited to processing data that are adjacent in space and time, capturing their inherent patterns and structure and improving the robustness of the model.
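
A minimal sketch of one graph-convolution step in the common normalised form H' = ReLU(D^-1/2 (A+I) D^-1/2 · H · W), where the shared weight matrix W plays the role of the learnable filters. The ring topology, sizes, and random values are illustrative assumptions:

```python
import numpy as np

def gcn_layer(adj: np.ndarray, features: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """One graph-convolution step: symmetrically normalised adjacency with
    self-loops, shared weight matrix, ReLU activation."""
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt     # symmetric normalisation
    return np.maximum(a_norm @ features @ weight, 0.0)  # ReLU

# 4 servers in a ring topology, 3 input features, 2 output features.
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(1)
x = rng.normal(size=(4, 3))   # per-node target data information
w = rng.normal(size=(3, 2))   # shared learnable filter weights
h = gcn_layer(adj, x, w)      # spatial features per node
```

Each node's output mixes its own features with those of its topological neighbours, which is how the spatial dependence between servers is extracted.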

S130: Pass each first road-network spatial feature through the convolution computation of the first feedforward neural network to obtain the convolved first road-network spatial features, and pass each second road-network spatial feature through the convolution computation of the third feedforward neural network to obtain the convolved second road-network spatial features.

A feedforward neural network is an artificial neural network whose structure consists of multiple layers of nodes that pass information in a fixed direction. Feeding the first and second road-network spatial features through the feedforward neural networks refines both sets of features.

S140: Add the corresponding time information to the convolved first road-network spatial features at each moment, and input the time-augmented first road-network spatial features of each training sample into the first multi-head attention mechanism network to obtain the first road-network spatio-temporal features of each training sample.

The time information corresponding to the convolved first road-network spatial features at a given moment is the moment to which those features belong. After the corresponding time information is added to the convolved first road-network spatial features of each moment in a training sample, all the first road-network spatial features of that training sample can be input into the first multi-head attention mechanism network, ensuring the model accounts for the temporal dynamics of the road-network features, to obtain the first road-network spatio-temporal features of each training sample.

S150: Pass the first road-network spatio-temporal features of each training sample through the convolution computation of the second feedforward neural network to obtain the convolved first road-network spatio-temporal features of each training sample.

To further refine the first road-network spatio-temporal features, the first road-network spatio-temporal features of each training sample can be passed through the convolution computation of the second feedforward neural network, yielding the convolved first road-network spatio-temporal features of each training sample.

S160: Input the convolved first road-network spatio-temporal features, together with the time-augmented second road-network spatial features of the corresponding ground-truth sample, into the second multi-head attention mechanism network to obtain the third road-network spatio-temporal features of the multiple future moments corresponding to each training sample, as well as the second road-network spatio-temporal features of each ground-truth sample.

After the encoder outputs the convolved first road-network spatio-temporal features, they can be fed into the second multi-head attention mechanism network of the decoder for decoding, obtaining the third road-network spatio-temporal features of the multiple future moments corresponding to each training sample. At the same time, after the corresponding time information is added to the convolved second road-network spatial features of each moment, the time-augmented second road-network spatial features of each ground-truth sample are input into the second multi-head attention mechanism network to obtain the second road-network spatio-temporal features of each ground-truth sample.

S170: Input the third road-network spatio-temporal features of the multiple future moments corresponding to each training sample, as well as the second road-network spatio-temporal features of each ground-truth sample, into the fourth feedforward neural network to obtain the convolved third road-network spatio-temporal features and the convolved second road-network spatio-temporal features.

To further refine the third and second road-network spatio-temporal features, they can each be input into the fourth feedforward neural network, yielding the convolved third road-network spatio-temporal features and the convolved second road-network spatio-temporal features.

S180: Compute the current loss value from the convolved third road-network spatio-temporal features and the convolved second road-network spatio-temporal features. When the current loss value does not satisfy the convergence condition, adjust the model parameters of the load prediction model and continue training the adjusted model until the current loss value satisfies the convergence condition, at which point the final load prediction model is obtained.

After the convolved third and second road-network spatio-temporal features are obtained, the current loss value can be computed from the difference between them. For example, for each pair of training sample and ground-truth sample, the difference between the convolved third road-network spatio-temporal features and the corresponding convolved second road-network spatio-temporal features is computed, and the mean of these differences is taken as the current loss value.
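
Under a mean-of-squared-differences reading of this paragraph, the current loss can be sketched as follows; the feature values are illustrative:

```python
import numpy as np

def current_loss(pred_feats: np.ndarray, true_feats: np.ndarray) -> float:
    """Mean squared difference between the convolved third road-network
    spatio-temporal features (predictions) and the convolved second
    road-network spatio-temporal features (ground truth)."""
    return float(np.mean((pred_feats - true_feats) ** 2))

pred = np.array([[1.0, 2.0], [3.0, 4.0]])
true = np.array([[1.5, 2.0], [3.0, 3.0]])
loss = current_loss(pred, true)  # (0.25 + 0 + 0 + 1) / 4 = 0.3125
```

In practice the loss would be averaged over all pairs of training and ground-truth samples in a batch.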

When the current loss value is greater than or equal to the loss threshold, the current loss value is determined not to satisfy the convergence condition. The model parameters of the load prediction model can then be adjusted, and steps S110-S170 are repeated with the adjusted model until the current loss value falls below the loss threshold, at which point the convergence condition is satisfied and the currently trained load prediction model is taken as the final load prediction model.

In one implementation, the loss function for model training is a cross-entropy loss function. The cross-entropy loss function measures the probability distribution of the target data information for each group of future moments and compares the predicted probabilities with the real target data information. Meanwhile, to avoid overfitting, a regularization term such as L2 regularization is usually added.

After a converged load prediction model is obtained, the test data set is fed into it, the model's output is visualized or analyzed, and its performance metrics are computed and evaluated.

It should be added that the embodiments of the present application may use residual connections between the networks of the encoder and decoder, followed by layer normalization, which alleviates the vanishing-gradient problem caused by increasing the depth of a deep neural network. A Dropout operation is also used, i.e., the outputs of some neurons in the network are discarded, reducing overfitting.
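
The residual connection, layer normalization, and (inverted) dropout around a sub-layer can be sketched as follows; the shapes, dropout rate, and the identity of the sub-layer are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def layer_norm(x: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Normalise each feature vector to zero mean and unit variance."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def dropout(x: np.ndarray, rate: float, training: bool = True) -> np.ndarray:
    """Inverted dropout: drop activations at random, rescale the rest so the
    expected output matches the no-dropout case."""
    if not training:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

def residual_block(x: np.ndarray, sublayer, rate: float = 0.1) -> np.ndarray:
    """Residual connection around a sub-layer, then layer normalization."""
    return layer_norm(x + dropout(sublayer(x), rate))

x = rng.normal(size=(5, 8))
out = residual_block(x, sublayer=lambda h: h @ rng.normal(size=(8, 8)))
```

The same wrapper would be applied around each graph-convolution, attention, and feedforward sub-layer in the encoder and decoder.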

The decoder mirrors the encoder but has one additional multi-head attention mechanism for interacting with the encoder output. The multi-head attention layer treats each position in the input sequence as a query (Q), key (K), and value (V), computes the attention distribution between each position and all positions, and obtains a weighted sum representing the contextual information of that position. A fully connected layer then propagates this contextual information forward to produce the layer's output. Specifically, multi-head attention is computed as follows:

Attention(Q, K, V) = softmax(QK^T / √d_k) · V

where d_k is the vector dimension and Q, K, and V are all vectors. The formula shows that, for each query Q, multi-head attention computes a weighted sum of all values V according to the similarity between the query and all keys K.

The second multi-head attention mechanism network in the decoder is similar to the first multi-head attention mechanism network in the encoder, except that when computing the attention distribution it only considers the positions preceding the current position, thereby preventing the decoder from using future information.
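
A single-head sketch of this attention computation, with an optional causal mask of the kind used by the decoder so that each position only attends to itself and earlier positions. The dimensions and data are illustrative, and the multi-head splitting/concatenation is omitted:

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max(axis=-1, keepdims=True))  # numerically stable
    return e / e.sum(axis=-1, keepdims=True)

def attention(q: np.ndarray, k: np.ndarray, v: np.ndarray,
              causal: bool = False) -> np.ndarray:
    """Scaled dot-product attention softmax(QK^T / sqrt(d_k)) V.
    With causal=True, future positions are masked out before the softmax."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    if causal:
        scores = np.where(np.tril(np.ones_like(scores)) == 1, scores, -1e9)
    return softmax(scores) @ v

rng = np.random.default_rng(3)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(4, 8))
v = rng.normal(size=(4, 8))
enc_out = attention(q, k, v)               # encoder-style: attends everywhere
dec_out = attention(q, k, v, causal=True)  # decoder-style: no future positions
```

With the causal mask, the first position can only attend to itself, so its output equals its own value vector.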

In the load prediction model training method provided by the embodiments of the present application, the load prediction model includes an encoder and a decoder, each of which contains a graph convolutional neural network, a feedforward neural network, and a multi-head attention mechanism network. Each graph structure in each training sample passes in turn through the first graph convolutional neural network, the first feedforward neural network, the first multi-head attention mechanism network, and the second feedforward neural network of the encoder, finally outputting the convolved first road-network spatio-temporal features of each training sample. Each graph structure in each ground-truth sample is input in turn into the second graph convolutional neural network and the third feedforward neural network to obtain the convolved second road-network spatial features. The third road-network spatio-temporal features of the multiple future moments corresponding to each training sample, as well as the second road-network spatio-temporal features of each ground-truth sample, are input into the fourth feedforward neural network to obtain the convolved third road-network spatio-temporal features and the convolved second road-network spatio-temporal features. The current loss value is computed from the convolved third and second road-network spatio-temporal features; when it does not satisfy the convergence condition, the model parameters of the load prediction model are adjusted and training continues until the convergence condition is satisfied, yielding the final load prediction model for use in load prediction. It can thus be seen that the embodiments of the present application can analyze the load information of each server in a cloud computing platform at both the spatial and the temporal level and capture long-term dependencies, so that the trained load prediction model predicts server load information more accurately.

Based on the above method embodiments, another embodiment of the present application provides a server load prediction method. As shown in Figure 3, the method includes:

S310: Obtain the target data information of each server in the cloud computing platform at the most recent multiple historical moments.

The target data information includes at least load information and may also include traffic and/or network performance. Traffic includes incoming traffic, outgoing traffic, total traffic, and the like; load information may include CPU usage, memory usage, and the like; network performance includes latency, bandwidth utilization, and the like.

When the load information of each server at multiple future moments needs to be predicted, the target data information at the most recent multiple historical moments can first be obtained, for example the target data information at the moments contained in the last 10 minutes. The length of one moment can be determined according to actual needs; for example, a moment may be 1 minute or 5 minutes.

To improve the stability, generalization, and robustness of the load prediction model, after the raw data information of each server at the most recent multiple historical moments is obtained, outlier detection may first be performed on each server's raw data information. When outliers are detected in the raw data information, they are repaired, and finally the repaired raw data information may be normalized or standardized to obtain the target data information.

The outlier detection method for each server's raw data information includes: reducing the dimensionality of the raw data information with the PCA algorithm, and then applying LOF to the reduced data, i.e., computing the local outlier factor score of each reduced data point and determining outliers from the scores. Combining the two methods, PCA and LOF, detects outliers more accurately and improves robustness.

The outlier repair method includes: repairing the outliers with a generative adversarial network to obtain repaired first data, repairing the outliers with a support vector regression algorithm to obtain repaired second data, and computing a weighted combination of the first data and the second data to obtain the final repaired result.

Specifically, a GAN model may first be trained with the raw data information as input, the goal being to generate synthetic data similar to the raw data. After training, the generator is used to produce repaired outlier data, while support vector regression is used to produce repaired outlier data in parallel. Finally, the repaired data obtained by the two methods are weighted and summed, and the final repaired result is generated from the weighted result.

S320: For each historical moment, generate the graph structure of that moment based on the target data information of each server in the cloud computing platform.

After the target data information of each server in the cloud computing platform is obtained, the graph structure of each historical moment can be generated according to the topology of the servers, so that each node in the graph structure represents a server and the information it carries includes that server's target data information at that historical moment.
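
One way such a per-moment graph could be assembled is sketched below; the adjacency matrix, dictionary layout, and field names ("cpu", "traffic", "latency") are purely illustrative assumptions:

```python
import numpy as np

def build_graph(adjacency: np.ndarray, metrics: dict) -> dict:
    """Turn one moment's per-server metrics into a graph snapshot: the
    adjacency matrix mirrors the server topology, and each node carries that
    server's target data information at this moment."""
    servers = sorted(metrics)
    feats = np.array([[metrics[s]["cpu"], metrics[s]["traffic"], metrics[s]["latency"]]
                      for s in servers])
    return {"nodes": servers, "adj": adjacency, "features": feats}

adj = np.array([[0, 1], [1, 0]], dtype=float)  # two directly connected servers
snapshot = build_graph(adj, {
    "server-a": {"cpu": 0.62, "traffic": 120.0, "latency": 3.1},
    "server-b": {"cpu": 0.48, "traffic": 95.0, "latency": 2.7},
})
```

A sequence of such snapshots, one per recent historical moment, forms the model input for step S330.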

S330: Input the graph structures of the most recent multiple historical moments into the load prediction model to predict the load information of each server at multiple future moments.

其中,所述负载预测模型根据上述任一实施例提供的负载预测模型的训练方法训练而成。通过将最近多个历史时刻的图结构输入预先训练好的负载预测模型中,经过该负载预测模型中的编码器和解码器的处理,最终可以预测出未来多个时刻各个服务器的负载信息。The load prediction model is trained according to the training method of the load prediction model provided in any of the above embodiments. By inputting the graph structure of the recent multiple historical moments into the pre-trained load prediction model, after being processed by the encoder and decoder in the load prediction model, the load information of each server at multiple future moments can be finally predicted.

本申请实施例提供的服务器的负载预测方法,能够先获取云计算平台中各个服务器在最近多个历史时刻的目标数据信息,然后针对每个历史时刻,基于云计算平台中各个服务器的目标数据信息生成每个历史时刻的图结构,最后将最近多个历史时刻的图结构输入负载预测模型中,预测出未来多个时刻的各个服务器的负载信息。该负载预测模型包括编码器和解码器,并且编码器和解码器均包含了图卷积神经网络、前馈神经网络、多头注意力机制网络,本申请实施例先将目标数据信息转化成包括服务器拓扑结构的图结构,然后基于图卷积神经网络从空间维度提取路网空间特征,再基于多头注意力机制网络从时间维度分析各个路网空间特征的依赖关系,从中提取路网时空特征,并且每次特征的提取后,通过前馈神经网络将提取的特征进行精炼,可以加快负载预测模型的收敛及提高负载预测模型的质量。因此,本申请实施例训练的负载预测模型可以从空间和时间层面分析云计算平台中各个服务器的负载信息,能够捕捉到长期的依赖关系,从而可以使得训练得到的负载预测模型更准确地预测服务器的负载信息。The server load prediction method provided by the embodiment of the present application can first obtain the target data information of each server in the cloud computing platform at the most recent multiple historical moments, and then generate the graph structure of each historical moment based on the target data information of each server in the cloud computing platform for each historical moment, and finally input the graph structure of the most recent multiple historical moments into the load prediction model to predict the load information of each server at multiple moments in the future. The load prediction model includes an encoder and a decoder, and both the encoder and the decoder include a graph convolutional neural network, a feedforward neural network, and a multi-head attention mechanism network. The embodiment of the present application first converts the target data information into a graph structure including the server topology structure, and then extracts the road network spatial features from the spatial dimension based on the graph convolutional neural network, and then analyzes the dependency of each road network spatial feature from the time dimension based on the multi-head attention mechanism network, extracts the road network spatiotemporal features, and after each feature extraction, refines the extracted features through the feedforward neural network, which can accelerate the convergence of the load prediction model and improve the quality of the load prediction model. 
Therefore, the load prediction model trained by the embodiment of the present application can analyze the load information of each server in the cloud computing platform from the spatial and temporal levels, and can capture long-term dependencies, so that the trained load prediction model can more accurately predict the load information of the server.
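The spatial-feature-extraction step above can be illustrated with a single graph-convolution layer. The patent does not specify the GCN variant, so the common symmetrically normalized formulation H' = ReLU(D̂^(-1/2)(A+I)D̂^(-1/2) H W) is assumed here:

```python
import numpy as np

def gcn_layer(adj, features, weight):
    """One graph-convolution layer: aggregate each server's features
    with its neighbours' via the symmetrically normalized adjacency."""
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    norm = d_inv_sqrt @ a_hat @ d_inv_sqrt       # D^-1/2 (A + I) D^-1/2
    return np.maximum(norm @ features @ weight, 0.0)   # ReLU activation

rng = np.random.default_rng(0)
adj = np.array([[0., 1., 1.], [1., 0., 0.], [1., 0., 0.]])
h = gcn_layer(adj, rng.normal(size=(3, 2)), rng.normal(size=(2, 4)))
```

In the described model, layers like this extract per-moment spatial features before the multi-head attention stage models dependencies across moments.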

基于上述方法实施例，本申请的另一实施例提供了一种负载预测模型的训练装置，所述负载预测模型包括编码器和解码器，所述编码器包括第一图卷积神经网络、第一前馈神经网络、第一多头注意力机制网络、第二前馈神经网络，所述解码器包括第二图卷积神经网络、第三前馈神经网络、第二多头注意力机制网络、第四前馈神经网络，如图4所示，所述装置包括：Based on the above method embodiments, another embodiment of the present application provides a training apparatus for a load prediction model. The load prediction model includes an encoder and a decoder. The encoder includes a first graph convolutional neural network, a first feedforward neural network, a first multi-head attention mechanism network, and a second feedforward neural network. The decoder includes a second graph convolutional neural network, a third feedforward neural network, a second multi-head attention mechanism network, and a fourth feedforward neural network. As shown in Figure 4, the apparatus includes:

获取单元410,用于获取训练集,其中,所述训练集包括多个训练样本及每个所述训练样本对应的真值样本,每个所述训练样本包括连续多个历史时刻的图结构,每个所述训练样本对应的真值样本包括所述连续多个历史时刻相邻的未来多个时刻的图结构,每个时刻的图结构包括该时刻下多个服务器中,每个所述服务器的目标数据信息,所述目标数据信息至少包括负载信息;The acquisition unit 410 is used to acquire a training set, wherein the training set includes multiple training samples and a true value sample corresponding to each of the training samples, each of the training samples includes a graph structure of multiple consecutive historical moments, the true value sample corresponding to each of the training samples includes a graph structure of multiple future moments adjacent to the multiple consecutive historical moments, and the graph structure at each moment includes target data information of each of the multiple servers at that moment, and the target data information at least includes load information;

第一特征提取单元420,用于将每个所述训练样本中每个所述图结构分别输入所述第一图卷积神经网络中,提取每个所述训练样本中每个所述图结构的第一路网空间特征;A first feature extraction unit 420, configured to input each of the graph structures in each of the training samples into the first graph convolutional neural network, respectively, and extract a first road network spatial feature of each of the graph structures in each of the training samples;

第二特征提取单元430,用于将每个所述真值样本中每个所述图结构分别输入所述第二图卷积神经网络中,提取每个所述真值样本中每个所述图结构的第二路网空间特征;A second feature extraction unit 430, configured to input each of the graph structures in each of the true value samples into the second graph convolutional neural network, respectively, and extract a second road network spatial feature of each of the graph structures in each of the true value samples;

第一卷积计算单元440,用于将每个所述第一路网空间特征经过所述第一前馈神经网络的卷积计算,得到卷积计算后的第一路网空间特征;The first convolution calculation unit 440 is used to perform convolution calculation on each of the first road network spatial features of the first feedforward neural network to obtain the first road network spatial features after convolution calculation;

第二卷积计算单元450,用于将每个所述第二路网空间特征经过所述第三前馈神经网络的卷积计算,得到卷积计算后的第二路网空间特征;The second convolution calculation unit 450 is used to perform the convolution calculation of each of the second road network spatial features through the third feedforward neural network to obtain the second road network spatial features after convolution calculation;

增加单元460,用于对每个时刻的卷积计算后的第一路网空间特征增加对应的时间信息;The adding unit 460 is used to add corresponding time information to the first road network spatial feature after the convolution calculation at each moment;

第一注意力处理单元470，用于将增加时间信息后的每个所述训练样本的第一路网空间特征输入所述第一多头注意力机制网络中，获得每个所述训练样本的第一路网时空特征；The first attention processing unit 470 is configured to input the first road network spatial features of each training sample, after time information has been added, into the first multi-head attention mechanism network to obtain the first road network spatiotemporal features of each training sample;

第三卷积计算单元480，用于将每个训练样本的第一路网时空特征经过所述第二前馈神经网络的卷积计算，得到每个所述训练样本卷积计算后的第一路网时空特征；The third convolution calculation unit 480 is configured to perform the convolution calculation of the second feedforward neural network on the first road network spatiotemporal features of each training sample to obtain the convolution-calculated first road network spatiotemporal features of each training sample;

第二注意力处理单元490，用于将卷积计算后的第一路网时空特征，以及增加时间信息后的对应真值样本中的多个第二路网空间特征，输入所述第二多头注意力机制网络中，获得每个所述训练样本对应的未来多个时刻的第三路网时空特征，以及每个所述真值样本的第二路网时空特征；The second attention processing unit 490 is configured to input the convolution-calculated first road network spatiotemporal features, together with the multiple second road network spatial features in the corresponding true value samples after time information has been added, into the second multi-head attention mechanism network to obtain the third road network spatiotemporal features at multiple future moments corresponding to each training sample and the second road network spatiotemporal features of each true value sample;

第四卷积计算单元4100，用于将每个所述训练样本对应的未来多个时刻的第三路网时空特征，以及每个所述真值样本的第二路网时空特征输入所述第四前馈神经网络，获得卷积计算后的第三路网时空特征和卷积计算后的第二路网时空特征；The fourth convolution calculation unit 4100 is configured to input the third road network spatiotemporal features at multiple future moments corresponding to each training sample and the second road network spatiotemporal features of each true value sample into the fourth feedforward neural network to obtain the convolution-calculated third road network spatiotemporal features and the convolution-calculated second road network spatiotemporal features;

调整训练单元4110，用于根据卷积计算后的第三路网时空特征和卷积计算后的第二路网时空特征，计算当前损失值，在所述当前损失值不满足收敛条件时，对所述负载预测模型的模型参数进行调整，并继续对参数调整后的所述负载预测模型进行训练，直至所述当前损失值满足所述收敛条件时，获得最终的负载预测模型。The adjustment training unit 4110 is configured to calculate the current loss value based on the convolution-calculated third road network spatiotemporal features and the convolution-calculated second road network spatiotemporal features; when the current loss value does not meet the convergence condition, the model parameters of the load prediction model are adjusted and training of the parameter-adjusted load prediction model continues, until the current loss value meets the convergence condition, whereupon the final load prediction model is obtained.
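The adjust-and-continue logic of this unit can be sketched as a generic training loop. Gradient descent on a linear stand-in model is used purely for illustration; the loss function (MSE), learning rate, and convergence threshold are assumptions, not the patent's choices.

```python
import numpy as np

def train_until_converged(x, y, lr=0.1, tol=1e-4, max_steps=10_000):
    """Stand-in for the adjustment-training unit: compute the current loss,
    and while it does not meet the convergence condition, adjust the model
    parameters and continue training."""
    w = np.zeros(x.shape[1])
    for step in range(max_steps):
        pred = x @ w
        loss = np.mean((pred - y) ** 2)          # current loss value
        if loss < tol:                           # convergence condition met
            return w, loss, step
        grad = 2 * x.T @ (pred - y) / len(y)     # adjust model parameters
        w -= lr * grad
    return w, loss, max_steps

x = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
w, loss, steps = train_until_converged(x, x @ np.array([2.0, -1.0]))
```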

在一种可能的实施方式中,获取单元410包括:In a possible implementation, the acquiring unit 410 includes:

异常值检测模块,用于对每个所述服务器的原始数据信息进行异常值检测;An outlier detection module, used for performing outlier detection on the original data information of each server;

修复模块,用于当检测到所述原始数据信息中存在异常值时,对所述异常值进行修复;A repair module, used for repairing the abnormal value when an abnormal value is detected in the original data information;

处理模块,用于对异常值修复后的所述原始数据信息进行归一化或标准化操作,获得所述目标数据信息。The processing module is used to perform normalization or standardization operations on the original data information after the outliers are repaired to obtain the target data information.
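The two operations this module chooses between can be sketched as follows (either scaling is consistent with the module; the sample values are illustrative):

```python
import numpy as np

def min_max_normalize(x):
    """Normalization: rescale the values into [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def z_score_standardize(x):
    """Standardization: shift and scale to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

loads = [10.0, 20.0, 30.0, 40.0]
```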

在一种可能的实施方式中，异常值检测模块，用于基于主成分分析PCA算法对所述原始数据信息进行降维处理，并利用局部离群因子LOF对降维后的所述原始数据信息进行异常值检测。In a possible implementation, the outlier detection module is configured to perform dimensionality reduction on the original data information based on the principal component analysis (PCA) algorithm, and to perform outlier detection on the dimensionality-reduced original data information using the local outlier factor (LOF).
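A minimal sketch of this PCA-then-LOF pipeline using scikit-learn (the component count, neighbour count, and synthetic data are assumptions, not values taken from the patent):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import LocalOutlierFactor

def detect_outliers(raw, n_components=2, n_neighbors=20):
    """Reduce dimensionality with PCA, then flag outliers with LOF.
    Returns a boolean mask: True where a sample is an outlier."""
    reduced = PCA(n_components=n_components).fit_transform(raw)
    labels = LocalOutlierFactor(n_neighbors=n_neighbors).fit_predict(reduced)
    return labels == -1   # LOF marks outliers as -1

rng = np.random.default_rng(0)
raw = rng.normal(size=(60, 5))      # 60 samples of 5 server metrics
raw[7] += 25.0                      # inject one obvious anomaly
mask = detect_outliers(raw)
```

Running detection in the reduced space keeps LOF's neighbourhood distances meaningful when the raw metric vector is high-dimensional.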

在一种可能的实施方式中，修复模块，用于利用生成对抗网络对所述异常值进行修复，获得修复后的第一数据，以及利用支持向量回归算法对所述异常值进行修复，获得修复后的第二数据；对所述第一数据和所述第二数据进行加权计算，获得对所述异常值修复后的最终结果。In a possible implementation, the repair module is configured to repair the outliers using a generative adversarial network to obtain repaired first data, and to repair the outliers using a support vector regression algorithm to obtain repaired second data; and to perform a weighted calculation on the first data and the second data to obtain the final result after the outliers are repaired.

在一种可能的实施方式中,所述目标数据信息还包括:流量和/或网络性能。In a possible implementation, the target data information also includes: traffic and/or network performance.

本申请实施例提供的负载预测模型的训练装置，该负载预测模型包括编码器和解码器，并且编码器和解码器均包含了图卷积神经网络、前馈神经网络、多头注意力机制网络，每个训练样本中每个图结构依次经过编码器中的第一图卷积神经网络、第一前馈神经网络、第一多头注意力机制网络和第二前馈神经网络，最终输出每个训练样本卷积计算后的第一路网时空特征，将每个真值样本中每个图结构依次输入第二图卷积神经网络、第三前馈神经网络，获得卷积计算后的第二路网空间特征，将每个训练样本对应的未来多个时刻的第三路网时空特征，以及将每个真值样本的第二路网时空特征输入第四前馈神经网络，获得卷积计算后的第三路网时空特征和卷积计算后的第二路网时空特征，根据卷积计算后的第三路网时空特征和卷积计算后的第二路网时空特征，计算当前损失值，在当前损失值不满足收敛条件时，对负载预测模型的模型参数进行调整，并继续对参数调整后的负载预测模型进行训练，直至当前损失值满足收敛条件时，获得最终的负载预测模型，以便基于最终的负载预测模型进行负载预测。由此可知，本申请实施例可以从空间和时间层面分析云计算平台中各个服务器的负载信息，能够捕捉到长期的依赖关系，从而可以使得训练得到的负载预测模型更准确地预测服务器的负载信息。In the training apparatus for a load prediction model provided by the embodiment of the present application, the load prediction model includes an encoder and a decoder, and both the encoder and the decoder include a graph convolutional neural network, a feedforward neural network, and a multi-head attention mechanism network. Each graph structure in each training sample passes in turn through the first graph convolutional neural network, the first feedforward neural network, the first multi-head attention mechanism network, and the second feedforward neural network in the encoder, finally outputting the convolution-calculated first road network spatiotemporal features of each training sample. Each graph structure in each true value sample is input in turn into the second graph convolutional neural network and the third feedforward neural network to obtain the convolution-calculated second road network spatial features. The third road network spatiotemporal features at multiple future moments corresponding to each training sample, together with the second road network spatiotemporal features of each true value sample, are input into the fourth feedforward neural network to obtain the convolution-calculated third road network spatiotemporal features and the convolution-calculated second road network spatiotemporal features. The current loss value is calculated based on these convolution-calculated features; when the current loss value does not meet the convergence condition, the model parameters of the load prediction model are adjusted and training continues on the parameter-adjusted load prediction model, until the current loss value meets the convergence condition and the final load prediction model is obtained, so that load prediction can be performed based on it. It can be seen that the embodiments of the present application can analyze the load information of each server in the cloud computing platform from the spatial and temporal levels and can capture long-term dependencies, so that the trained load prediction model can predict server load information more accurately.

基于上述方法实施例,本申请的另一实施例提供了一种服务器的负载预测装置,如图5所示,所述装置包括:Based on the above method embodiment, another embodiment of the present application provides a server load prediction device, as shown in Figure 5, the device includes:

获取单元510,用于获取云计算平台中各个服务器在最近多个历史时刻的目标数据信息,所述目标数据信息至少包括负载信息;The acquisition unit 510 is used to acquire the target data information of each server in the cloud computing platform at multiple recent historical moments, where the target data information at least includes load information;

生成单元520,用于针对每个历史时刻,基于所述云计算平台中各个所述服务器的所述目标数据信息生成所述每个历史时刻的图结构;A generating unit 520, configured to generate, for each historical moment, a graph structure of each historical moment based on the target data information of each server in the cloud computing platform;

预测单元530，用于将最近多个历史时刻的图结构输入负载预测模型中，预测出未来多个时刻的所述各个服务器的负载信息，其中，所述负载预测模型根据上述任一实施例提供的负载预测模型的训练方法训练而成。The prediction unit 530 is configured to input the graph structures of the most recent multiple historical moments into a load prediction model to predict the load information of each server at multiple future moments, wherein the load prediction model is trained according to the training method of the load prediction model provided by any of the above embodiments.

本申请实施例提供的服务器的负载预测装置，能够先获取云计算平台中各个服务器在最近多个历史时刻的目标数据信息，然后针对每个历史时刻，基于云计算平台中各个服务器的目标数据信息生成每个历史时刻的图结构，最后将最近多个历史时刻的图结构输入负载预测模型中，预测出未来多个时刻的各个服务器的负载信息。该负载预测模型包括编码器和解码器，并且编码器和解码器均包含了图卷积神经网络、前馈神经网络、多头注意力机制网络，本申请实施例先将目标数据信息转化成包括服务器拓扑结构的图结构，然后基于图卷积神经网络从空间维度提取路网空间特征，再基于多头注意力机制网络从时间维度分析各个路网空间特征的依赖关系，从中提取路网时空特征，并且每次特征的提取后，通过前馈神经网络将提取的特征进行精炼，可以加快负载预测模型的收敛及提高负载预测模型的质量。因此，本申请实施例训练的负载预测模型可以从空间和时间层面分析云计算平台中各个服务器的负载信息，能够捕捉到长期的依赖关系，从而可以使得训练得到的负载预测模型更准确地预测服务器的负载信息。The server load prediction device provided by the embodiment of the present application can first obtain the target data information of each server in the cloud computing platform at the most recent multiple historical moments, then, for each historical moment, generate the graph structure of that historical moment based on the target data information of each server in the cloud computing platform, and finally input the graph structures of the most recent multiple historical moments into the load prediction model to predict the load information of each server at multiple future moments. The load prediction model includes an encoder and a decoder, and both the encoder and the decoder include a graph convolutional neural network, a feedforward neural network, and a multi-head attention mechanism network. The embodiment of the present application first converts the target data information into a graph structure that includes the server topology, then extracts road network spatial features from the spatial dimension based on the graph convolutional neural network, then analyzes the dependencies among the road network spatial features from the temporal dimension based on the multi-head attention mechanism network to extract road network spatiotemporal features, and after each feature extraction refines the extracted features through a feedforward neural network, which can accelerate the convergence of the load prediction model and improve its quality. Therefore, the load prediction model trained in the embodiments of this application can analyze the load information of each server in the cloud computing platform from the spatial and temporal levels and can capture long-term dependencies, so that the trained load prediction model can predict server load information more accurately.

基于上述方法实施例,本申请的另一实施例提供了一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时实现如上述任一实施方式所述的方法。Based on the above method embodiments, another embodiment of the present application provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the method described in any of the above embodiments is implemented.

基于上述方法实施例,本申请的另一实施例提供了一种电子设备或计算机设备,包括:Based on the above method embodiment, another embodiment of the present application provides an electronic device or computer device, including:

一个或多个处理器;one or more processors;

所述处理器与存储装置耦合,所述存储装置用于存储一个或多个程序;The processor is coupled to a storage device for storing one or more programs;

其中,当所述一个或多个程序被所述一个或多个处理器执行时,使得电子设备或计算机设备实现如上任一实施方式所述的方法。Wherein, when the one or more programs are executed by the one or more processors, the electronic device or computer device is caused to implement the method described in any of the above embodiments.

基于上述实施例，本申请的另一实施例提供了一种计算机程序产品，所述计算机程序产品中包含有指令，当指令在计算机或处理器上运行时，使得计算机或处理器执行如上任一实施方式所述的方法。Based on the above embodiments, another embodiment of the present application provides a computer program product. The computer program product contains instructions that, when run on a computer or processor, cause the computer or processor to perform the method described in any of the above embodiments.

上述装置实施例与方法实施例相对应，与该方法实施例具有同样的技术效果，具体说明参见方法实施例。装置实施例是基于方法实施例得到的，具体的说明可以参见方法实施例部分，此处不再赘述。本领域普通技术人员可以理解：附图只是一个实施例的示意图，附图中的模块或流程并不一定是实施本申请所必须的。The above device embodiments correspond to the method embodiments and have the same technical effects; for details, refer to the method embodiments. The device embodiments are derived from the method embodiments, so the specific description can be found in the method embodiment section and is not repeated here. Those of ordinary skill in the art can understand that the accompanying drawing is only a schematic diagram of one embodiment, and the modules or processes in the drawing are not necessarily required for implementing the present application.

本领域普通技术人员可以理解:实施例中的装置中的模块可以按照实施例描述分布于实施例的装置中,也可以进行相应变化位于不同于本实施例的一个或多个装置中。上述实施例的模块可以合并为一个模块,也可以进一步拆分成多个子模块。Those of ordinary skill in the art can understand that the modules in the device in the embodiment may be distributed in the device in the embodiment according to the description of the embodiment, or may be correspondingly changed and located in one or more devices different from this embodiment. The modules of the above embodiments can be combined into one module, or further divided into multiple sub-modules.

最后应说明的是:以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请实施例技术方案的精神和范围。Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, rather than to limit it. Although the present application has been described in detail with reference to the aforementioned embodiments, those skilled in the art should understand that they can still modify the technical solutions described in the aforementioned embodiments, or make equivalent replacements for some of the technical features therein. However, these modifications or replacements do not deviate the essence of the corresponding technical solutions from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A method of training a load prediction model, the load prediction model comprising an encoder and a decoder, the encoder comprising a first graph convolutional neural network, a first feed-forward neural network, a first multi-head attention mechanism network, and a second feed-forward neural network, the decoder comprising a second graph convolutional neural network, a third feed-forward neural network, a second multi-head attention mechanism network, and a fourth feed-forward neural network, the method comprising:
Acquiring a training set, wherein the training set comprises a plurality of training samples and true value samples corresponding to each training sample, each training sample comprises a graph structure of a plurality of continuous historical moments, each true value sample corresponding to each training sample comprises a graph structure of a plurality of future moments adjacent to the plurality of continuous historical moments, each graph structure of each moment comprises target data information of each server in a plurality of servers at the moment, and the target data information at least comprises load information;
respectively inputting each graph structure in each training sample into the first graph convolution neural network, extracting a first road network space characteristic of each graph structure in each training sample, respectively inputting each graph structure in each truth sample into the second graph convolution neural network, and extracting a second road network space characteristic of each graph structure in each truth sample;
the convolution calculation of each first path network space feature through the first feedforward neural network is carried out to obtain a first path network space feature after the convolution calculation, and the convolution calculation of each second path network space feature through the third feedforward neural network is carried out to obtain a second path network space feature after the convolution calculation;
Adding corresponding time information to the first road network space characteristics calculated by convolution at each moment, and inputting the first road network space characteristics of each training sample after adding the time information into the first multi-head attention mechanism network to obtain first road network space-time characteristics of each training sample;
the first path network space-time characteristics of each training sample are subjected to convolution calculation of the second feedforward neural network, so that the first path network space-time characteristics of each training sample after convolution calculation are obtained;
inputting the first road network space-time characteristics after convolution calculation and a plurality of second road network space-time characteristics in corresponding truth value samples after time information addition into the second multi-head attention mechanism network to obtain third road network space-time characteristics of a plurality of future moments corresponding to each training sample and second road network space-time characteristics of each truth value sample;
inputting the third path network space-time characteristics of a plurality of future moments corresponding to each training sample and the second path network space-time characteristics of each truth sample into the fourth feedforward neural network to obtain the third path network space-time characteristics after convolution calculation and the second path network space-time characteristics after convolution calculation;
According to the third path network space-time characteristic after convolution calculation and the second path network space-time characteristic after convolution calculation, calculating a current loss value, adjusting model parameters of the load prediction model when the current loss value does not meet a convergence condition, and continuously training the load prediction model after parameter adjustment until the current loss value meets the convergence condition, and obtaining a final load prediction model.
2. The method according to claim 1, wherein the method for acquiring the target data information of each of the servers comprises:
detecting abnormal values of the original data information of each server;
when detecting that an abnormal value exists in the original data information, repairing the abnormal value;
and carrying out normalization or standardization operation on the original data information after the abnormal value is repaired to obtain the target data information.
3. The method of claim 2, wherein said anomaly value detection of raw data information for each of said servers comprises:
and performing dimension reduction processing on the original data information based on a Principal Component Analysis (PCA) algorithm, and detecting an outlier of the dimension reduced original data information by utilizing a Local Outlier Factor (LOF).
4. The method of claim 2, wherein the repairing the outlier comprises:
repairing the abnormal value by using a generated countermeasure network to obtain repaired first data, and repairing the abnormal value by using a support vector regression algorithm to obtain repaired second data;
and carrying out weighted calculation on the first data and the second data to obtain a final result after repairing the abnormal value.
5. The method of any of claims 1-4, wherein the target data information further comprises: traffic and/or network performance.
6. A method for predicting load of a server, the method comprising:
acquiring target data information of each server in the cloud computing platform at a plurality of latest historical moments, wherein the target data information at least comprises load information;
generating, for each historical moment, a graph structure of the each historical moment based on the target data information of the respective server in the cloud computing platform;
the graph structure of the latest historical moments is input into a load prediction model, and the load information of each server of the future moments is predicted, wherein the load prediction model is trained according to the method of any one of claims 1-5.
7. A training apparatus for a load prediction model, the load prediction model comprising an encoder and a decoder, the encoder comprising a first graph convolutional neural network, a first feed-forward neural network, a first multi-head attention mechanism network, and a second feed-forward neural network, the decoder comprising a second graph convolutional neural network, a third feed-forward neural network, a second multi-head attention mechanism network, and a fourth feed-forward neural network, the apparatus comprising:
the training set comprises a plurality of training samples and true value samples corresponding to each training sample, each training sample comprises a graph structure of a plurality of continuous historical moments, each true value sample corresponding to each training sample comprises a graph structure of a plurality of future moments adjacent to the plurality of continuous historical moments, each graph structure of each moment comprises target data information of each server in a plurality of servers at the moment, and the target data information at least comprises load information;
the first feature extraction unit is used for respectively inputting each graph structure in each training sample into the first graph convolution neural network and extracting the first road network space feature of each graph structure in each training sample;
The second feature extraction unit is used for respectively inputting each graph structure in each truth value sample into the second graph convolution neural network and extracting a second road network space feature of each graph structure in each truth value sample;
the first convolution calculation unit is used for carrying out convolution calculation on each first path network space feature through the first feedforward neural network to obtain a first path network space feature after convolution calculation;
the second convolution calculation unit is used for carrying out convolution calculation on each second road network spatial feature through the third feedforward neural network to obtain a second road network spatial feature after convolution calculation;
the adding unit is used for adding corresponding time information to the first road network space characteristics after convolution calculation at each moment;
the first attention processing unit is used for inputting the first road network space characteristics of each training sample after time information is added into the first multi-head attention mechanism network to obtain the first road network space-time characteristics of each training sample;
the third convolution calculation unit is used for carrying out convolution calculation on the first path network space-time characteristics of each training sample through the second feedforward neural network to obtain first path network space-time characteristics of each training sample after convolution calculation;
The second attention processing unit is used for inputting the first path network space-time characteristics after convolution calculation and a plurality of second path network space-time characteristics in the corresponding truth value samples after time information addition into the second multi-head attention mechanism network to obtain third path network space-time characteristics of a plurality of future moments corresponding to each training sample and second path network space-time characteristics of each truth value sample;
a fourth convolution calculation unit, configured to input a third path network space-time feature of a future multiple moments corresponding to each training sample and a second path network space-time feature of each truth sample into the fourth feedforward neural network, to obtain a third path network space-time feature after convolution calculation and a second path network space-time feature after convolution calculation;
the adjustment training unit is used for calculating a current loss value according to the third path network space-time characteristic after convolution calculation and the second path network space-time characteristic after convolution calculation, adjusting the model parameters of the load prediction model when the current loss value does not meet the convergence condition, and continuing to train the load prediction model after parameter adjustment until the current loss value meets the convergence condition, so as to obtain a final load prediction model.
8. A load predicting apparatus of a server, the apparatus comprising:
the cloud computing platform comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring target data information of each server in the cloud computing platform at a plurality of latest historical moments, and the target data information at least comprises load information;
a generation unit configured to generate, for each history time, a graph structure of each history time based on the target data information of each server in the cloud computing platform;
the prediction unit is configured to input the graph structures of the most recent historical moments into a load prediction model, and predict load information of each server of the plurality of future moments, where the load prediction model is trained according to the method of any one of claims 1-5.
9. A computer-readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-5 or the method according to claim 6.
10. An electronic device, the electronic device comprising:
one or more processors;
the processor is coupled with a storage device for storing one or more programs;
The one or more programs, when executed by the one or more processors, cause the electronic device to perform the method of any of claims 1-5 or the method of claim 6.
CN202311638529.1A 2023-12-01 2023-12-01 Load prediction model training method, server load prediction method and device Pending CN117786566A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311638529.1A CN117786566A (en) 2023-12-01 2023-12-01 Load prediction model training method, server load prediction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311638529.1A CN117786566A (en) 2023-12-01 2023-12-01 Load prediction model training method, server load prediction method and device

Publications (1)

Publication Number Publication Date
CN117786566A true CN117786566A (en) 2024-03-29

Family

ID=90386281

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311638529.1A Pending CN117786566A (en) 2023-12-01 2023-12-01 Load prediction model training method, server load prediction method and device

Country Status (1)

Country Link
CN (1) CN117786566A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119356929A (en) * 2024-12-26 2025-01-24 广东元韬企业管理咨询有限公司 A server status monitoring management system and method based on data analysis


Similar Documents

Publication Publication Date Title
CN109891897B (en) Method for analyzing media content
CN112784965A (en) Large-scale multi-element time series data abnormity detection method oriented to cloud environment
CN112131212A (en) Anomaly prediction method of time series data for hybrid cloud scenarios based on ensemble learning technology
CN114297036A (en) Data processing method and device, electronic equipment and readable storage medium
CN109766992A (en) Anomaly detection and attack classification method for industrial control based on deep learning
Zhang et al. Energy theft detection in an edge data center using threshold-based abnormality detector
CN117041017B (en) Intelligent operation and maintenance management method and system for data center
CN114155270A (en) Pedestrian trajectory prediction method, device, device and storage medium
CN109787958A (en) Network flow real-time detection method and detection terminal, computer readable storage medium
Dai et al. Hybrid deep model for human behavior understanding on industrial internet of video things
US20240061740A1 (en) Disentangled graph learning for incremental causal discovery and root cause analysis
CN116756594A (en) A power grid data abnormal point detection method, system, equipment and medium
CN114841072A (en) Differential fusion Transformer-based time sequence prediction method
CN115565038A (en) Content audit, content audit model training method and related device
CN110956278A (en) Method and system for retraining machine learning models
CN106023254A (en) Multi-target video tracking method based on box particle PHD (Probability Hypothesis Density) filtering
CN117786566A (en) Load prediction model training method, server load prediction method and device
Collier et al. Transfer and marginalize: Explaining away label noise with privileged information
CN108984851B (en) Weighted Gaussian model soft measurement modeling method with time delay estimation
Lisas et al. Sequential learning for modeling video quality of delivery metrics
CN117875322A (en) A method, system, device and medium for extracting entity from text data
CN117540336A (en) Time sequence prediction method and device and electronic equipment
CN118114139A (en) Mobile application flow classification model training method, classification method and equipment
CN112633550A (en) RNN-based catenary fault trend prediction method, equipment and storage medium
CN113297540A (en) APP resource demand prediction method, device and system under edge Internet of things agent service

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination