CN116720158A - Time sequence regression prediction method and system with uncertainty estimation

Info

Publication number: CN116720158A
Application number: CN202310463764.3A
Authority: CN
Inventors: 于治; 程彬倩; 万晨光; 刘晓娟; 项农
Original assignee: Hefei Institutes of Physical Science of CAS
Current assignee: Hefei Institutes of Physical Science of CAS
Priority date: 2023-04-26
Filing date: 2023-04-26
Publication date: 2023-09-08

Abstract

The present invention provides a time series regression prediction method and system with uncertainty estimation. The method uses an attention-based deep learning model to perform time series regression prediction and to model the uncertainty of that prediction. The system includes a data acquisition and preprocessing module, a neural network module, a training module, and a testing module. According to embodiments of the invention, the method constructs an attention-based time series prediction and uncertainty model that extracts features from the input time series and simultaneously decodes the output time series and its prediction uncertainty, which effectively improves the practical reference value of the prediction results. Compared with existing uncertainty estimation methods, it has higher execution efficiency and lower time and space cost. The invention is also applied to discharge modeling of tokamak 0-dimensional diagnostic physical quantities, making the modeling results more reliable and of greater practical value.

Description

A time series regression prediction method and system with uncertainty estimation

Technical field

The invention belongs to the field of deep learning and uncertainty measurement, and specifically relates to a time series regression prediction method and system with uncertainty estimation.

Background art

Deep neural networks are powerful machine learning tools that can be applied to time series forecasting. In practice, however, they face many challenges, such as insufficient sample size, skewed data set distributions, and unknown inputs, all of which introduce uncertainty into the model. This uncertainty adversely affects the prediction results and reduces their accuracy and reliability. For example, in finance, uncertainty may prevent a model from accurately predicting stock prices or exchange rate fluctuations, causing losses to investors; in transportation, it may make traffic forecasts inaccurate, inconveniencing transportation planners and passengers; in controlled nuclear fusion, it may make discharge predictions inaccurate, affecting the safe and stable operation of the device and possibly even damaging it.

Uncertainty measurement for deep neural networks is therefore very important, and how to effectively measure and manage the uncertainty in deep learning models is a current research focus in the field.

A search of existing patents found almost no uncertainty measurement methods for time series. Most of the existing literature estimates uncertainty with methods such as Monte Carlo dropout and model ensembles.

The shortcomings of existing uncertainty measurement methods mainly include the following:

1. High computational cost: these methods require running multiple models or drawing multiple samples, so their computational cost is relatively high, especially when high-precision predictions or large amounts of data are involved;

2. Poor interpretability: the uncertainty measures these methods produce are usually difficult to interpret, and it is hard to explain clearly where the prediction uncertainty comes from, which may make them unsuitable for critical applications such as controlled nuclear fusion, medical care, and finance.

In short, uncertainty measurement is a very important research area in deep learning. Its results help us better understand the performance and behavior of deep learning models and thereby improve their effectiveness and reliability in practical applications, but efficient, low-cost modeling of uncertainty estimates for time series remains a challenge.

Summary of the invention

To address the uncertainty of deep neural networks and to overcome the high computational overhead of existing uncertainty measurement methods and their possible impact on prediction accuracy, the present invention provides a time series regression prediction method with uncertainty estimation, which uses an attention mechanism to model a time series and its prediction uncertainty. The model is also trained on tokamak discharge experimental data, achieving fast, high-fidelity modeling of diagnostic signal time series and their prediction uncertainty, and enabling verification of the 0-dimensional diagnostic physical quantities of tokamak experimental proposals.

To achieve the above objects, the technical solution adopted by the present invention is as follows:

A time series regression prediction method with uncertainty estimation is constructed from an attention-based deep learning model, and specifically includes the following steps:

S1. Data acquisition and preprocessing: obtain time series data, including input time series and output time series, and apply resampling and standardization preprocessing to the data to build the data sets for model training and testing;

S2. Construct an attention-based time series prediction and uncertainty estimation model: the input time series is fed into the model, which extracts features from it and maps them to a latent space, and then obtains the output time series and its uncertainty through the time series output module and the uncertainty estimation module, respectively;

S3. Train the model: first, use the target output time series and the predicted output of the time series output module to compute the loss according to loss function L1, and optimize the model parameters with the error backpropagation algorithm; then, starting from the model parameters thus obtained, compute according to loss function L2 the loss between the actual deviation and the predicted output of the uncertainty estimation module, and optimize the parameters of the uncertainty estimation network with the error backpropagation algorithm, finally obtaining the optimal time series regression prediction model with uncertainty estimation;

S4. Test and verify the model: feed the test set data into the trained time series prediction and uncertainty model, and output the predicted time series and its prediction uncertainty.

Further, the attention-based time series prediction and uncertainty model constructed in step S2 includes a position encoder, an input encoder, a time series output module, a direct uncertainty estimation module, and a linear output layer:

Position encoder: adds timing information to the data to help the model learn the relative and absolute position information of the data;

Input encoder: extracts features from the input time series, compresses them into a semantic vector of a specified length, and maps it into the latent space;

Time series output module: decodes the feature vector that the input encoder mapped into the latent space to obtain the global features of the output time series, and obtains the final representation of the output time series through a linear fully connected layer;

Direct uncertainty estimation module: decodes the feature vector that the input encoder mapped into the latent space to obtain the global features of the time series uncertainty, and obtains the final representation of the uncertainty time series through a linear fully connected layer;

Linear output layer: applies a simple linear mapping from the outputs of the time series output module and the direct uncertainty estimation module to the target output dimension.

Further, the loss functions L1 and L2 use a mask mechanism to compute, according to the effective length of the time series, an effective mean square error loss and an effective weighted mean square error loss.

Further, the position encoder uses a periodic function to encode the position information of the original tensor into a timing tensor, and combines the timing tensor with the original input tensor, so that the model can learn time series information.

Further, the direct uncertainty estimation module includes an uncertainty estimation decoder and a linear output layer. The uncertainty estimation decoder decodes the mapping of the input features in the latent space to obtain the global features of the time series uncertainty; the linear output layer adjusts the weights of these global uncertainty features to map them into the uncertainty output vector space of the sample.

Further, the uncertainty estimation decoder includes a multi-head attention mechanism and a fully connected neural network. The multi-head attention mechanism uses linear layers to build multiple attention heads, each attending to a different part of the input information, and then concatenates their results, which strengthens the expressive power of the model. The fully connected neural network is a series connection of multiple linear layers and the Relu activation function; by composing simple linear and nonlinear processing units, it obtains relatively complex nonlinear processing capability.

In another aspect, the present invention provides a time series regression prediction system with uncertainty estimation, which builds a time series prediction and uncertainty model based on the attention mechanism and specifically includes the following:

Data acquisition and preprocessing module: obtains time series data, including input time series and output time series, and applies resampling and standardization preprocessing to the data to build the data sets for model training and testing;

Neural network module: the neural network includes a position encoder, an input encoder, a time series output module, a direct uncertainty estimation module, and a linear output layer, and is used to obtain the output time series and its uncertainty from the input time series;

Training module: trains the neural network on the training data set of the task to obtain a trained neural network;

Test module: feeds the input time series data of the test set into the trained neural network model to obtain the predicted time series and its uncertainty.

In an embodiment of the present invention, an electronic device is provided that includes a readable storage medium, a central processing unit, and a graphics processor. The readable storage medium stores a computer program that can run on the central processing unit and the graphics processor, and the steps of the time series prediction method are implemented when the central processing unit and the graphics processor execute the computer program.

In an embodiment of the present invention, a readable storage medium is provided on which a computer program is stored; when run by a processor, the computer program performs the steps of the time series regression prediction method with uncertainty estimation.

The beneficial effects of the present invention are:

1. The time series regression prediction model with uncertainty estimation, built from an attention-based deep learning model, extracts features from the input time series and simultaneously decodes the output time series and its prediction uncertainty. This effectively improves the practical reference value of time series prediction results, with higher execution efficiency and lower time and space cost than existing uncertainty estimation methods.

2. The model is applied, for the first time according to the applicant, to discharge modeling of tokamak 0-dimensional diagnostic physical quantities, making the modeling results more reliable and of greater practical value.

Brief description of the drawings

To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. The drawings described below serve only to illustrate preferred embodiments and are not to be construed as limiting the invention.

Figure 1 is a schematic flow chart of a time series regression prediction method with uncertainty estimation according to an embodiment of the present invention;

Figure 2 is a basic flow chart of obtaining time series data according to an embodiment of the present invention;

Figure 3 is an architecture diagram of the attention-based time series prediction and uncertainty estimation neural network model according to an embodiment of the present invention;

Figure 4 is a basic flow chart of training the neural network to obtain a trained neural network according to an embodiment of the present invention;

Figure 5 is a schematic diagram of the modeling results of the attention-based time series prediction and uncertainty estimation model according to an embodiment of the present invention;

Figure 6 is a schematic structural diagram of a time series prediction device according to an embodiment of the present invention;

Figure 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed description

To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. The specific embodiments described here serve only to explain the invention and do not limit it. In addition, the technical features involved in the embodiments described below can be combined with one another as long as they do not conflict.

The present invention uses an attention-based time series prediction and uncertainty estimation model to model the time series regression problem, so that the model outputs the time series prediction and its uncertainty at the same time. The main contribution of the invention is therefore the construction of this attention-based model; in addition, the embodiment performs discharge modeling and uncertainty estimation modeling for tokamak 0-dimensional diagnostic physical quantities.

As shown in Figure 1, the present invention provides a time series regression prediction method with uncertainty estimation, which includes the following steps:

Specifically, as shown in Figure 2, the data acquisition and preprocessing process of step S1 includes the following steps:

S11. Obtain the original time series data. Taking the tokamak dynamical system as an example, the original time series data is read from the MDSplus database and stored in separate HDF5 files, ordered by experiment, for later use;

S12. Resample the original time series data at a fixed sampling rate to obtain a resampled time series, reducing computational cost while retaining sufficient information;

S13. Standardize the resampled time series data with the Z-Score method; after processing, the data has a mean of 0 and a standard deviation of 1, matching the scale of a standard normal distribution;

S14. Divide the preprocessed time series data in a 4:4:2 ratio into two training sets and one test set.
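Steps S12 to S14 can be illustrated with a minimal sketch. It assumes linear interpolation for the fixed-rate resampling and a random shuffle before the 4:4:2 split; the source specifies neither, and the function names `resample`, `z_score`, and `split_442` are hypothetical.

```python
import math
import random
import statistics


def resample(times, values, rate_hz):
    """Linearly interpolate (times, values) onto a fixed-rate grid (assumed method)."""
    step = 1.0 / rate_hz
    n = int(math.floor((times[-1] - times[0]) / step)) + 1
    out, j = [], 0
    for k in range(n):
        t = times[0] + k * step
        # Advance to the segment [times[j], times[j + 1]] that contains t.
        while j + 2 < len(times) and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        w = (t - t0) / (t1 - t0)
        out.append(values[j] * (1 - w) + values[j + 1] * w)
    return out


def z_score(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def split_442(shots, seed=0):
    """Shuffle the shots and split them 4:4:2 into two training sets and a test set."""
    shots = list(shots)
    random.Random(seed).shuffle(shots)
    a = int(0.4 * len(shots))
    b = int(0.8 * len(shots))
    return shots[:a], shots[a:b], shots[b:]
```

In practice the mean and standard deviation used by `z_score` would normally be computed on the training data only and reused for the test set; that choice is not stated in the source.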

Specifically, as shown in Figure 3, the time series regression prediction model with uncertainty estimation includes a position encoder, an input encoder, a time series output module, a direct uncertainty estimation module, and a linear output layer.

The position encoder uses a periodic function to encode the position information of the original tensor into a timing tensor, and combines the timing tensor with the original input tensor, so that the model can learn both the relative and the absolute position information of the time series.
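The source only says that a "periodic function" is used. One common choice, assumed here purely for illustration, is the sinusoidal encoding used in Transformer models, combined with the input by element-wise addition. The function names below are hypothetical.

```python
import math


def positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position codes (assumed form)."""
    table = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            # Even feature index i gets sin, the following odd index gets cos.
            angle = pos / (10000 ** (i / d_model))
            table[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                table[pos][i + 1] = math.cos(angle)
    return table


def add_position(x):
    """Add position codes element-wise to x, a list of seq_len feature vectors."""
    pe = positional_encoding(len(x), len(x[0]))
    return [[v + p for v, p in zip(row, pe_row)] for row, pe_row in zip(x, pe)]
```

Because each position receives a distinct code and codes at a fixed offset are related by a linear transform, this kind of encoding exposes both absolute and relative position to the attention layers.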

The input encoder extracts features from the input time series, compresses them into a semantic vector of a specified length, and maps it into the latent space.

The time series output module decodes the feature vector that the input encoder mapped into the latent space to obtain the global features of the output diagnostic signals, and obtains the final representation of the diagnostic signal time series through a linear fully connected layer.

The direct uncertainty estimation module includes an uncertainty estimation decoder and a regression output layer.

The attention-based uncertainty estimation decoder consists of a multi-head attention mechanism and a fully connected neural network, and decodes the mapping of the input features in the latent space to obtain the global features of the time series uncertainty. The multi-head attention mechanism splits the input vector into multiple heads, computes attention for each head, and finally concatenates the results of all heads into the final output vector. This extracts task-relevant information, increases the attention the model pays to different aspects of the information, and improves the generalization ability and performance of the model. Each head is computed with the following function,

where Q, K, and V are the inputs of the multi-head attention mechanism, and are identical in multi-head self-attention; Softmax is an activation function that normalizes a numerical vector into a probability distribution vector whose entries sum to 1; T denotes the transpose operation; d_model is the dimension of the position vector, equal to the hidden state dimension of the whole model; h is the number of heads in the multi-head attention mechanism, with i∈[1,h]; and W_i^Q, W_i^K, and W_i^V are the weight matrices of Q, K, and V, respectively.
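A per-head computation consistent with the symbols defined here can be written as follows. It assumes the standard scaled dot-product form; in particular, the scaling factor d_model/h (the per-head dimension) and the Concat notation are assumptions, since the text names only d_model and h and describes the concatenation in words.

```latex
\mathrm{head}_i = \mathrm{Softmax}\!\left(
    \frac{\left(Q W_i^{Q}\right)\left(K W_i^{K}\right)^{T}}{\sqrt{d_{model}/h}}
\right) V W_i^{V}, \qquad i \in [1, h]

\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}\left(\mathrm{head}_1, \ldots, \mathrm{head}_h\right)
```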

The fully connected neural network applies the following function to perform relatively complex nonlinear processing on the output of the attention mechanism,

where x is the input of the fully connected neural network; Relu is the rectified linear unit activation function; W₁ and W₂ are the weight parameters of the two linear layers of the fully connected neural network; T denotes the transpose operation; and b₁ and b₂ are the biases of the two linear layers.
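A two-layer form consistent with these symbols, assuming the usual position-wise feed-forward arrangement of one Relu between two linear layers, is:

```latex
\mathrm{FFN}(x) = \mathrm{Relu}\!\left(x W_1^{T} + b_1\right) W_2^{T} + b_2
```

Here the placement of the transposes is an assumption about how the weight matrices are stored (one row per output unit); with the opposite storage convention the transposes would be dropped.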

The linear output layer is a simple one-dimensional linear layer. It adjusts the weights of the global uncertainty features to map them into the uncertainty output vector space of the sample; this simple linear mapping aligns the model output tensor with the 0-dimensional diagnostic data tensor.

Specifically, as shown in Figure 4, the model training process of step S3 includes the following steps:

S31. Randomly initialize the weights and biases of the neural network;

S32. Initialize the neural network hyperparameters and the stochastic gradient descent optimizer; the hyperparameters include the batch size, learning rate, and number of iterations;

S33. Use the target output time series and the predicted output of the time series output module to compute the loss according to loss function L1, and optimize the model parameters with the error backpropagation algorithm;

S34. Starting from the model parameters thus obtained, compute according to loss function L2 the loss between the actual deviation and the predicted output of the uncertainty estimation module, and optimize the parameters of the uncertainty estimation network with the error backpropagation algorithm;

The loss functions L1 and L2 use a mask mechanism to compute, according to the effective length of the time series, the effective mean square error loss VMSE and the effective weighted mean square error loss VWMSE, respectively;

The effective mean square error loss VMSE is calculated as follows:

where n_V is the effective length of the time series; W_V is a matrix containing only 0s and 1s that extracts the valid part of the mean square error of the time series; y is the real experimental data; and ŷ is the model's predicted output.
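The masked loss described by these symbols can be sketched as below. The sketch assumes that VMSE is the sum of the masked squared errors divided by n_V, that is `VMSE = sum(W_V * (y - y_hat)^2) / n_V`, and it treats a single one-dimensional sequence for simplicity; the function name `vmse` is hypothetical.

```python
def vmse(y_true, y_pred, mask):
    """Masked mean square error over the valid time slices.

    mask[t] plays the role of W_V: 1 for a valid time slice, 0 for padding.
    sum(mask) plays the role of n_V, the effective sequence length.
    """
    n_valid = sum(mask)
    if n_valid == 0:
        raise ValueError("mask selects no valid time slices")
    total = sum(m * (y - p) ** 2 for y, p, m in zip(y_true, y_pred, mask))
    return total / n_valid
```

Masking in this way lets sequences of different lengths be padded into one batch without the padded positions contributing to the loss.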

The effective weighted mean square error loss VWMSE is calculated as follows:

where n_{V_in} is the number of time slices that are valid and inside the prediction uncertainty coverage; n_{V_out} is the number of time slices that are valid and outside the prediction uncertainty coverage; y_{V_in} is the real experimental data inside the prediction uncertainty coverage; ŷ_{V_in} is the model's predicted output inside the prediction uncertainty coverage; y_{V_out} is the real experimental data outside the prediction uncertainty coverage; ŷ_{V_out} is the model's predicted output outside the prediction uncertainty coverage; λ is a parameter that balances the importance of coverage against uncertainty width; μ is the predefined target uncertainty coverage; and PUCP is the prediction uncertainty coverage, calculated as follows:

where N is the length of the time series and a_i is a binary value, calculated as follows:

where y_i is the target value, ŷ_i is the point prediction, and uncertainty is the uncertainty.
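The coverage metric can be sketched as follows, assuming that a_i is 1 when the target lies within a symmetric band of half-width uncertainty_i around the point prediction and 0 otherwise, and that PUCP is the mean of a_i over the N time slices. The symmetric band and the function name `pucp` are assumptions.

```python
def pucp(y_true, y_pred, uncertainty):
    """Fraction of time slices whose target falls inside y_pred +/- uncertainty."""
    n = len(y_true)
    covered = [
        1 if abs(y - p) <= u else 0  # a_i: 1 when the band covers the target
        for y, p, u in zip(y_true, y_pred, uncertainty)
    ]
    return sum(covered) / n
```

A PUCP close to the target coverage μ indicates that the predicted uncertainty band is neither too narrow nor wider than needed.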

S35. Finally, the optimal time series regression prediction model with uncertainty estimation is obtained.
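The two-stage procedure of steps S33 and S34 can be illustrated on a toy problem. This is only a sketch under strong simplifying assumptions: the predictor is a single weight w, the uncertainty estimator is a single constant u, plain mean square errors stand in for L1 and L2 (the coverage weighting of VWMSE is omitted), and the predictor is held fixed during the second stage. The function name `train_two_stage` is hypothetical.

```python
def train_two_stage(xs, ys, lr=0.01, epochs=2000):
    """Toy two-stage fit: prediction y_hat = w * x, then a constant uncertainty u."""
    n = len(xs)

    # Stage 1: optimize the predictor on a plain MSE (stand-in for L1).
    w = 0.0
    for _ in range(epochs):
        grad_w = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad_w

    # Stage 2: keep w fixed and fit u to the actual absolute deviations
    # (stand-in for L2, which compares deviation with predicted uncertainty).
    residuals = [abs(y - w * x) for x, y in zip(xs, ys)]
    u = 0.0
    for _ in range(epochs):
        grad_u = sum(2 * (u - r) for r in residuals) / n
        u -= lr * grad_u

    return w, u
```

Because the uncertainty estimator is trained on deviations of an already fitted predictor, a single forward pass yields both the prediction and its uncertainty, without the repeated sampling that Monte Carlo dropout or ensembles require.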

The time series prediction method provided in this embodiment can be applied to a variety of tasks. The tokamak discharge modeling task is discussed below as an example.

The discharge modeling and uncertainty estimation model for tokamak 0-dimensional diagnostic physical quantities is implemented as follows. The input data of the model is a time series of 92 control signal channels, including plasma current feedforward, poloidal field coil currents, toroidal magnetic field, power of the lower hybrid wave current drive and heating system, neutral beam injection system, ion cyclotron resonance heating system, electron cyclotron resonance heating/current drive system, gas puffing system, supersonic molecular beam injection, pellet injection system, and plasma shape feedforward. The modeling target of the model is 11 diagnostic signal channels: the actual plasma current I_p, the average plasma electron density n_e, the plasma stored energy W_mhd, the tokamak loop voltage V_loop, the normalized beta β_n, the toroidal beta β_t, the poloidal beta β_p, the elongation k, the internal inductance li, the safety factor q₀, and the safety factor q₉₅ at the 95% flux surface. As shown in Figure 2, the numbers of stacked layers D0, D1, and D2 are all 6, and the hidden layer size d_model is 512.

With the target uncertainty coverage μ set to 0.9 and the parameter λ set to 4, the discharge modeling and uncertainty estimation results for the tokamak 0-dimensional diagnostic physical quantities are shown in Figure 5. Shot #73873 was selected as the test data; its discharge lasts more than 70 s and its sequence length exceeds 7×10⁴. The specific process is as follows. First, the inputs of the tokamak control system are set: in this embodiment, the control system source file corresponding to shot #73873 is called directly, and the actual actuator input signals of shot #73873 are then used as the system input. Next, the data is converted to the Tensor type and loaded onto the GPU, and the trained deep learning model is also loaded onto the GPU; the trained model processes the input data to produce the modeling results for the 11 0-dimensional diagnostic physical quantities and their prediction uncertainties. Finally, the data is visualized.

On the EAST tokamak ("东方超环") in Hefei, the model was evaluated on a test set of 932 shots in total. With the target uncertainty coverage μ set to 0.9 and λ set to 4, the average prediction uncertainty coverage PUCP for 0-dimensional diagnostic physical quantity discharge modeling reaches 90.891%, in line with the target. This performance on the test set indicates that the modeling results of the invention are accurate, reliable, and of practical value.

In another aspect, as shown in Figure 6, the present invention provides a time series regression prediction system with uncertainty estimation, the system including the following modules:

As shown in Figure 7, the present invention also provides an electronic device comprising a readable storage medium, a core processor, and a graphics processor, together with a computer program stored on the readable storage medium and executable on the core processor and the graphics processor. When the core processor and the graphics processor execute the computer program, the steps of the time series regression prediction method with uncertainty estimation described in this embodiment are implemented.

Specifically, the readable storage medium is a general-purpose storage medium, such as a removable disk, a hard disk, or an optical disc. A computer program is stored on the storage medium; when executed by a processor, it implements each process of the above embodiment of the time series prediction method with uncertainty estimation and achieves the same technical effect. To avoid repetition, the details are not described again here.

Although preferred embodiments of the present application have been described, those skilled in the art may make additional changes and modifications to these embodiments once the basic inventive concepts are understood. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present application.

Claims

1. A time series regression prediction method with uncertainty estimation, which is characterized by constructing a time series prediction and uncertainty model based on the attention mechanism, specifically including the following steps:

S1. Data acquisition and preprocessing: Obtain time series data, including input time series and output time series, and perform resampling, standardization, and discharge category label preprocessing operations on the data to establish a data set for model training and testing;

S2. Construct an attention-based time series prediction and uncertainty estimation model: pass the input time series into the attention-based time series prediction and uncertainty estimation model; the model extracts features of the input time series and maps them to the latent space, and obtains the output time series and its uncertainty through the time series output module and the uncertainty estimation module respectively;

S3. Train the model: first, calculate the loss between the target output time series and the predicted output of the time series output module according to the loss function L1, and use the error back propagation algorithm to optimize the model parameters; then, based on the existing model parameters, calculate the loss between the actual deviation and the predicted output of the uncertainty estimation module according to the loss function L2, and use the error back propagation algorithm to optimize the parameters of the uncertainty estimation network in the model, finally obtaining the optimal time series regression prediction model with uncertainty estimation;

S4. Test and verify the validity of the model: Input the test set data into the trained time series prediction and uncertainty model, and output the model prediction time series and its prediction uncertainty.

2. The time series regression prediction method with uncertainty estimation according to claim 1, characterized in that the data acquisition and preprocessing process in S1 includes the following steps:

S11. Obtain the original time series data. The original time series data is read from the MDSplus database and stored in different HDF5 files according to the experimental order for subsequent use;

S12. Use a fixed sampling rate to resample the original time series data to obtain a resampled time series, thereby reducing computing costs while retaining sufficient data information;

S13. Use the Z-Score standardization method to standardize the resampled time series data, so that the processed data conform to the standard normal distribution, that is, the mean is 0 and the standard deviation is 1;

S14. Based on the above preprocessed time series data, divide it into two training sets and one test set according to the ratio of 4:4:2.
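Steps S12 to S14 can be illustrated with a minimal sketch in plain Python. It is not the patented implementation: the function names, the `step` parameter, per-series standardization, and the random shuffle before the 4:4:2 split are all assumptions made for illustration, and reading from MDSplus or HDF5 (S11) is omitted.

```python
import random
from statistics import fmean, pstdev


def resample(series, step):
    """Keep every `step`-th sample (fixed-rate downsampling; `step` is an assumed parameter)."""
    return series[::step]


def z_score(series):
    """Standardize a series to zero mean and unit standard deviation."""
    mean = fmean(series)
    std = pstdev(series)
    if std == 0:
        # A constant series carries no variation; map it to zeros.
        return [0.0 for _ in series]
    return [(x - mean) / std for x in series]


def split_442(shots, seed=0):
    """Shuffle shots (an assumption) and split them 4:4:2 into two training sets and a test set."""
    shots = list(shots)
    random.Random(seed).shuffle(shots)
    n = len(shots)
    first_cut = n * 4 // 10
    second_cut = n * 8 // 10
    return shots[:first_cut], shots[first_cut:second_cut], shots[second_cut:]
```

In practice the standardization statistics would more likely be computed once over the training data and reused for the test data; the per-series version above only keeps the sketch short.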

3. The time series regression prediction method with uncertainty estimation according to claim 1, characterized in that the attention mechanism-based time series prediction and uncertainty model constructed in S2 includes:

The position encoder will use a periodic function to encode the position information of the original tensor to obtain a time series tensor, and combine the time series tensor with the original input tensor, so that the model has the ability to learn the relative position information and absolute position information of the time series;

The input encoder performs feature extraction on the input time series, compresses it into a semantic vector of specified length, and maps it into the latent space;

The time series output module decodes the feature vector mapped to the latent space by the input encoder to obtain the global features of the output time series, and obtains the final expression of the output time series through the linear fully connected layer;

The direct uncertainty estimation module decodes the feature vector mapped to the latent space by the input encoder to obtain the global characteristics of the time series uncertainty, and obtains the final expression of the uncertainty time series through the linear fully connected layer;

The linear output layer performs a simple linear mapping of the outputs of the timing output module and the direct uncertainty estimation module to the target output dimension.

4. The time series regression prediction method with uncertainty estimation according to claim 1, characterized in that the loss functions L1 and L2 in S3 use a mask mechanism to calculate an effective mean square error loss and an effective weighted mean square error loss according to the effective length of the time series.

5. The time series regression prediction method with uncertainty estimation according to claim 2, characterized in that the position encoder uses a periodic function to encode the position information of the original tensor to obtain the time series tensor, and The time series tensor is combined with the original input tensor to give the model the ability to learn time series information.

6. The time series regression prediction method with uncertainty estimation according to claim 2, characterized in that the direct uncertainty estimation module includes an uncertainty estimation decoder and a linear output layer, wherein the uncertainty estimation decoder is used to decode the mapping of the input features in the latent space to obtain the global features of the time series uncertainty; the linear output layer maps the global features of the uncertainty into the uncertainty output vector space of the sample by adjusting the weights of those global features.

7. The time series regression prediction method with uncertainty estimation according to claim 5, characterized in that the uncertainty estimation decoder is composed of a multi-head attention mechanism and a fully connected neural network; the multi-head attention mechanism uses linear layers to establish multiple attention heads, each attending to a different part of the input information, and then concatenates them, which enhances the expressive ability of the model; the fully connected neural network is a series connection of multiple linear layers and the activation function Relu, and obtains relatively complex nonlinear processing capability by composing simple linear mappings with nonlinear processing units.

8. The time series regression prediction method with uncertainty estimation according to claim 3, characterized in that the position encoder uses sine and cosine functions to add timing information to the original vector, helping the model learn the relative and absolute position information of the data.
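The sine and cosine position encoding of claim 8 can be sketched as follows. The claims do not give the exact formula, so this sketch assumes the widely used sinusoidal scheme (base 10000, sine on even columns, cosine on odd columns) and assumes that "combining" means element-wise addition; `positional_encoding` and `add_position` are hypothetical names.

```python
import math


def positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table: sine on even columns, cosine on odd columns."""
    table = []
    for pos in range(seq_len):
        row = []
        for col in range(d_model):
            # Columns 2k and 2k+1 share the wavelength 10000^(2k / d_model) (assumed base).
            angle = pos / (10000 ** ((col - col % 2) / d_model))
            row.append(math.sin(angle) if col % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def add_position(x):
    """Add the encoding element-wise to an input of shape seq_len x d_model (assumed combination)."""
    pe = positional_encoding(len(x), len(x[0]))
    return [[v + p for v, p in zip(x_row, pe_row)] for x_row, pe_row in zip(x, pe)]
```

Because each column is a fixed periodic function of the time index, the encoding of a position at a fixed offset is a linear function of the current encoding, which is one common argument for why such encodings expose both absolute and relative position.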

9. The time series regression prediction method with uncertainty estimation according to claim 2, characterized in that,

The uncertainty estimation decoder based on the attention mechanism is composed of a multi-head attention mechanism and a fully connected neural network; the multi-head attention mechanism splits the input vector into multiple heads, performs the attention calculation on each head, and finally concatenates the results of all heads to obtain the final output vector. In this way, task-related information is extracted, the model's attention to different aspects of the information is increased, and the generalization ability and effectiveness of the model are improved, where each head is calculated using the following function,

Among them, Q, K, and V are the inputs of the multi-head attention mechanism; in the multi-head self-attention mechanism, Q, K, and V are the same; Softmax is an activation function that normalizes a numerical vector into a probability distribution vector whose entries sum to 1; T represents the transpose operation; d_model is the dimension of the position vector, which is the same as the hidden state dimension of the entire model; h represents the number of heads in the multi-head attention mechanism, i∈[1, h]; W_i^Q, W_i^K, and W_i^V are the weight matrices of Q, K, and V respectively.
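The per-head attention calculation described in claim 9 can be sketched in plain Python. This is an illustrative reading rather than the formula of the claim: it assumes the standard scaled dot-product form Softmax((Q W_i^Q)(K W_i^K)^T / sqrt(d_k)) (V W_i^V) with d_k taken as the per-head width (commonly d_model / h), and it omits any output projection after concatenation. All function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Normalize a vector into probabilities that sum to 1."""
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(q, k, v, w_q, w_k, w_v):
    """Scaled dot-product attention for one head (assumed standard form)."""
    qp, kp, vp = matmul(q, w_q), matmul(k, w_k), matmul(v, w_v)
    d_k = len(qp[0])  # per-head width, assumed to equal d_model / h
    k_t = [list(col) for col in zip(*kp)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(qp, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, vp)


def multi_head(x, heads):
    """Self-attention: run each head with Q = K = V = x and concatenate along features."""
    outputs = [attention_head(x, x, x, *w) for w in heads]
    return [sum((out[i] for out in outputs), []) for i in range(len(x))]
```

Here `heads` is a list of `(w_q, w_k, w_v)` triples, one per head, each matrix of shape d_model x d_k.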

10. The time series regression prediction method with uncertainty estimation according to claim 9, characterized in that the fully connected neural network uses the following function to perform relatively complex nonlinear processing on the output of the attention mechanism,

Among them, x is the input of the fully connected neural network; Relu is the linear rectification activation function; W₁ and W₂ are the weight parameters of the two linear layers in the fully connected neural network; T represents the transpose operation; b₁ and b₂ are the biases of the two linear layers in the fully connected neural network.
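A minimal sketch of the two-layer fully connected network of claim 10 follows. It assumes the common form Relu(x W₁^T + b₁) W₂^T + b₂, with each weight matrix stored as out_features x in_features; this ordering is an assumption consistent with the symbols defined above, and the function names are hypothetical.

```python
def relu(vec):
    """Linear rectification: replace negative entries with zero."""
    return [max(0.0, v) for v in vec]


def linear(vec, weight, bias):
    """Compute vec @ weight^T + bias, with weight stored as out_features x in_features."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def feed_forward(x, w1, b1, w2, b2):
    """Apply Relu(x W1^T + b1) W2^T + b2 to the feature vector of one time step."""
    return linear(relu(linear(x, w1, b1)), w2, b2)
```

For example, with an identity first layer the negative input component is zeroed by Relu before the second layer is applied: `feed_forward([1.0, -2.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [[2.0, 3.0]], [0.5])` evaluates to `[2.5]`.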

11. The time series regression prediction method with uncertainty estimation according to claim 10, characterized in that the linear output layer is a simple one-dimensional linear layer, and a simple linear mapping is performed to align the model output tensor and 0-dimensional diagnostic data tensor.

12. The time series regression prediction method with uncertainty estimation according to claim 1, characterized in that the model training process in step S3 includes the following steps:

S31. Randomly initialize the weights and bias parameters of the neural network;

S32. Initialize the neural network hyperparameters and the stochastic gradient descent optimizer. The neural network hyperparameters include batch size, learning rate, and number of iterations;

S33. Use the target output time series and the predicted output of the time series output module to calculate the loss according to the loss function L1, and use the error back propagation algorithm to optimize the model parameters;

S34. Based on the existing model parameters, calculate the loss between the actual deviation and the predicted output of the uncertainty estimation module according to the loss function L2, and use the error back propagation algorithm to optimize the parameters of the uncertainty estimation network in the model;

S35. Finally, the optimal time series prediction regression model with uncertainty estimation is obtained.

13. The time series regression prediction method with uncertainty estimation according to claim 4, characterized in that the loss functions L1 and L2 use a mask mechanism to calculate the effective mean square error loss VMSE and the effective weighted mean square error loss VWMSE according to the effective length of the time series, specifically including:

The calculation formula of the effective mean square error loss function VMSE is as follows:

Among them, n_V is the effective length of the time series; W_V is a matrix containing only 0 and 1, used to extract the effective part of the time series for the mean square error; y is the real experimental data; ŷ is the model prediction output;

The calculation formula of the effective weighted mean square error loss VWMSE is as follows:

Among them, n_{V_in} is the number of time slices that are valid and within the prediction uncertainty coverage; n_{V_out} is the number of time slices that are valid and outside the prediction uncertainty coverage; y_{V_in} is the real experimental data within the prediction uncertainty coverage; ŷ_{V_in} is the model prediction output within the prediction uncertainty coverage; y_{V_out} is the real experimental data outside the prediction uncertainty coverage; ŷ_{V_out} is the model prediction output outside the prediction uncertainty coverage; λ is a parameter used to balance the importance of coverage and uncertainty width; μ is the predefined target uncertainty coverage; PUCP is the prediction uncertainty coverage.
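The masked mean square error VMSE defined earlier in this claim can be sketched as below, assuming it is the sum of squared errors over the positions selected by the 0/1 mask W_V divided by the effective length n_V. The VWMSE loss is not sketched, because its exact weighting by λ, μ, and PUCP cannot be recovered from the text. The function name `vmse` is hypothetical.

```python
def vmse(y_true, y_pred, mask):
    """Mean squared error over the time steps where mask == 1 (padding is ignored)."""
    n_valid = sum(mask)  # effective length n_V
    if n_valid == 0:
        raise ValueError("mask selects no valid time steps")
    total = sum(m * (t - p) ** 2 for t, p, m in zip(y_true, y_pred, mask))
    return total / n_valid
```

Masking matters when shots of different durations are padded to a common length in a batch: without it, the padded tail would dilute or distort the loss.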

14. The time series regression prediction method with uncertainty estimation according to claim 13, characterized in that the PUCP calculation formula is as follows:

Among them, N is the time series length, and a_i is a binary value, calculated as follows:

Among them, y_i is the target value, ŷ_i is the point prediction value, and uncertainty is the predicted uncertainty.
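The coverage metric of claim 14 can be sketched as follows. The sketch assumes that a_i equals 1 when the target lies within a symmetric band of half-width uncertainty_i around the point prediction and 0 otherwise, and that PUCP is the mean of a_i over the N time steps; the symmetric band and the function name `pucp` are assumptions.

```python
def pucp(y_true, y_pred, uncertainty):
    """Fraction of time steps whose target lies within prediction +/- uncertainty."""
    if not y_true:
        raise ValueError("empty series")
    # a_i = 1 if the target is covered by the uncertainty band, else 0
    hits = [1 if abs(t - p) <= u else 0 for t, p, u in zip(y_true, y_pred, uncertainty)]
    return sum(hits) / len(hits)
```

A value computed this way would be compared against the target coverage μ; in the embodiment, μ is 0.9 and the reported average PUCP over the test set is 90.891%.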

15. A time series regression prediction system with uncertainty estimation, characterized in that the system includes the following modules:

Data acquisition and preprocessing module: used to obtain time series data, including input time series and output time series, and perform resampling and standardized preprocessing operations on the data to establish a data set for model training and testing;

Neural network module: The neural network includes a position encoder, an input encoder, a timing output module, a direct uncertainty estimation module, and a linear output layer; used to obtain the output time series and its uncertainty based on the input time series;

Training module: used to train the neural network using the training data set of the task to obtain a trained neural network;

Test module: used to input the input time series data of the test set into the trained neural network model to obtain the predicted time series and its uncertainty.

16. An electronic device, characterized by comprising a readable storage medium, a central processing unit, and a graphics processor, wherein the readable storage medium is used to store one or more computer programs, and when the processor executes the computer program, the steps of the time series regression prediction method with uncertainty estimation as described in any one of claims 1 to 14 are implemented.

17. A readable storage medium, characterized in that a computer program is stored on the readable storage medium, and when the computer program is run by a processor, the steps of the time series regression prediction method with uncertainty estimation as described in any one of claims 1 to 14 are executed.