CN116610919A - A Spatial Time Series Forecasting Method Based on Recurrent Graph Operator Neural Network - Google Patents
A Spatial Time Series Forecasting Method Based on Recurrent Graph Operator Neural Network
- Publication number
- CN116610919A (Application No. CN202310630846.2A)
- Authority
- CN
- China
- Prior art keywords
- graph
- operator
- representing
- neural network
- sub
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0499—Feedforward networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computational Linguistics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a spatial time series prediction method based on a recurrent graph operator neural network, comprising: modeling a traffic road network as a graph structure model, with the sensors in the road network abstracted as nodes of the graph; normalizing the data collected by the sensors; constructing several graph operator networks as feature capturers for the data, each aggregating the information of the nodes in the graph structure model in a different way; constructing GGRU units by embedding each graph operator network into a GRU unit of the neural network; aggregating the feature information captured by the different graph operator networks through an integrator; and using multiple integrators to build an encoder-decoder architecture that predicts spatial multivariate time series. The method can interpret information from multiple perspectives, which effectively improves the understanding of spatial multivariate time series data and thereby improves prediction accuracy.
Description
Technical Field
The invention relates to the technical field of spatial time series prediction, and in particular to a spatial time series prediction method based on a recurrent graph operator neural network.
Background Art
Spatial time series forecasting has a wide range of modern applications, for example: (1) Meteorological forecasting: an important application of spatial time series forecasting. By modeling meteorological data, future trends in variables such as temperature, rainfall and wind speed can be predicted, providing decision support for weather forecasting, disaster warning and agricultural production. (2) Energy demand forecasting: for energy suppliers and consumers, understanding future trends in energy demand is very important. By modeling historical energy demand data, future demand can be forecast so that production planning and supply chain management can be adjusted. (3) Traffic forecasting: traffic forecasting helps urban planners and traffic managers manage traffic flow and improve traffic efficiency. By modeling historical traffic data, future road congestion, traffic accident rates and so on can be predicted so that appropriate measures can be taken. (4) Population mobility forecasting: population mobility is an important factor in urban planning and social policy making. By modeling historical population flow data, future mobility trends can be predicted to support rational planning of urban infrastructure and allocation of social resources. (5) Stock price forecasting: an important application in finance. By modeling historical stock price data, future price trends can be predicted, providing decision support for investors.
At present, multivariate time series forecasting mainly relies on deep learning. Such neural-network-based models can be used effectively for multivariate time series prediction, for example: (1) Recurrent neural networks (RNN): networks that process sequential data by feeding the output of the previous time step back as part of the input at the current time step, thereby capturing the temporal relationships in a sequence. In multivariate forecasting, several RNNs can process each series separately and their outputs can be concatenated to produce the final prediction. RNNs are widely used in multivariate time series forecasting and have achieved good results in traffic flow prediction, stock price prediction and other fields. (2) Long short-term memory networks (LSTM): a special type of RNN that handles long sequences and long-term dependencies better. In multivariate forecasting, multiple LSTMs can process each series and their outputs are concatenated to produce the final prediction. Compared with plain RNNs, LSTMs perform better on long sequences and long-term dependencies, so they may be more suitable in some applications. (3) Convolutional neural networks (CNN): networks designed mainly for image processing that can also be used for multivariate time series prediction in some cases. The series can be treated as channels, convolution kernels are applied to each channel, and the channel outputs are combined to obtain the final prediction. CNNs are used less often for multivariate forecasting but may be better suited to certain applications. (4) Attention mechanisms: mechanisms that assign different weights to different parts of the input. In multivariate forecasting, attention can automatically learn the importance of different parts of each sequence and apply the resulting weights to the prediction. Attention is a relatively recent addition to multivariate forecasting but has already achieved good results in some applications.
However, the above multivariate time series forecasting methods either ignore the spatial dependencies between the series entirely or consider them from only a single perspective. In fact, the underlying relationships between these series are complex. A new forecasting method is therefore urgently needed that can fully exploit these underlying relationships from multiple different dimensions; much like human reading, the implied meaning of a text is often deeper than its surface meaning.
Summary of the Invention
The purpose of the present invention is to provide a spatial time series prediction method based on a recurrent graph operator neural network that can interpret information from multiple perspectives, effectively improving the understanding of spatial multivariate time series data and thereby improving prediction accuracy.
To achieve the above purpose, the present application proposes a spatial time series prediction method based on a recurrent graph operator neural network, comprising:
modeling the traffic road network to obtain a graph structure model, abstracting the sensors in the traffic road network as nodes of the graph structure model;
normalizing the data collected by the sensors;
constructing several graph operator networks as feature capturers for the data, each aggregating the information of the nodes in the graph structure model in a different way;
constructing GGRU units by embedding each of the graph operator networks into a GRU unit of the neural network;
aggregating the feature information captured by the different graph operator networks through an integrator;
using multiple integrators to build a sequence-to-sequence encoder-decoder architecture that predicts spatial multivariate time series.
Further, let $X^{(t)} \in \mathbb{R}^{N \times P}$ denote the traffic flow data collected by all sensors at time $t$, where $N$ is the number of sensors and $P$ is the number of traffic indicators measured by each sensor. Let $[X^{(t-T'+1)}, \ldots, X^{(t)}]$ denote the historical observations over $T'$ timestamps and $[X^{(t+1)}, \ldots, X^{(t+T)}]$ the predicted values for the next $T$ timestamps. The neural network then learns the mapping on the graph structure model $\mathcal{G}$:

$$[X^{(t+1)}, \ldots, X^{(t+T)}] = f\big([X^{(t-T'+1)}, \ldots, X^{(t)}];\, \mathcal{G}\big),$$

where $f$ is the neural network to be fitted.
Further, the data collected by the sensors is normalized as

$$X^{*} = \frac{X - E(X)}{\sqrt{D(X)}},$$

where $X$ denotes the training-set samples, $X^{*}$ the normalized data, $E(X)$ the training-set mean, and $D(X)$ the training-set variance.
Further, several graph operator networks are constructed as feature capturers for the data, specifically:
constructing a graph diffusion convolution operator (DC Operator), a graph operator network based on a static graph;
constructing a graph gated attention operator (GA Operator), a graph operator network based on a dynamic graph.
Still further, the graph diffusion convolution operator (DC Operator) is constructed as follows.

The graph diffusion convolution operator adopts a random walk strategy:

$$H_{:,q} = \sum_{p=1}^{P} \sum_{s=0}^{S-1} \left( \Theta_{O[q,p,s]} \left(D_{O}^{-1} W\right)^{s} + \Theta_{I[q,p,s]} \left(D_{I}^{-1} W^{\top}\right)^{s} \right) X_{:,p},$$

where $X$ is the input of the model and $H$ its output; the diffusion process is truncated after a finite number of steps $S$; $D_{O}$ and $D_{I}$ are the out-degree and in-degree matrices of the graph, and $D_{O}^{-1}W$ and $D_{I}^{-1}W^{\top}$ are the forward and reverse state transition matrices; $\alpha, \beta \in [0,1]$ are the restart probabilities of the random walk; $\Theta$ denotes the model parameters, with $\Theta_{O[q,p,s]} = \alpha(1-\alpha)^{s}$ and $\Theta_{I[q,p,s]} = \beta(1-\beta)^{s}$.

The adjacency matrix $W$ is obtained with a distance-based Gaussian kernel:

$$W_{ij} = \exp\left( -\frac{\operatorname{dist}(v_i, v_j)^2}{\sigma^2} \right),$$

where $\operatorname{dist}(v_i, v_j)$ is the distance between sensors $v_i$ and $v_j$ and $\sigma$ is the standard deviation of the set of distances.
Still further, the graph gated attention operator (GA Operator) is constructed as follows.

A $K$-head attention mechanism is used in the graph gated attention operator, so every node $i$ has a $K$-dimensional gating vector $g_i$:

$$g_i = \mathrm{FC}^{\sigma}_{\theta_g}\!\left( x_i \oplus \max_{j \in \mathcal{N}_i}\!\big(\mathrm{FC}_{\theta_m}(z_j)\big) \oplus \frac{1}{|\mathcal{N}_i|}\sum_{j \in \mathcal{N}_i} z_j \right),$$

where $\mathcal{N}_i$ is the set of all neighbour nodes of node $i$; $x_i = X_{i,:}$ is the feature vector of node $i$; $z_j = X_{j,:}$ are the reference vectors of the neighbours of node $i$; $\max$ is taken element-wise; and $\mathrm{FC}^{\sigma}_{\theta_g}$ maps the result to $K$ dimensions and scales it to $[0,1]$.

The attention weight matrix between nodes is obtained dynamically:

$$w^{(k)}_{i,j} = \frac{\exp\!\big(\langle \mathrm{FC}_{\theta_{xa}}(x_i),\, \mathrm{FC}_{\theta_{za}}(z_j) \rangle\big)}{\sum_{l \in \mathcal{N}_i} \exp\!\big(\langle \mathrm{FC}_{\theta_{xa}}(x_i),\, \mathrm{FC}_{\theta_{za}}(z_l) \rangle\big)},$$

where $\mathrm{FC}_{\theta_{xa}}$ is a linear transformation with parameters $\theta_{xa}$ and $\mathrm{FC}_{\theta_{za}}$ is a linear transformation with parameters $\theta_{za}$; that is, $\mathrm{FC}$ denotes a linear layer and $\theta$ its parameters.

The output vector $y_i$ of node $i$ is then

$$y_i = \mathrm{FC}_{\theta_o}\!\left( x_i \oplus \big\Vert_{k=1}^{K}\, g^{(k)}_i \sum_{j \in \mathcal{N}_i} w^{(k)}_{i,j}\, \mathrm{FC}_{\theta^{(k)}_v}(z_j) \right),$$

where $K$ is the number of attention heads and $k$ indexes the current head; $\mathrm{FC}_{\theta_o}$ is a linear transformation with parameters $\theta_o$ and $\mathrm{FC}_{\theta^{(k)}_v}$ a linear transformation with parameters $\theta^{(k)}_v$.
Still further, in the GGRU unit let $\Gamma_{g\star\mathcal{G}}$ denote applying a graph operator $g\star$ on the graph $\mathcal{G}$:

$$r^{(t)} = \sigma\big(\Gamma_{g\star\mathcal{G}}\big([X^{(t)}, H^{(t-1)}];\, \Theta_r\big)\big)$$
$$u^{(t)} = \sigma\big(\Gamma_{g\star\mathcal{G}}\big([X^{(t)}, H^{(t-1)}];\, \Theta_u\big)\big)$$
$$C^{(t)} = \tanh\big(\Gamma_{g\star\mathcal{G}}\big([X^{(t)}, r^{(t)} \odot H^{(t-1)}];\, \Theta_C\big)\big)$$
$$H^{(t)} = u^{(t)} \odot H^{(t-1)} + \big(1 - u^{(t)}\big) \odot C^{(t)}$$

where $X^{(t)}$ and $H^{(t)}$ are the input and output at timestamp $t$; $r^{(t)}$ and $u^{(t)}$ are the reset gate and update gate at timestamp $t$; $\Theta_r$, $\Theta_u$, $\Theta_C$ are different filter parameters; $\Gamma_{g\star\mathcal{G}}$ denotes executing the graph operator $g\star$ on the specified graph network $\mathcal{G}$; and $\odot$ is the Hadamard product.
As a further step, the feature information captured by the different graph operator networks is aggregated through an integrator:

$$H = \mathrm{FNN}^{\tau}_{d_o}\big( \mathrm{GGRU}_1(X), \ldots, \mathrm{GGRU}_I(X) \big),$$

where $\mathrm{GGRU}_i$ denotes a GRU unit with the $i$-th graph operator embedded, $I$ is the number of graph operator types, $\mathrm{FNN}_{d_o}$ is a feed-forward neural network with output dimension $d_o$, and $\tau$ indicates that the tanh activation function is used in its last layer.
As a further step, the input of the encoder is the historical observations $[X^{(t-T'+1)}, \ldots, X^{(t)}]$ and the output of the decoder is the future predictions $[X^{(t+1)}, \ldots, X^{(t+T)}]$.
Compared with the prior art, the above technical solution of the present invention has the following advantages: the invention adopts a deep learning model that combines multiple graph operator networks to model the spatial dependencies between nodes from multiple perspectives, while the integrator allows the information captured by the multiple graph operators to be aggregated effectively. This strengthens the model's understanding of the underlying dependencies between multivariate time series and improves prediction accuracy.
Brief Description of the Drawings

Fig. 1 is a structure diagram of a parallel, single-input single-output integrator;
Fig. 2 is a structure diagram of a parallel, multi-input multi-output integrator;
Fig. 3 is a structure diagram of a parallel, single-input single-output integrator with a residual-like connection;
Fig. 4 is a structure diagram of a parallel, multi-input multi-output integrator with a residual-like connection;
Fig. 5 is a structure diagram of a serial, single-input single-output integrator;
Fig. 6 is a structure diagram of a serial, multi-input multi-output integrator;
Fig. 7 is a structure diagram of a serial, single-input single-output integrator with a residual-like connection;
Fig. 8 is a structure diagram of a serial, multi-input multi-output integrator with a residual-like connection;
Fig. 9 is a structure diagram of the GGRU model;
Fig. 10 is a diagram of the encoder-decoder architecture.
Detailed Description of the Embodiments

In order to make the purpose, technical solution and advantages of the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it; that is, the described embodiments are only some of the embodiments of the present application, not all of them.

Therefore, the following detailed description of the embodiments of the application provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely represents selected embodiments of the application. Based on the embodiments of the present application, all other embodiments obtained by those skilled in the art without creative effort fall within the scope of protection of the present application.
The present application provides a spatial time series prediction method based on a recurrent graph operator neural network, which specifically comprises the following steps.

Step 1: model the traffic road network to obtain a graph structure model, abstracting the sensors in the traffic road network as nodes of the graph structure model.
Specifically, let $X^{(t)} \in \mathbb{R}^{N \times P}$ denote the traffic flow data collected by all sensors at time $t$, where $N$ is the number of sensors and $P$ is the number of traffic indicators measured by each sensor. Let $[X^{(t-T'+1)}, \ldots, X^{(t)}]$ denote the historical observations over $T'$ timestamps and $[X^{(t+1)}, \ldots, X^{(t+T)}]$ the predicted values for the next $T$ timestamps. The neural network then learns the mapping on the graph structure model $\mathcal{G}$:

$$[X^{(t+1)}, \ldots, X^{(t+T)}] = f\big([X^{(t-T'+1)}, \ldots, X^{(t)}];\, \mathcal{G}\big),$$

where $f$ is the neural network to be fitted.
Step 2: normalize the data collected by the sensors:

$$X^{*} = \frac{X - E(X)}{\sqrt{D(X)}},$$

where $X$ denotes the training-set samples, $X^{*}$ the normalized data, $E(X)$ the training-set mean, and $D(X)$ the training-set variance.
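A minimal sketch of this normalization step is given below, assuming NumPy arrays and global training-set statistics (the exact array layout is not specified in the text). Keeping the statistics allows predictions to be mapped back to the original scale afterwards.

```python
import numpy as np

def zscore_normalize(train: np.ndarray, data: np.ndarray):
    # statistics are computed on the training split only, as the text specifies
    mean, std = train.mean(), train.std()
    return (data - mean) / std, (mean, std)

def zscore_denormalize(data_norm: np.ndarray, stats):
    # invert the transform to report predictions in the original units
    mean, std = stats
    return data_norm * std + mean
```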
Step 3: construct several graph operator networks as feature capturers for the data, each aggregating the information of the nodes in the graph structure model in a different way.

Specifically, the invention includes two types of graph operators, which aggregate the information of the nodes in the graph structure model in different ways and are embedded into the GRU unit as a replacement for its original linear units.
Step 3.1: construct the graph diffusion convolution operator (DC Operator), a graph operator network based on a static graph.

Step 3.1.1: the graph diffusion convolution operator adopts a random walk strategy:

$$H_{:,q} = \sum_{p=1}^{P} \sum_{s=0}^{S-1} \left( \Theta_{O[q,p,s]} \left(D_{O}^{-1} W\right)^{s} + \Theta_{I[q,p,s]} \left(D_{I}^{-1} W^{\top}\right)^{s} \right) X_{:,p},$$

where $X$ is the input of the model and $H$ its output; the diffusion process is truncated after a finite number of steps $S$; $D_{O}$ and $D_{I}$ are the out-degree and in-degree matrices of the graph, and $D_{O}^{-1}W$ and $D_{I}^{-1}W^{\top}$ are the forward and reverse state transition matrices, whose powers give the probability of diffusing $s$ steps from the $i$-th node to its neighbours in the forward and reverse directions, respectively; $\alpha, \beta \in [0,1]$ are the restart probabilities of the random walk; $\Theta$ denotes the model parameters, with $\Theta_{O[q,p,s]} = \alpha(1-\alpha)^{s}$ and $\Theta_{I[q,p,s]} = \beta(1-\beta)^{s}$.

Step 3.1.2: the adjacency matrix $W$ is obtained with a distance-based Gaussian kernel:

$$W_{ij} = \exp\left( -\frac{\operatorname{dist}(v_i, v_j)^2}{\sigma^2} \right),$$

where $\operatorname{dist}(v_i, v_j)$ is the distance between sensors $v_i$ and $v_j$ and $\sigma$ is the standard deviation of the set of distances.
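The following PyTorch sketch illustrates one way the DC Operator and the Gaussian-kernel adjacency could be implemented. The module interface, tensor shapes and the use of freely learnable filters Θ (rather than the closed form with the restart probabilities α, β given above) are assumptions made for illustration, not the patent's own code.

```python
import torch
import torch.nn as nn

class DCOperator(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, num_steps: int):
        super().__init__()
        self.num_steps = num_steps  # S: finite truncation of the diffusion
        # one (in_dim x out_dim) filter per diffusion direction and per step s = 0..S-1
        self.theta = nn.Parameter(torch.randn(2 * num_steps, in_dim, out_dim) * 0.01)

    def forward(self, x: torch.Tensor, W: torch.Tensor) -> torch.Tensor:
        # x: (N, in_dim) node features, W: (N, N) weighted adjacency matrix
        P_fwd = W / W.sum(dim=1, keepdim=True).clamp(min=1e-8)          # D_O^{-1} W
        P_bwd = W.t() / W.t().sum(dim=1, keepdim=True).clamp(min=1e-8)  # D_I^{-1} W^T
        out = x.new_zeros(x.size(0), self.theta.size(-1))
        k = 0
        for P in (P_fwd, P_bwd):        # forward and reverse diffusion
            h = x
            for _ in range(self.num_steps):   # truncated random-walk steps
                out = out + h @ self.theta[k]
                h = P @ h
                k += 1
        return out

def gaussian_adjacency(dist: torch.Tensor) -> torch.Tensor:
    # W_ij = exp(-dist(v_i, v_j)^2 / sigma^2), sigma = std of the pairwise distances
    sigma = dist.std()
    return torch.exp(-(dist ** 2) / (sigma ** 2))
```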
Step 3.2: construct the graph gated attention operator (GA Operator), a graph operator network based on a dynamic graph.

Step 3.2.1: a $K$-head attention mechanism is used in the graph gated attention operator, so every node $i$ has a $K$-dimensional gating vector $g_i$:

$$g_i = \mathrm{FC}^{\sigma}_{\theta_g}\!\left( x_i \oplus \max_{j \in \mathcal{N}_i}\!\big(\mathrm{FC}_{\theta_m}(z_j)\big) \oplus \frac{1}{|\mathcal{N}_i|}\sum_{j \in \mathcal{N}_i} z_j \right),$$

where $\mathcal{N}_i$ is the set of all neighbour nodes of node $i$; $x_i = X_{i,:}$ is the feature vector of node $i$; $z_j = X_{j,:}$ are the reference vectors of the neighbours of node $i$; $\max$ is taken element-wise; and $\mathrm{FC}^{\sigma}_{\theta_g}$ maps the result to $K$ dimensions and scales it to $[0,1]$.

Step 3.2.2: the attention weight matrix between nodes is obtained dynamically:

$$w^{(k)}_{i,j} = \frac{\exp\!\big(\langle \mathrm{FC}_{\theta_{xa}}(x_i),\, \mathrm{FC}_{\theta_{za}}(z_j) \rangle\big)}{\sum_{l \in \mathcal{N}_i} \exp\!\big(\langle \mathrm{FC}_{\theta_{xa}}(x_i),\, \mathrm{FC}_{\theta_{za}}(z_l) \rangle\big)},$$

where $\mathrm{FC}_{\theta_{xa}}$ and $\mathrm{FC}_{\theta_{za}}$ are linear transformations with parameters $\theta_{xa}$ and $\theta_{za}$, respectively.

Step 3.2.3: the output vector $y_i$ of node $i$ is obtained as

$$y_i = \mathrm{FC}_{\theta_o}\!\left( x_i \oplus \big\Vert_{k=1}^{K}\, g^{(k)}_i \sum_{j \in \mathcal{N}_i} w^{(k)}_{i,j}\, \mathrm{FC}_{\theta^{(k)}_v}(z_j) \right).$$
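Below is a hedged PyTorch sketch of a K-head gated attention operator in the spirit of step 3.2. The head dimensions, the use of raw neighbour features for the max/mean summaries, and masking via a 0/1 adjacency matrix that includes self-loops are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

class GAOperator(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, num_heads: int, head_dim: int):
        super().__init__()
        self.K, self.dh = num_heads, head_dim
        self.fc_query = nn.Linear(in_dim, num_heads * head_dim)   # role of theta_xa
        self.fc_key   = nn.Linear(in_dim, num_heads * head_dim)   # role of theta_za
        self.fc_value = nn.Linear(in_dim, num_heads * head_dim)   # role of theta_v
        self.fc_gate  = nn.Linear(3 * in_dim, num_heads)          # K-dimensional gate g_i
        self.fc_out   = nn.Linear(in_dim + num_heads * head_dim, out_dim)  # role of theta_o

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (N, in_dim) node features; adj: (N, N) 0/1 adjacency with self-loops
        N = x.size(0)
        q   = self.fc_query(x).view(N, self.K, self.dh)
        key = self.fc_key(x).view(N, self.K, self.dh)
        val = self.fc_value(x).view(N, self.K, self.dh)

        # per-head attention weights, restricted to each node's neighbourhood
        logits = torch.einsum('ikd,jkd->kij', q, key)              # (K, N, N)
        mask = (adj > 0).unsqueeze(0)                              # (1, N, N)
        w = torch.softmax(logits.masked_fill(~mask, float('-inf')), dim=-1)

        # gate from the node itself plus max- and mean-pooled neighbour summaries
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1e-8)
        neigh_mean = (adj @ x) / deg
        neigh_max = torch.where(mask[0].unsqueeze(-1), x.unsqueeze(0),
                                torch.full_like(x, float('-inf')).unsqueeze(0)).max(dim=1).values
        g = torch.sigmoid(self.fc_gate(torch.cat([x, neigh_max, neigh_mean], dim=-1)))  # (N, K)

        heads = torch.einsum('kij,jkd->ikd', w, val)               # (N, K, dh)
        gated = (g.unsqueeze(-1) * heads).reshape(N, -1)           # gate each head's output
        return self.fc_out(torch.cat([x, gated], dim=-1))
```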
Step 4: construct the GGRU unit by embedding each graph operator network into a GRU unit of the neural network (see Fig. 9). Here $\Gamma_{g\star\mathcal{G}}$ denotes applying a graph operator $g\star$ on the graph $\mathcal{G}$:

$$r^{(t)} = \sigma\big(\Gamma_{g\star\mathcal{G}}\big([X^{(t)}, H^{(t-1)}];\, \Theta_r\big)\big)$$
$$u^{(t)} = \sigma\big(\Gamma_{g\star\mathcal{G}}\big([X^{(t)}, H^{(t-1)}];\, \Theta_u\big)\big)$$
$$C^{(t)} = \tanh\big(\Gamma_{g\star\mathcal{G}}\big([X^{(t)}, r^{(t)} \odot H^{(t-1)}];\, \Theta_C\big)\big)$$
$$H^{(t)} = u^{(t)} \odot H^{(t-1)} + \big(1 - u^{(t)}\big) \odot C^{(t)}$$

where $X^{(t)}$ and $H^{(t)}$ are the input and output at timestamp $t$; $r^{(t)}$ and $u^{(t)}$ are the reset gate and update gate at timestamp $t$; $\Theta_r$, $\Theta_u$, $\Theta_C$ are different filter parameters; $\Gamma_{g\star\mathcal{G}}$ denotes executing the graph operator $g\star$ on the specified graph network $\mathcal{G}$; and $\odot$ is the Hadamard product.
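A possible PyTorch sketch of the GGRU cell follows; exactly how the graph operator consumes the concatenation [X, H] is an assumption based on the update equations above. The factory argument lets either operator sketch above (DCOperator or GAOperator) be plugged in.

```python
import torch
import torch.nn as nn

class GGRUCell(nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int, make_graph_op):
        super().__init__()
        # three graph operators play the roles of the filters Theta_r, Theta_u, Theta_C
        self.op_r = make_graph_op(in_dim + hidden_dim, hidden_dim)
        self.op_u = make_graph_op(in_dim + hidden_dim, hidden_dim)
        self.op_c = make_graph_op(in_dim + hidden_dim, hidden_dim)

    def forward(self, x: torch.Tensor, h: torch.Tensor, graph: torch.Tensor) -> torch.Tensor:
        # x: (N, in_dim) input at timestamp t; h: (N, hidden_dim) previous hidden state
        xh = torch.cat([x, h], dim=-1)
        r = torch.sigmoid(self.op_r(xh, graph))                          # reset gate r^(t)
        u = torch.sigmoid(self.op_u(xh, graph))                          # update gate u^(t)
        c = torch.tanh(self.op_c(torch.cat([x, r * h], dim=-1), graph))  # candidate C^(t)
        return u * h + (1.0 - u) * c                                     # new state H^(t)

# Example with hypothetical dimensions, wrapping the diffusion-convolution sketch:
# cell = GGRUCell(2, 64, lambda i, o: DCOperator(i, o, num_steps=2))
```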
Step 5: aggregate the feature information captured by the different graph operator networks through an integrator:

$$H = \mathrm{FNN}^{\tau}_{d_o}\big( \mathrm{GGRU}_1(X), \ldots, \mathrm{GGRU}_I(X) \big),$$

where $\mathrm{GGRU}_i$ denotes a GRU unit with the $i$-th graph operator embedded, $I$ is the number of graph operator types, $\mathrm{FNN}_{d_o}$ is a feed-forward neural network with output dimension $d_o$, and $\tau$ indicates that the tanh activation function is used in its last layer. Figures 1-8 show eight aggregator structures.
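The sketch below shows one plausible integrator, the parallel concatenation variant (cf. Fig. 1); the patent describes eight aggregator structures (Figs. 1-8), so this is only one of the possible choices.

```python
import torch
import torch.nn as nn

class Integrator(nn.Module):
    """Parallel integrator: run I operator-specific GGRU cells side by side,
    concatenate their new states and fuse them with a tanh feed-forward layer."""

    def __init__(self, cells, hidden_dim: int, out_dim: int):
        super().__init__()
        self.cells = nn.ModuleList(cells)                       # one GGRU cell per graph operator
        self.fnn = nn.Linear(len(cells) * hidden_dim, out_dim)  # FNN with output dimension d_o

    def forward(self, x, hidden_states, graphs):
        # hidden_states / graphs: one entry per cell
        new_states = [cell(x, h, g) for cell, h, g in zip(self.cells, hidden_states, graphs)]
        fused = torch.tanh(self.fnn(torch.cat(new_states, dim=-1)))  # tanh on the last layer
        return fused, new_states
```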
Step 6: use multiple integrators to construct a sequence-to-sequence encoder-decoder architecture that predicts spatial multivariate time series (see Fig. 10).

Specifically, the input of the encoder is the historical observations $[X^{(t-T'+1)}, \ldots, X^{(t)}]$ and the output of the decoder is the future predictions $[X^{(t+1)}, \ldots, X^{(t+T)}]$.
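A rough sketch of the sequence-to-sequence use of the integrators is given below, reusing the Integrator interface sketched above. The zero "go" frame, the output projection layer and the absence of scheduled sampling are assumptions; the patent only specifies the encoder and decoder inputs and outputs.

```python
import torch
import torch.nn as nn

class Seq2SeqForecaster(nn.Module):
    def __init__(self, encoder, decoder, fused_dim: int, p_dim: int):
        super().__init__()
        self.encoder, self.decoder = encoder, decoder   # two Integrator instances
        self.proj = nn.Linear(fused_dim, p_dim)         # map fused features back to the P indicators

    def forward(self, history, graphs, horizon: int, init_states):
        # history: (T', N, P) normalized observations
        states = init_states
        for x in history:                               # encode the T' historical frames
            _, states = self.encoder(x, states, graphs)
        preds, x = [], torch.zeros_like(history[0])     # start the decoder from a zero "go" frame
        for _ in range(horizon):                        # roll out T future frames
            fused, states = self.decoder(x, states, graphs)
            x = self.proj(fused)
            preds.append(x)
        return torch.stack(preds)                       # (T, N, P)
```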
In this embodiment, the Ubuntu system is used as the development environment, Python is the development language, and the framework is built with PyTorch. The spatial time series prediction method based on a recurrent graph operator neural network of the present invention is applied to traffic flow forecasting: the graph structure model of the traffic road network is obtained from the public METR-LA dataset, and the dataset is divided into training, validation and test sets in a 7:1:2 ratio. To compare algorithm performance, the method is evaluated against commonly used multivariate time series forecasting methods, including the statistical learning method ARIMA, the machine learning method LSVR, the deep learning method FC-LSTM, and the graph-based deep learning methods DCRNN, STGCN, GaAN, ASTGCN and GMAN. Several evaluation metrics, including MAE, RMSE and MAPE, are used to evaluate model performance comprehensively:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| x_i - \hat{x}_i \right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( x_i - \hat{x}_i \right)^2}, \qquad \mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n} \left| \frac{x_i - \hat{x}_i}{x_i} \right|,$$

where $x_i$ denotes the actual data and $\hat{x}_i$ the data predicted by the model.
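For reference, these metrics can be computed as follows (standard definitions; the small epsilon guard in MAPE is an implementation assumption):

```python
import numpy as np

def mae(x, x_hat):
    return np.mean(np.abs(x - x_hat))

def rmse(x, x_hat):
    return np.sqrt(np.mean((x - x_hat) ** 2))

def mape(x, x_hat, eps=1e-8):
    return np.mean(np.abs(x - x_hat) / (np.abs(x) + eps))
```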
The comparison results are shown in Table 1, where the best score for each metric among all algorithms is shown in bold.

Table 1: Performance comparison of different traffic speed prediction models on the METR-LA dataset
Table 1 compares the prediction results of the iGoRNN model with the other baseline models. Comparing the scores of each model on long-term and short-term time series forecasting shows that graph-based neural network models achieve better prediction accuracy. For the smoother PeMS-BAY series, some traditional models achieve good results even in short-term forecasting (15 minutes) but perform poorly in long-term forecasting (60 minutes). For the unstable METR-LA series, traditional models perform poorly in both short-term and long-term forecasting. The GaAN model, on the other hand, performs well on the METR-LA dataset, while GMAN is better suited to smoother series. All of this illustrates the importance of introducing graph structure into complex time series forecasting. Moreover, the model designed by the present invention shows the best or second-best performance in both long-term and short-term forecasting, especially for the more complex series, because it combines the advantages of multiple graph structures.
In summary, compared with other state-of-the-art methods, the spatial time series prediction method based on a recurrent graph operator neural network proposed by the present invention has better performance and robustness, and can adapt to more complex multivariate time series forecasting problems.
The foregoing descriptions of specific exemplary embodiments of the present invention are presented for purposes of illustration and description. They are not intended to limit the invention to the precise forms disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain the specific principles of the invention and its practical application, thereby enabling others skilled in the art to make and use various exemplary embodiments of the invention, as well as various alternatives and modifications thereof. It is intended that the scope of the invention be defined by the claims and their equivalents.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310630846.2A CN116610919A (en) | 2023-05-31 | 2023-05-31 | A Spatial Time Series Forecasting Method Based on Recurrent Graph Operator Neural Network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310630846.2A CN116610919A (en) | 2023-05-31 | 2023-05-31 | A Spatial Time Series Forecasting Method Based on Recurrent Graph Operator Neural Network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116610919A true CN116610919A (en) | 2023-08-18 |
Family
ID=87679794
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310630846.2A Pending CN116610919A (en) | 2023-05-31 | 2023-05-31 | A Spatial Time Series Forecasting Method Based on Recurrent Graph Operator Neural Network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116610919A (en) |
- 2023-05-31: CN CN202310630846.2A filed, published as CN116610919A (en), status: Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wan et al. | CTS-LSTM: LSTM-based neural networks for correlated time series prediction | |
Mallick et al. | Transfer learning with graph neural networks for short-term highway traffic forecasting | |
CN110851782B (en) | Network flow prediction method based on lightweight space-time deep learning model | |
CN115376317B (en) | A Traffic Flow Prediction Method Based on Dynamic Graph Convolution and Temporal Convolutional Network | |
Nie et al. | Network traffic prediction based on deep belief network and spatiotemporal compressive sensing in wireless mesh backbone networks | |
CN113053115A (en) | Traffic prediction method based on multi-scale graph convolution network model | |
CN106205126A (en) | Large-scale Traffic Network based on convolutional neural networks is blocked up Forecasting Methodology and device | |
CN110263280A (en) | A kind of dynamic link predetermined depth model and application based on multiple view | |
Gao et al. | Incorporating intra-flow dependencies and inter-flow correlations for traffic matrix prediction | |
Zhang et al. | Multistep speed prediction on traffic networks: A graph convolutional sequence-to-sequence learning approach with attention mechanism | |
CN116504060B (en) | Diffusion diagram attention network traffic flow prediction method based on Transformer | |
CN117175588A (en) | Electricity load forecasting method and device based on spatiotemporal correlation | |
CN113112791A (en) | Traffic flow prediction method based on sliding window long-and-short term memory network | |
CN116192669A (en) | Network flow prediction method based on dynamic space-time diagram convolution | |
CN115512545A (en) | A Traffic Speed Prediction Method Based on Spatiotemporal Dynamic Graph Convolutional Network | |
Wang et al. | Adaptive multi-receptive field spatial-temporal graph convolutional network for traffic forecasting | |
CN116504075A (en) | Spatio-temporal traffic speed prediction method and system based on fusion of attention and multivariate graph convolution | |
Zhan et al. | Neural networks for geospatial data | |
CN116862061A (en) | A multi-airport flight delay prediction method based on spatiotemporal graph convolutional neural network | |
Wang et al. | A novel time efficient machine learning-based traffic flow prediction method for large scale road network | |
CN117495902A (en) | A ship trajectory prediction method based on local anonymous spatiotemporal self-attention network | |
CN117290706A (en) | Traffic flow prediction method based on space-time convolution fusion probability sparse attention mechanism | |
CN118585811A (en) | A traffic flow prediction method based on dynamic space-time graph and neural differential equation | |
CN118228894A (en) | Od traffic flow matrix prediction method and system | |
CN117095227A (en) | Convolutional neural network training method based on non-intersection differential privacy federal learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |