CN115659609A - A Noise Prediction Method for Chemical Industry Park Based on DTW-DCRNN - Google Patents

Info

Publication number
CN115659609A
Authority
CN
China
Prior art keywords
matrix
noise
sequence
prediction
diffusion
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211238200.1A
Other languages
Chinese (zh)
Inventor
陈赓
曾庆田
梁宇
段华
姚文静
张煜东
周玉祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University of Science and Technology
Original Assignee
Shandong University of Science and Technology
Application filed by Shandong University of Science and Technology
Priority to CN202211238200.1A
Publication of CN115659609A

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a DTW-DCRNN-based noise prediction method for chemical industry parks. It belongs to the field of signal and information processing and addresses long-term, global noise prediction in a chemical industry park. Dynamic time warping (DTW) is introduced into the diffusion convolutional recurrent neural network (DCRNN) to build a spatio-temporal noise prediction network model composed of an improved DTW method, the DCRNN, and Kalman filtering. The DTW algorithm is improved with a penalty coefficient and used to reconstruct the spatial relationships fed to the neural network; Kalman filtering, combined with traffic flow characteristics, dynamically adjusts the network's noise predictions. The method first reconstructs spatial relationships from time-series similarity, then feeds the data of each station into the model for prediction and dynamic correction. This enables multi-station, long-term, and global noise level prediction, so that the risks of noise nuisance to residents and of damage to health can be mitigated in advance.

Description

A Noise Prediction Method for Chemical Industry Park Based on DTW-DCRNN

Technical field

The invention belongs to the field of signal and information processing, and in particular relates to a DTW-DCRNN-based noise prediction method for chemical industry parks.

Background

As industry continues to develop and cities grow rapidly, both socially and economically, environmental noise pollution has become increasingly severe, and the main shortcomings of noise pollution prevention and control have become more apparent. Neural network research, meanwhile, continues to advance, so a neural-network-based method for long-term prediction of environmental noise is urgently needed.

The rise of the "smart chemical park" concept has laid a good foundation for collecting data accurately and effectively. The smart-park infrastructure of the Dongming Engineering Plastics Industrial Park is relatively mature, with complete monitoring data in sufficient volume, which provides a sound data basis for research on noise prediction in chemical industry parks. Modelling, analysing, and predicting the data with deep learning, and mining the regularities in the noise data, helps park managers take preventive measures ahead of periods with high noise levels, for example having operators wear noise-cancelling headphones during specified periods, installing noise barriers in advance, or siting excessively noisy production equipment away from residential areas. Through predictive reminders and early-warning analysis, such measures give operators a relatively safe working environment and also reduce the impact of noise on residents living near the park. In addition, noise levels differ across locations within the park, so global prediction of environmental noise is also important for occupational hazard assessment and environmental risk prediction in the park.

Summary of the invention

To address these problems, the invention proposes a DTW-DCRNN-based noise prediction method for chemical industry parks. Taking a spatio-temporal prediction perspective, it builds a spatio-temporal noise prediction network model composed of an improved dynamic time warping (DTW) algorithm, a diffusion convolutional recurrent neural network (DCRNN), and Kalman filtering, in order to make long-term, global predictions of noise in chemical industry parks.

To achieve the above object, the invention adopts the following technical solution:

A DTW-DCRNN-based noise prediction method for chemical industry parks uses an improved DTW algorithm to reconstruct spatial relationships through the graph structure of the monitoring stations, achieving long-term spatio-temporal prediction of noise in a chemical industry park. It comprises the following steps:

Step 1. Collect and process monitoring station information and vehicle information data in the chemical industry park;

Step 2. Improve the original DTW algorithm by introducing a penalty coefficient, incorporate it into the DCRNN, and combine the result with Kalman filtering to build the noise prediction network model;

Step 3. Train the noise prediction network model and output the trained model;

Step 4. Collect monitoring station information in real time and predict noise in real time with the trained model.

Further, in step 1, a barrier gate and several monitoring stations are set up in the chemical industry park to collect data in real time, and the location of each monitoring station is recorded alongside the data. The barrier gate and every monitoring station are equipped with sensors. The barrier gate collects information on vehicles entering and leaving, including traffic flow characteristics such as licence plate number, entry and exit times, and vehicle type; the monitoring stations collect noise and natural environment data within the park in real time;

Station location data, noise and natural environment data, and vehicle entry and exit data are all sent to a gateway device and then exchanged with a database server. When noise prediction is performed, the data are extracted in real time and processed using the 3σ criterion.

Further, the specific process of step 2 is as follows:

Step 2.1. Improve the original DTW algorithm by introducing a penalty coefficient, and calculate the sequence similarity distance;

The principle of the improved DTW algorithm is to find the optimal path and the number of common subsequences, compute the penalty coefficient from them, and use the penalty coefficient to compute the similarity distance between stations. The specific process is as follows:

Step 2.1.1. Calculate the distance matrix;

Suppose there are two noise sequences S and T, as in formula (1):

$$S = (s_1, s_2, \ldots, s_o), \qquad T = (t_1, t_2, \ldots, t_m) \tag{1}$$

Sequence S has length o and sequence T has length m, with $o, m \in \mathbb{Z}^+$. A distance matrix d[s][t] of size m×o is built from the noise sequences, in which the element at $(s_i, t_j)$ is the distance between the i-th point $s_i$ ($i = 1, 2, \ldots, o$) of S and the j-th point $t_j$ ($j = 1, 2, \ldots, m$) of T. The distance used is the squared Euclidean distance $dis(s_i, t_j) = (s_i - t_j)^2$, and the matrix distances are calculated according to formula (2):

[Formula (2): equation image not reproduced in the source text]
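
As a rough illustration of step 2.1.1, the sketch below builds the pairwise squared-difference matrix for two short sequences and then accumulates it. The function names `local_distance_matrix` and `accumulated_cost`, the array layout (rows index S, columns index T), and the use of the textbook DTW accumulation rule are all assumptions; the image for formula (2) is not available in the text, so the patent's rule may differ in detail.

```python
import numpy as np

def local_distance_matrix(s, t):
    """Pairwise squared differences dis(s_i, t_j) = (s_i - t_j)^2."""
    s = np.asarray(s, dtype=float)
    t = np.asarray(t, dtype=float)
    return (s[:, None] - t[None, :]) ** 2

def accumulated_cost(d):
    """Textbook DTW accumulation (assumed form, not taken from the patent)."""
    o, m = d.shape
    acc = np.full((o, m), np.inf)
    acc[0, 0] = d[0, 0]
    for i in range(o):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            # Cheapest of the three admissible predecessor cells.
            best_prev = min(
                acc[i - 1, j] if i > 0 else np.inf,
                acc[i, j - 1] if j > 0 else np.inf,
                acc[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
            )
            acc[i, j] = d[i, j] + best_prev
    return acc

d = local_distance_matrix([1.0, 2.0, 3.0], [1.0, 3.0])
acc = accumulated_cost(d)
```

The last cell of `acc` holds the total cost of the cheapest alignment, which is the quantity the path search in the next step traces back through.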

Step 2.1.2. Find the optimal path through the matrix;

The optimal path is the path with the smallest accumulated distance that satisfies the boundary condition. The boundary condition is that the path starts from the upper-right corner of the matrix, d[o][m], repeatedly moves to the smallest of the three neighbouring points to the lower left, and ends at the lower-left corner, d[0][0]. The path must also be continuous and monotonic: no point may be skipped or omitted during matching, each point may only be aligned or matched with adjacent points, and the temporal order of the sequences must not change;
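
The path search described above can be sketched as a simple backtracking loop. This is a minimal illustration that assumes the input is an already accumulated cost matrix stored as a NumPy array with the start of both sequences at index (0, 0); the function name `backtrack_path` and the tie-breaking order are hypothetical choices.

```python
import numpy as np

def backtrack_path(acc):
    """Walk from the last cell to (0, 0), always moving to the cheapest
    of the three neighbouring predecessor cells."""
    i, j = acc.shape[0] - 1, acc.shape[1] - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1, j], i - 1, j))
        if j > 0:
            candidates.append((acc[i, j - 1], i, j - 1))
        # Ties are broken by the smaller indices (an arbitrary choice).
        _, i, j = min(candidates)
        path.append((i, j))
    return path[::-1]

acc = np.array([[0.0, 4.0],
                [1.0, 1.0],
                [5.0, 1.0]])
path = backtrack_path(acc)
```

Because each move decreases an index by at most one and never increases it, the returned path is continuous and monotonic, as the step requires.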

Step 2.1.3. Calculate the number of common subsequences and the length of the optimal path sequence, then derive the weights and the penalty coefficient;

The length of the optimal path sequence is the length of the optimal path obtained in step 2.1.2;

The number of common subsequences is computed as follows. First create an empty list, record. Then traverse sequences s1 and s2 in a loop; whenever a common sequence appears, store its length (ls1 or ls2) and its position g in record, and extract the common subsequence from s1 into the array s_sum. When the traversal finishes, the length of s_sum is the number of common subsequences. A threshold of 1 is applied when counting: only subsequences longer than 1 count as common subsequences. The resulting total is the required number of common noise subsequences;
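
The counting procedure leaves some details open, so the following is only one possible reading: a common subsequence is taken to be a maximal run of consecutive positions at which s1 and s2 hold the same value. The names `record`, `s_sum`, and `g` follow the text; the function name, the `min_len` parameter (2, meaning "longer than 1"), and the `tol` tolerance are assumptions added for illustration.

```python
def common_subsequences(s1, s2, min_len=2, tol=0.0):
    record = []   # (length, start position g) of each common run
    s_sum = []    # the runs themselves, taken from s1
    start = None
    n = min(len(s1), len(s2))
    # One extra iteration (g == n) closes a run that reaches the end.
    for g in range(n + 1):
        same = g < n and abs(s1[g] - s2[g]) <= tol
        if same and start is None:
            start = g
        elif not same and start is not None:
            length = g - start
            if length >= min_len:
                record.append((length, start))
                s_sum.append(list(s1[start:g]))
            start = None
    return record, s_sum

record, s_sum = common_subsequences(
    [50, 52, 52, 60, 61, 61, 61],
    [50, 52, 53, 60, 61, 61, 70],
)
num_common = len(s_sum)
```

Here the two sequences share a run of length 2 starting at position 0 and a run of length 3 starting at position 3, so `num_common` is 2.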

The weight w is calculated by formula (3):

[Formula (3): equation image not reproduced in the source text]

where subseq is the length of a common noise subsequence and seq is the length of the optimal path sequence;

The penalty coefficient α is calculated by formula (4), where x is the number of common subsequences and $w_i$ is the weight of the i-th sequence point:

[Formula (4): equation image not reproduced in the source text]

Step 2.1.4. Denote the optimal path by $(r_1, r_2, \ldots, r_{seq})$, where $r_i$ ($i = 1, 2, \ldots, seq$) is the value of the i-th point in the optimal path sequence. The improved sequence similarity distance $dis_{DTW}$ is the penalty coefficient multiplied by the sum of the values along the optimal path computed by the original DTW:

$$dis_{DTW} = \alpha \sum_{i=1}^{seq} r_i$$

Step 2.2. Assemble the noise sequence similarity distances between monitoring stations into a matrix, convert it into an adjacency matrix, and build the graph topology. The specific process is:

Step 2.2.1. From the noise sequence similarity distances of the monitoring stations, compute the inter-station similarity distance matrix M:

[Equation image not reproduced in the source text: the similarity distance matrix M]

where a is the number of monitoring stations;
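
Assembling M amounts to evaluating a distance for every pair of stations. The sketch below assumes the distance is symmetric and zero between a station and itself; `pairwise_distance_matrix` is a hypothetical helper, and `stand_in_distance` is a deliberately simple placeholder, not the improved DTW distance.

```python
import numpy as np

def pairwise_distance_matrix(series, distance_fn):
    """Fill an a x a matrix with distance_fn evaluated on every station pair."""
    a = len(series)
    M = np.zeros((a, a))
    for i in range(a):
        for j in range(i + 1, a):
            M[i, j] = M[j, i] = distance_fn(series[i], series[j])
    return M

def stand_in_distance(x, y):
    # Placeholder only: sum of squared differences, NOT the improved DTW distance.
    return float(np.sum((np.asarray(x, dtype=float) - np.asarray(y, dtype=float)) ** 2))

stations = [[50, 52, 55], [50, 53, 55], [60, 62, 61]]
M = pairwise_distance_matrix(stations, stand_in_distance)
```

In practice `distance_fn` would be replaced by a function returning the improved similarity distance for two noise sequences.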

Step 2.2.2. Calculate the standard deviation λ of the similarity distance matrix M, where $M_c$ is the value of the c-th element of the matrix, N is the number of elements in M, and $\mu_M$ is the mean of the matrix:

[Equation image not reproduced in the source text: the standard deviation λ of M]

Step 2.2.3. Analyse the similarity between monitoring stations using the similarity distance matrix M, and build the adjacency matrix $M_d$ from the standard deviation λ of all non-infinite entries of M. Each element of $M_d$ is calculated as follows:

[Equation image not reproduced in the source text: the elements of the adjacency matrix $M_d$]

where c is the index of the matrix element. The larger the similarity distance, the smaller the corresponding element of $M_d$. The threshold is set to 0.1: if an element is below the threshold, the similarity between the two stations is considered too low for them to influence each other, so they are treated as non-adjacent stations and their weight in the adjacency matrix is set to zero;
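
The exact expression for the elements of the adjacency matrix is not reproduced in the text. The sketch below therefore assumes a thresholded Gaussian kernel, exp(-M² / λ²), which is a common choice for distance-based graphs and matches the stated behaviour (larger distance, smaller weight; weights below 0.1 set to zero). The kernel, the use of the population standard deviation, and the function name `build_adjacency` are all assumptions.

```python
import numpy as np

def build_adjacency(M, threshold=0.1):
    M = np.asarray(M, dtype=float)
    finite = np.isfinite(M)
    lam = M[finite].std()          # std over non-infinite entries (population form, assumed)
    adj = np.zeros_like(M)
    # Assumed kernel: larger distance gives a smaller weight.
    adj[finite] = np.exp(-(M[finite] ** 2) / lam ** 2)
    adj[adj < threshold] = 0.0     # weak links are treated as non-adjacent
    return adj, lam

M = np.array([[0.0, 1.0, 4.0],
              [1.0, 0.0, np.inf],
              [4.0, np.inf, 0.0]])
adj, lam = build_adjacency(M)
```

With this input, stations 0 and 1 stay connected, while the pair at distance 4 falls below the threshold and the infinite-distance pair is never connected.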

Step 2.3. Introduce a graph convolutional network (GCN) to build the graph structure from the adjacency matrix, and feed this graph structure into the DCRNN for noise prediction to obtain a preliminary prediction. The specific process is as follows:

Propagation between GCN layers follows formula (12), where L denotes the L-th network layer:

$$H^{(L+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(L)} W^{(L)}\right) \tag{12}$$

where σ is a nonlinear activation function; $W^{(L)}$ is a trainable weight matrix; $\tilde{D}$ is the degree matrix, with entries $\tilde{D}_{II} = \sum_J \tilde{A}_{IJ}$, where I denotes row I and J denotes column J of the adjacency matrix A; $\tilde{A} = A + \mathbf{I}$ (with $\mathbf{I}$ the identity matrix) retains each station's own feature information; and H is the feature matrix extracted by the current layer, with H = X at the input layer;
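
A single GCN propagation step of this kind can be sketched in a few lines of NumPy. The sketch assumes the widely used symmetric normalisation with self-loops; the function name `gcn_layer` and the toy matrices are hypothetical.

```python
import numpy as np

def gcn_layer(A, H, W, activation=np.tanh):
    """One propagation step: activation(D~^-1/2 (A + I) D~^-1/2 H W)."""
    A_tilde = A + np.eye(A.shape[0])      # self-loops keep each station's own features
    deg = A_tilde.sum(axis=1)             # degree of each node, including the self-loop
    D_inv_sqrt = np.diag(deg ** -0.5)
    return activation(D_inv_sqrt @ A_tilde @ D_inv_sqrt @ H @ W)

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
X = np.array([[1.0, 0.0],
              [0.0, 1.0]])
W = np.eye(2)
H1 = gcn_layer(A, X, W, activation=lambda z: np.maximum(z, 0.0))
```

For this two-station toy graph the normalised adjacency is 0.5 everywhere, so each station's output is the average of its own and its neighbour's features.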

Building on the GCN, a DCRNN is used to model the spatio-temporal relationships of the noise sequences;

The stationary distribution of the diffusion process can be expressed as a weighted combination of infinite random walks on the graph. The diffusion process is given by formula (13) and can be computed in closed form:

$$\varepsilon = \sum_{k=0}^{\infty} \beta (1 - \beta)^k \left(D_O^{-1} W\right)^k \tag{13}$$

where W is the node similarity matrix, $D_O^{-1}$ is the inverse of the out-degree matrix, $\beta \in [0, 1]$ is the restart probability, k is the diffusion step, and ε represents the likelihood of diffusion from each node. The diffusion process in the DCRNN model is bidirectional;
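
The closed form mentioned above can be checked numerically. Assuming the random-walk-with-restart series β(1-β)^k (D_O^{-1}W)^k, the geometric series sums to β(I - (1-β)D_O^{-1}W)^{-1} whenever β is strictly between 0 and 1. The function names and the example matrix below are hypothetical, and every node is assumed to have non-zero out-degree.

```python
import numpy as np

def diffusion_closed_form(W, beta):
    P = W / W.sum(axis=1, keepdims=True)          # transition matrix D_O^{-1} W
    n = W.shape[0]
    return beta * np.linalg.inv(np.eye(n) - (1.0 - beta) * P)

def diffusion_truncated(W, beta, K):
    """Sum of the first K terms of the random-walk series."""
    P = W / W.sum(axis=1, keepdims=True)
    total = np.zeros_like(P)
    P_k = np.eye(W.shape[0])
    for k in range(K):
        total += beta * (1.0 - beta) ** k * P_k
        P_k = P_k @ P
    return total

W = np.array([[0.0, 1.0, 0.5],
              [1.0, 0.0, 0.2],
              [0.5, 0.2, 0.0]])
exact = diffusion_closed_form(W, beta=0.3)
approx = diffusion_truncated(W, beta=0.3, K=200)
```

Each row of the result sums to 1, so it can be read as a probability distribution over the stations reached from the starting station.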

Therefore, for the spatial relationship, the bidirectional diffusion convolution of the graph signal feature matrix $X$ with a filter $f_\theta$ is defined by formula (14):

$$X_{:,p} \star_G f_\theta = \sum_{k=0}^{K-1} \left(\theta_{k,1} \left(D_O^{-1} W\right)^k + \theta_{k,2} \left(D_I^{-1} W^{\top}\right)^k\right) X_{:,p}, \quad p \in \{1, \ldots, P\} \tag{14}$$
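
A bidirectional diffusion convolution over a single feature column can be sketched as follows. The sketch assumes the K-step truncated form with one coefficient per step and direction; the function name `diffusion_conv`, the layout of `theta` (shape K by 2), and the toy graph are hypothetical.

```python
import numpy as np

def diffusion_conv(W, x, theta):
    """theta has shape (K, 2): column 0 weights forward steps, column 1 reverse steps."""
    P_fwd = W / W.sum(axis=1, keepdims=True)        # D_O^{-1} W
    P_bwd = W.T / W.T.sum(axis=1, keepdims=True)    # D_I^{-1} W^T
    out = np.zeros_like(x, dtype=float)
    x_fwd = x.astype(float)
    x_bwd = x.astype(float)
    for k in range(theta.shape[0]):
        out += theta[k, 0] * x_fwd + theta[k, 1] * x_bwd
        # Advance both random walks by one step for the next term.
        x_fwd = P_fwd @ x_fwd
        x_bwd = P_bwd @ x_bwd
    return out

W = np.array([[0.0, 1.0],
              [1.0, 0.0]])
x = np.array([1.0, 0.0])
theta = np.array([[1.0, 0.0],
                  [0.5, 0.5]])
y = diffusion_conv(W, x, theta)
```

Applying the transition matrices repeatedly to the signal avoids forming explicit matrix powers, which keeps the cost proportional to K matrix-vector products per direction.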

where θ are the filter parameters, G denotes the graph, p indexes the p-th feature dimension, P is the total number of feature dimensions, K is the number of steps at which the diffusion process is truncated, $\star_G$ denotes diffusion convolution on graph G, $X_{:,p}$ is the p-th feature across all nodes, on which the convolution operates, $\theta_{k,1}$ are the kernel parameters for the out-degree direction, $\theta_{k,2}$ are the kernel parameters for the in-degree direction, and $D_O^{-1} W$ and $D_I^{-1} W^{\top}$ are the transition matrices of the diffusion process and the reverse diffusion process, respectively. With the convolution operation defined, a diffusion convolutional layer with a feature mapping is built as in formula (15):

$$H_{:,q} = \sigma\left(\sum_{p=1}^{P} X_{:,p} \star_G f_{\Theta_{q,p,:,:}}\right), \quad q \in \{1, \ldots, Q\} \tag{15}$$

where q indexes the q-th output feature, Q is the total number of mapped features, $H_{:,q}$ is the result of the diffusion convolution for the q-th feature across all nodes, the feature matrix $X$ is the input, the features $H$ extracted by the current layer are the output, $f_{\Theta_{q,p,:,:}}$ denotes the filter, Θ is the parameter tensor, and σ is the activation function;

For the temporal relationship, a GRU is combined with diffusion convolution to build the diffusion convolutional gated recurrent unit (DCGRU), expressed as formula (16):

$$\begin{aligned}
r^{(t)} &= \sigma\left(\Theta_r \star_G [X^{(t)}, H^{(t-1)}] + b_r\right) \\
u^{(t)} &= \sigma\left(\Theta_u \star_G [X^{(t)}, H^{(t-1)}] + b_u\right) \\
C^{(t)} &= \tanh\left(\Theta_C \star_G [X^{(t)}, (r^{(t)} \odot H^{(t-1)})] + b_C\right) \\
H^{(t)} &= u^{(t)} \odot H^{(t-1)} + (1 - u^{(t)}) \odot C^{(t)}
\end{aligned} \tag{16}$$

where $\star_G$ denotes diffusion convolution on graph G; $r^{(t)}$ is the reset gate at time t, with parameter tensor $\Theta_r$ and bias vector $b_r$; $X^{(t)}$ is the input at time t and $H^{(t-1)}$ is the output at time t-1; $u^{(t)}$ is the update gate at time t, with parameter tensor $\Theta_u$ and bias vector $b_u$; $C^{(t)}$ is the candidate hidden state at time t, with parameter tensor $\Theta_C$ and bias vector $b_C$; and $H^{(t)}$ is the output at time t;
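
To make the gating concrete, the sketch below runs one DCGRU step in NumPy, assuming the usual GRU gate structure with the matrix multiplications replaced by diffusion convolutions. All function names, the parameter dictionary keys, the tensor layout (one weight matrix per diffusion support), and the random toy data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def supports(W, K):
    """Powers 0..K-1 of the forward and reverse transition matrices."""
    P_f = W / W.sum(axis=1, keepdims=True)
    P_b = W.T / W.T.sum(axis=1, keepdims=True)
    mats = []
    for P in (P_f, P_b):
        P_k = np.eye(W.shape[0])
        for _ in range(K):
            mats.append(P_k)
            P_k = P_k @ P
    return mats

def dconv(mats, Z, Theta, b):
    """Diffusion convolution: sum over supports of S @ Z @ Theta[s], plus bias."""
    return sum(S @ Z @ Th for S, Th in zip(mats, Theta)) + b

def dcgru_step(mats, X_t, H_prev, params):
    XH = np.concatenate([X_t, H_prev], axis=1)
    r = sigmoid(dconv(mats, XH, params["Theta_r"], params["b_r"]))   # reset gate
    u = sigmoid(dconv(mats, XH, params["Theta_u"], params["b_u"]))   # update gate
    XrH = np.concatenate([X_t, r * H_prev], axis=1)
    C = np.tanh(dconv(mats, XrH, params["Theta_C"], params["b_C"]))  # candidate state
    return u * H_prev + (1.0 - u) * C

rng = np.random.default_rng(0)
n, P, Q, K = 3, 1, 4, 2          # stations, input features, hidden size, diffusion steps
W = np.array([[0.0, 1.0, 0.5],
              [1.0, 0.0, 0.2],
              [0.5, 0.2, 0.0]])
mats = supports(W, K)
params = {}
for gate in ("r", "u", "C"):
    params["Theta_" + gate] = rng.normal(scale=0.1, size=(len(mats), P + Q, Q))
    params["b_" + gate] = np.zeros(Q)
H = dcgru_step(mats, rng.normal(size=(n, P)), np.zeros((n, Q)), params)
```

Stacking such steps over the input window, and again over the forecast horizon, gives the encoder-decoder arrangement commonly used with this kind of cell.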

Step 2.4. Extract the traffic flow characteristics at the barrier gate for the current time in real time, and use Kalman filtering to dynamically adjust the noise value predicted by the DCRNN;

Kalman filtering uses the estimate of the previous state to predict the current state, then uses an observation of the current state to correct the value obtained in the prediction stage, yielding a new estimate that is closer to the true value. Concretely, the normalised traffic flow characteristics and the noise sequence go through prior estimation and covariance prediction; the Kalman gain for noise and drift is then computed and used to update the covariance matrix, and the predicted noise value for the current moment is corrected accordingly.
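
The predict-and-correct cycle can be illustrated with a scalar Kalman update. How the traffic flow characteristics enter the filter is not spelled out in the text, so this sketch makes a strong simplifying assumption: the network's prediction is the prior, and a hypothetical traffic-derived noise estimate plays the role of the observation. The function name and all numeric values are illustrative only.

```python
def kalman_correct(prior, prior_var, measurement, meas_var, process_var=0.0):
    pred_var = prior_var + process_var            # covariance prediction
    gain = pred_var / (pred_var + meas_var)       # Kalman gain
    posterior = prior + gain * (measurement - prior)   # corrected estimate
    posterior_var = (1.0 - gain) * pred_var       # covariance update
    return posterior, posterior_var, gain

# Network predicts 60 dB; the (hypothetical) traffic-based observation says 64 dB.
est, var, gain = kalman_correct(prior=60.0, prior_var=4.0,
                                measurement=64.0, meas_var=4.0)
```

With equal uncertainties the gain is 0.5, so the corrected estimate lands halfway between the prediction and the observation, and its variance is halved.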

Further, the specific process of step 3 is as follows:

Historical data from each monitoring station are extracted, and the improved DTW algorithm is used to reconstruct the spatio-temporal relationships in the data. The DTW-DCRNN noise prediction network model is then trained, with 70% of the dataset used as the training set, 10% as the validation set, and the final 20% as the test set. No training results are recorded during the first 60 epochs; from epoch 60 to epoch 100, training results are recorded and model parameters saved every 10 epochs; after epoch 100, this is done every 5 epochs. Finally, training is stopped early based on the measured validation loss, so that the model is captured just before it starts to overfit.
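
The data split, recording schedule, and early stopping can be sketched with three small helpers. The chronological (unshuffled) split, the handling of the boundary at epoch 60, and the `patience` value are assumptions; the text only states the percentages, the recording intervals, and that stopping is based on validation loss.

```python
def split_indices(n, train_pct=70, val_pct=10):
    """Chronological 70/10/20 split using integer arithmetic."""
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    return range(0, n_train), range(n_train, n_train + n_val), range(n_train + n_val, n)

def should_record(epoch):
    """epoch is 1-based; nothing is recorded before epoch 60 (assumed boundary)."""
    if epoch < 60:
        return False
    if epoch <= 100:
        return epoch % 10 == 0
    return epoch % 5 == 0

class EarlyStopper:
    def __init__(self, patience):
        self.patience = patience      # hypothetical hyperparameter
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Return True when validation loss has not improved for `patience` epochs."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

train_idx, val_idx, test_idx = split_indices(1000)
stopper = EarlyStopper(patience=2)
stops = [stopper.step(loss) for loss in (1.0, 1.1, 1.2)]
```

A training loop would call `should_record(epoch)` to decide when to save parameters and `stopper.step(val_loss)` after each validation pass.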

Beneficial technical effects of the invention:

Improving the original DTW algorithm with a penalty coefficient and building the adjacency matrix from the similarity distance matrix to represent spatial relationships compensates for the limitations of approaches based on raw physical distance. Predictions adjusted by the Kalman filter are closer to the true values and are compensatory in nature, and the 3σ noise processing helps prevent large deviations in the model's training and prediction results. The spatio-temporal noise prediction network model established by the invention, composed of the DTW method, the DCRNN, and Kalman filtering, supports multi-station, long-term, and global noise level prediction, so that the risks of noise nuisance to residents and of damage to health can be mitigated in advance.

Description of drawings

Fig. 1 is a flowchart of the DTW-DCRNN-based noise prediction method for chemical industry parks according to the invention;

Fig. 2 is a schematic diagram of the data collection process of the invention;

Fig. 3 is a schematic diagram of the noise prediction deep learning network model of the invention;

Fig. 4 is a schematic diagram of the improved DTW algorithm of the invention;

Fig. 5 is a flowchart of the calculation of the number of common subsequences in the invention;

Fig. 6 is a schematic structural diagram of the DCRNN model of the invention;

Fig. 7 compares the adjacency matrices in Experiment 1 of the invention, where (a) shows the adjacency matrix built by the original DTW algorithm and (b) shows the adjacency matrix built by the improved DTW algorithm;

Fig. 8 shows how the metrics vary with time step in Experiment 1 of the invention;

Fig. 9 compares the MAE and RMSE curves of the prediction model of the invention and the DCRNN model in Experiment 2 of the invention;

Fig. 10 compares the long-term MAPE curves of the prediction model of the invention and the DCRNN model in Experiment 2 of the invention.

Detailed description

The invention is described in further detail below with reference to the drawings and specific embodiments:

The invention addresses four problems in noise prediction for chemical industry parks. First, the noise data contain zero values that severely degrade prediction accuracy; the 3σ criterion is therefore used to filter out zero values from the monitored noise sequences, which lowers the unbiased sample standard deviation and in turn the prediction error. Second, noise prediction at a single monitoring station is unrelated to other stations and weights cannot be shared; data cannot be exchanged between single-station predictions and predicting each station separately is computationally expensive, so a graph neural network is used to share prediction weights and data across stations. Third, noise similarity between non-adjacent monitoring stations is currently left out of the calculation; to compute station similarity for both adjacent and non-adjacent stations, the DTW algorithm is used to reconstruct spatial relationships from time-series similarity. Fourth, noise is additive, yet factors that are highly correlated with noise cannot contribute to a purely spatio-temporal prediction; because Kalman filtering can adjust and correct predictions using other influencing factors, the invention uses it to adjust the predictions dynamically according to the characteristics of the relevant factors.

Specifically, DTW is introduced into the DCRNN to build a spatio-temporal noise prediction network model composed of the DTW method, the DCRNN, and Kalman filtering. The DTW algorithm is improved with a penalty coefficient and used to reconstruct the spatial relationships of the neural network; Kalman filtering, combined with traffic flow characteristics, dynamically adjusts the network's noise predictions. The method first reconstructs spatial relationships from time-series similarity, then feeds each station's data into the model for prediction and dynamic correction, achieving multi-station, long-term, and global noise level prediction, so that the risks of noise nuisance to residents and of damage to health can be mitigated in advance.

Building on the characteristics of the DTW algorithm and the DCRNN model, the proposed spatio-temporal noise prediction network model compensates for the limitations of constructing spatial relationships from distance alone, and makes effective use of the park's location information and the strengths of both methods to achieve more accurate noise prediction.

As shown in Fig. 1, a DTW-DCRNN-based noise prediction method for chemical industry parks comprises the following steps:

Step 1. Collect and process monitoring station information and vehicle information data in the chemical industry park.

A barrier gate and several monitoring stations are set up in the chemical industry park to collect data in real time, and the location of each monitoring station is recorded alongside the data. The barrier gate and every monitoring station are equipped with sensors. The barrier gate collects information on vehicles entering and leaving, including traffic flow characteristics such as licence plate number, entry and exit times, and vehicle type; the monitoring stations collect noise and natural environment data within the park in real time;

As shown in Fig. 2, station location data, noise and natural environment data, and vehicle entry and exit data are all sent to a gateway device and then exchanged with a database server. When noise prediction is performed, the data are extracted in real time and processed using the 3σ criterion.

The 3σ criterion first assumes that a set of data contains only random error. The standard deviation is computed and an interval is fixed by probability; any error falling outside this interval is treated as a gross error rather than a random error, and the data containing it should be removed or replaced. About 99.73% of noise values fall within the interval $noise \in (u - 3\sigma_{noise}, u + 3\sigma_{noise})$, where u is the mean noise level, $\sigma_{noise}$ is the standard deviation of the noise, and noise is the noise value. Noise values in the range $0 \le noise < u - 3\sigma_{noise}$ (dB) are replaced with the mean.

The raw noise sequences collected by the sensors contain sparse abrupt zero values. After processing with the 3σ criterion, the unbiased sample standard deviation is lower and there are fewer abrupt values, which can improve the accuracy of the neural network's noise prediction.
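
The replacement rule can be sketched directly in NumPy. The sketch assumes the mean and standard deviation are computed over the whole raw series (including the zero values) and that the unbiased sample standard deviation is used; the function name `replace_low_outliers` and the toy series are hypothetical.

```python
import numpy as np

def replace_low_outliers(noise):
    noise = np.asarray(noise, dtype=float)
    u = noise.mean()
    sigma = noise.std(ddof=1)                      # unbiased sample std (assumed)
    # Flag readings in the range 0 <= noise < u - 3*sigma.
    low = (noise >= 0) & (noise < u - 3 * sigma)
    cleaned = noise.copy()
    cleaned[low] = u                               # replace flagged readings with the mean
    return cleaned, low

raw = [60.0] * 29 + [0.0]                          # one sensor dropout among steady readings
cleaned, low = replace_low_outliers(raw)
```

Here the single zero reading lies below u - 3σ and is replaced by the series mean, while the steady readings are left untouched.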

Step 2. Introduce a penalty coefficient to improve the original dynamic time warping (DTW) algorithm, incorporate it into the diffusion convolutional recurrent neural network (DCRNN), and combine the result with Kalman filtering to build the noise prediction network model.

As shown in Figure 3, the noise prediction network model works as follows. First, the original DTW algorithm is applied to the data extracted from each monitoring station to compute the optimal path. The DTW algorithm is then improved by introducing a penalty coefficient, and the improved algorithm computes the sequence similarity distances. The distances between monitoring stations are assembled into a matrix, from which the adjacency-matrix graph structure is built. The adjacency matrix is then fed into the DCRNN for a preliminary noise prediction. Finally, the prediction is adjusted dynamically using traffic-flow features and Kalman filtering to obtain the final noise prediction. The specific process is as follows:

Step 2.1. Introduce a penalty coefficient to improve the original DTW algorithm and compute the sequence similarity distance.

The improved DTW algorithm finds the optimal path and the number of common subsequences, computes a penalty coefficient from them, and uses the penalty coefficient to compute the similarity distance between stations.

The main workflow is shown in Figure 4. The specific process is:

Step 2.1.1. Compute the distance matrix.

Suppose there are two noise sequences S and T, as in Equation (1):

$$S = (s_1, s_2, \ldots, s_o), \qquad T = (t_1, t_2, \ldots, t_m) \tag{1}$$

Sequence S has length o and sequence T has length m, with o, m ∈ Z⁺. The traditional Euclidean distance can be computed directly when the peaks and troughs of the two sequences largely coincide, but it cannot be used for periodic sequences with different phases. The invention therefore builds an m × o distance matrix d[s][t] from the noise sequences, in which the entry at (s_i, t_j) is the distance between the i-th sequence point s_i (i = 1, 2, …, o) and the j-th sequence point t_j (j = 1, 2, …, m), namely the squared Euclidean distance dis(s_i, t_j) = (s_i - t_j)². The rule for computing the matrix distances is given in Equation (2):

[Equation (2), image BDA0003883599360000082]

Step 2.1.2. Find the optimal path through the matrix.

In the distance matrix computed in step 2.1.1, the rows represent sequence S and the columns represent sequence T. Rows and columns follow the order of the sequences, but one time point of S may correspond to several time points of T, so the optimal path through the matrix is found by dynamic warping. The path must satisfy the following boundary condition: it starts at the upper-right corner d[o][m], takes the smallest of the three lower-left neighbours as the next node, and ends at the lower-left corner d[0][0]. It must also be continuous and monotonic: no point may be skipped or omitted during matching, a point can only be aligned or matched with adjacent points, and the order of the time series cannot change. The optimal path is the path with the smallest cumulative distance that satisfies these conditions.

Used on its own, the DTW distance handles misaligned periodic sequences poorly and the resulting distance is unsatisfactory, so the DTW algorithm needs to be improved. The idea is to first use DTW to generate the distance matrix and find the optimal path. Taking point (i, j) as an example, find the minimum of its three lower-left neighbours: if the minimum is the lower-left diagonal value, the next node is (i-1, j-1); if it is the left neighbour, the next node is (i-1, j); if it is the lower neighbour, the next node is (i, j-1). This continues until there is no next node. The resulting optimal path provides the basis for verifying the common subsequences of the noise sequences later.
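The matrix construction and backtracking described in steps 2.1.1 and 2.1.2 can be sketched as follows. The sketch assumes the standard DTW recurrence (each cell adds its squared difference to the cheapest of its three predecessors), since the formula image for Equation (2) is not available here. The function name `dtw_optimal_path` and the 1-based path indices are illustrative choices, and both sequences are assumed non-empty.

```python
def dtw_optimal_path(s, t):
    """Return (cumulative distance, optimal path) for sequences s and t.

    The path is a list of 1-based (i, j) pairs from (1, 1) to (len(s), len(t)).
    """
    o, m = len(s), len(t)
    inf = float("inf")
    # d[i][j]: smallest cumulative cost of aligning s[:i] with t[:j]
    d = [[inf] * (m + 1) for _ in range(o + 1)]
    d[0][0] = 0.0
    for i in range(1, o + 1):
        for j in range(1, m + 1):
            cost = (s[i - 1] - t[j - 1]) ** 2
            d[i][j] = cost + min(d[i - 1][j - 1], d[i - 1][j], d[i][j - 1])

    # Backtrack from the end corner, always moving to the cheapest neighbour.
    i, j = o, m
    path = [(i, j)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (d[i - 1][j - 1], i - 1, j - 1),  # diagonal neighbour
            (d[i - 1][j], i - 1, j),          # previous point of s
            (d[i][j - 1], i, j - 1),          # previous point of t
        )
        path.append((i, j))
    path.reverse()
    return d[o][m], path

total, path = dtw_optimal_path([1, 2, 3], [1, 2, 2, 3])
```

In this example the second point of the first sequence is matched to two consecutive points of the second sequence, which is the one-to-many alignment the text describes.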

Step 2.1.3. Compute the number of common subsequences and the length of the optimal path sequence, then derive the weights and the penalty coefficient.

The length of the optimal path obtained in step 2.1.2 is the required optimal path sequence length.

The steps for counting the common noise subsequences are shown in Figure 5. First, create an empty list (record) and loop over sequences s1 and s2. When a shared run appears, use record to store its length (ls1 or ls2) and its position g in the sequence, then extract that common subsequence from s1 and store it in the array s_sum. After the traversal, the length of s_sum is the number of common subsequences. The threshold is set to 1, so only runs longer than 1 count as common subsequences. The total number of such runs is the required number of common noise subsequences.

The calculation is given in Algorithm 1:

[Algorithm 1, image BDA0003883599360000091]
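Because Algorithm 1 survives only as an image, the following is one plausible reading of the counting procedure rather than the patent's algorithm. It assumes that a "common subsequence" is a maximal run of positions at which the two sequences hold equal values, and it keeps runs longer than the threshold of 1. The function name `common_runs` and the `min_len` parameter are hypothetical.

```python
def common_runs(s1, s2, min_len=2):
    """Return (start, length) for each maximal run where s1 and s2 agree position by position."""
    runs = []
    start = None
    n = min(len(s1), len(s2))
    for g in range(n + 1):
        same = g < n and s1[g] == s2[g]
        if same and start is None:
            start = g                      # a shared run begins at position g
        elif not same and start is not None:
            if g - start >= min_len:       # keep only runs longer than 1
                runs.append((start, g - start))
            start = None
    return runs

runs = common_runs([1, 2, 3, 9, 5, 6, 7], [1, 2, 3, 4, 5, 0, 7])
count = len(runs)   # number of common subsequences under this reading
```

For real-valued sensor data, exact equality is rarely met, so a practical version would probably compare values within a tolerance; that tolerance is not specified in the text.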

All of the common noise subsequences are used to compute the penalty coefficient.

First compute the weight w, as in Equation (3):

[Equation (3), image BDA0003883599360000101]

where subseq is the length of a common noise subsequence and seq is the length of the optimal path sequence through the matrix.

Then compute the penalty coefficient α as in Equation (4), where x is the number of common subsequences and w_i is the weight of the i-th sequence point. The formula is designed so that the longer the optimal path sequence, and the more numerous and longer the common noise subsequences, the smaller the penalty coefficient:

[Equation (4), image BDA0003883599360000102]

Step 2.1.4. Write the optimal path as (r_1, r_2, …, r_seq), where r_i (i = 1, 2, …, seq) is the value of the i-th point in the optimal path sequence. Multiplying the penalty coefficient by the sum of the values along the optimal path computed by the original DTW gives the improved sequence similarity distance dis_DTW:

$$dis_{DTW} = \alpha \sum_{i=1}^{seq} r_i \tag{5}$$

Step 2.2. Assemble the noise-sequence similarity distances between the monitoring stations into a matrix, convert it into an adjacency matrix, and build the graph topology. The specific process is:

Step 2.2.1. From the noise-sequence similarity distances of the monitoring stations, compute the inter-station similarity distance matrix M (values in km). The matrix M is:

[Equation (6), image BDA0003883599360000104]

where a is the number of monitoring stations.

For example, 11 monitoring stations are installed in the chemical industry park of this embodiment, so a = 11 and M is an 11 × 11 matrix.

Step 2.2.2. Compute the standard deviation λ of the similarity distance matrix M, where M_c is the value of the c-th element of the matrix, N is the number of elements in M, and μ_M is the mean of the matrix. The calculation is:

[Equation (7), image BDA0003883599360000105]

Step 2.2.3. Analyze the similarity between monitoring stations using the similarity distance matrix M, and build the adjacency matrix M_d from the standard deviation λ of all finite (non-infinite) entries of M. Each element of M_d is computed as follows:

[Equation (8), image BDA0003883599360000112]

where c is the index of the matrix element. The larger the similarity distance, the smaller the corresponding element of M_d. The threshold is set to 0.1: if an element of M_d is below the threshold, the similarity between the two stations is considered too low for them to influence each other. They are then treated as non-adjacent stations, and the weight in the adjacency matrix is set to zero, so no adjacency relationship is formed.

The steps above meet the similarity-calculation requirements of periodic noise time series. Reconstructing the spatial relationship in this way lays the foundation for a more complete graph representation for the subsequent graph convolutional network.
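Steps 2.2.2 and 2.2.3 can be sketched as below. The formula images for Equations (7) and (8) are not available, so the sketch makes two labeled assumptions: λ is the population standard deviation of the finite entries, and each weight is a thresholded Gaussian kernel exp(-(M_c/λ)²), the form commonly used to build DCRNN adjacency matrices. The function name `build_adjacency` is hypothetical.

```python
import math

def build_adjacency(dist, threshold=0.1):
    """Turn a similarity-distance matrix into a thresholded adjacency matrix."""
    finite = [v for row in dist for v in row if math.isfinite(v)]
    mu = sum(finite) / len(finite)
    # Assumption: population standard deviation over the finite entries.
    lam = math.sqrt(sum((v - mu) ** 2 for v in finite) / len(finite))
    adj = []
    for row in dist:
        new_row = []
        for v in row:
            # Assumption: Gaussian kernel; infinite distances give weight 0.
            w = math.exp(-((v / lam) ** 2)) if math.isfinite(v) else 0.0
            new_row.append(w if w >= threshold else 0.0)
        adj.append(new_row)
    return adj

M = [[0.0, 1.0, 10.0],
     [1.0, 0.0, 2.0],
     [10.0, 2.0, 0.0]]
M_d = build_adjacency(M)
```

With these values, the weight between the first and third stations falls below 0.1 and is set to zero, so the two are treated as non-adjacent. The sketch does not handle the degenerate case λ = 0, in which all finite distances are equal.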

Step 2.3. Introduce the graph convolutional network (GCN) to build the adjacency-matrix graph structure, and feed this structure into the DCRNN for noise prediction to obtain the preliminary prediction.

First, consider convolution from the signal-processing perspective. The word "convolve" originally conveys flipping (the "roll" in the Chinese term for convolution), and the "product" part is an operation in its own right, namely multiplication. Noise can be classified as discrete or continuous, and convolution usually refers to continuous signals. Sound is monitored through sensors, and the noise they collect consists of discrete values sampled every 30 s, so only the discrete convolution sum is discussed here.

Define the unit impulse signal as

$$\delta[n] = \begin{cases} 1, & n = 0 \\ 0, & n \neq 0 \end{cases}$$

where n is the shift and l is the discrete time. The discrete noise x[n] can then be expressed by Equation (9):

$$x[n] = \sum_{l=-\infty}^{\infty} x[l]\,\delta[n-l] \tag{9}$$

where x[l] is the discrete noise at time l.

The discrete noise convolution can be understood as first flipping the noise sequence, then shifting it, and finally multiplying and summing. In other words, the system's response to the input discrete noise sequence x[n] is characterized by its unit impulse response h[n]. The convolution sum y_dis[n] of the discrete noise is given by Equation (10):

$$y_{dis}[n] = \sum_{l=-\infty}^{\infty} x[l]\,h[n-l] \tag{10}$$

In the convolution, the unit impulse response is first flipped to obtain h[-l] and then shifted by n to obtain h[n-l].

In a convolutional neural network (CNN), by contrast, the convolution omits the flipping step: shifting by n directly gives h[n+l]. The CNN convolution sum y_cnn[n] is expressed as Equation (11):

$$y_{cnn}[n] = \sum_{l=-\infty}^{\infty} x[l]\,h[n+l] \tag{11}$$

Strictly speaking, this is not a convolution sum but the cross-correlation of signal processing, so the convolution kernel also acts as a filter. The basic purpose of convolution in a CNN is weighted summation and feature extraction, for which flipping is unnecessary.
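The difference between the two sums can be checked numerically on short finite sequences. In this sketch the function names are illustrative, and the cross-correlation is written in the sliding-window form y[n] = Σ_k x[n+k]·h[k] usual in CNN libraries, which differs from the indexing of the CNN sum above only by a change of variable.

```python
def convolve(x, h):
    """Full discrete convolution: y[n] = sum_l x[l] * h[n - l]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for l in range(len(x)):
            if 0 <= n - l < len(h):
                y[n] += x[l] * h[n - l]
    return y

def cross_correlate(x, h):
    """Sliding-window sum without flipping: y[n] = sum_k x[n + k] * h[k]."""
    return [
        sum(x[n + k] * h[k] for k in range(len(h)))
        for n in range(len(x) - len(h) + 1)
    ]

x = [1, 2, 3, 4]
h = [1, 0, -1]
full = convolve(x, h)            # kernel is flipped, then shifted
valid = cross_correlate(x, h)    # kernel is only shifted
# Cross-correlation equals convolution with the reversed kernel
# on the positions where the kernel fully overlaps the signal.
same = valid == convolve(x, h[::-1])[len(h) - 1 : len(x)]
```

Because a CNN learns its kernel values, a flipped kernel is just another kernel, which is why the flip can be dropped without loss.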

The noise sequences of multiple monitoring stations can form a topological graph. Such a graph is irregular and has no translation invariance, so a graph convolutional network (GCN) is needed to handle the graph relationships. In this embodiment, the eleven monitoring stations serve as eleven nodes, each with its own features, and all nodes form an 11 × 11 adjacency matrix A. If the noise sequence has length Y, the node features form an 11 × Y feature matrix X. X and A are the inputs to the graph convolutional network. Propagation between GCN layers follows Equation (12), where L denotes the L-th network layer:

[Equation (12), image BDA0003883599360000122]

where σ is a nonlinear activation function and W^L is a trainable weight matrix. The degree matrix is formed from the adjacency matrix, with I indexing the rows of A and J indexing its columns. A self-connection term preserves each station's own feature information: I_11 is the 11-dimensional identity matrix and λ is the weight of a station's own noise features, with λ = 1 meaning that they are as important as the noise features of neighbouring stations. H is the feature matrix extracted by the current layer; for the input layer, H = X.
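A single propagation step can be sketched as follows. Since the image for Equation (12) is not available, the sketch assumes the widely used symmetric normalization, H' = σ(D^(-1/2)·(A + λI)·D^(-1/2)·H·W), with ReLU as the activation. The helper names `matmul` and `gcn_layer` are illustrative.

```python
import math

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def gcn_layer(A, H, W, lam=1.0):
    """One GCN layer: ReLU(D^-1/2 (A + lam*I) D^-1/2 H W) (assumed form)."""
    n = len(A)
    # Add weighted self-connections so each node keeps its own features.
    A_self = [[A[i][j] + (lam if i == j else 0.0) for j in range(n)]
              for i in range(n)]
    deg = [sum(row) for row in A_self]          # diagonal of the degree matrix
    A_norm = [[A_self[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
              for i in range(n)]
    Z = matmul(matmul(A_norm, H), W)
    return [[max(0.0, z) for z in row] for row in Z]

A = [[0, 1], [1, 0]]      # two stations connected to each other
H = [[1.0], [3.0]]        # one feature per station
W = [[1.0]]               # 1 x 1 weight matrix
H_next = gcn_layer(A, H, W)
```

With λ = 1 each station averages its own feature with its neighbour's, so both rows of `H_next` become 2.0 in this example.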

Without training, the features a CNN extracts are very limited, whereas a GCN extracts useful features even with randomly initialized parameters. Using a GCN to extract spatial relationships is therefore the preferred choice among current methods.

To formalize the spatio-temporal noise sequence prediction problem, the invention builds on the GCN and uses a diffusion convolutional recurrent neural network (DCRNN) to model the spatio-temporal relationships of the noise sequences. Associating the noise monitoring stations with a diffusion process models the spatial relationships between stations and also captures the dynamic, stochastic nature of noise. The structure of the DCRNN model is shown in Figure 6. DCRNN is a recurrent neural network based on diffusion graph convolution. Its original input is a spatio-temporal noise data structure that combines an adjacency matrix built from the physical distances between monitoring stations with the time-series noise data. The invention instead uses the improved DTW algorithm to compute similarity distances between sequences and reconstructs the spatio-temporal relationships between stations from those distances. The advantage of the model is that diffusion convolution accounts for the random-walk relationships between noise signals. An encoder is applied to the historical noise sequence, and the true or predicted value from the previous time step, delayed by one step, serves as the current decoder input for predicting the noise at the current time. ReLU is the activation function: it passes a positive input through unchanged and outputs zero otherwise. The stationary distribution of the diffusion process can be expressed as a weighted combination of infinite random walks on the graph. It is given by Equation (13) and can be computed in closed form:

[Equation (13), image BDA0003883599360000126]

W is the node similarity matrix, and the inverse of the out-degree matrix also appears in the formula. β ∈ [0, 1] is the restart probability, k is the degree (number of steps) of diffusion, and ε is the likelihood of diffusion from a node. The diffusion process in the DCRNN model is bidirectional, so environmental influences on both sides of a noise monitoring station can be taken into account.

In terms of spatial relationships, therefore, the bidirectional diffusion convolution of the graph-signal feature matrix with a filter f_θ (θ being the filter parameters) can be defined as Equation (14):

[Equation (14), image BDA0003883599360000132]

where G denotes the graph, p is the p-th feature dimension, P is the total number of feature dimensions, K is the finite K-step truncation of the diffusion process, ★G denotes diffusion convolution on graph G, X_{:,p} is the p-th feature over all nodes, θ_{k,1} are the kernel parameters for the out-degree direction, and θ_{k,2} are the kernel parameters for the in-degree direction. The two transition matrices in the formula correspond to diffusion and reverse diffusion, respectively. With the convolution operation defined, a diffusion convolution layer that maps P dimensions to Q dimensions is built as in Equation (15):

[Equation (15), image BDA0003883599360000135]

where q is the q-th output feature, Q is the total number of mapped features, and H_{:,q} is the result of the diffusion convolution for the q-th feature over all nodes. The feature matrix X is the input, the features H extracted by the current layer are the output, the filters are parameterized by the tensor Θ, and σ is the activation function. This completes the construction of the spatial relationship based on bidirectional diffusion convolution.
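The bidirectional, K-step truncated diffusion convolution can be sketched for a single feature as follows. The formula images are not available, so the sketch assumes the standard DCRNN form: the forward transition matrix is the row-normalized W, the reverse one is the row-normalized transpose of W, and step k contributes θ_{k,1}·(forward)^k·x + θ_{k,2}·(reverse)^k·x. All function names are illustrative.

```python
def row_normalise(W):
    """Return D^-1 W, where D holds the row sums (all-zero rows stay zero)."""
    out = []
    for row in W:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

def mat_vec(P, x):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]

def diffusion_conv(W, x, theta_out, theta_in):
    """K-step bidirectional diffusion convolution of one feature vector x."""
    forward = row_normalise(W)                                 # diffusion
    backward = row_normalise([list(col) for col in zip(*W)])   # reverse diffusion
    x_f, x_b = list(x), list(x)
    y = [0.0] * len(x)
    for t_out, t_in in zip(theta_out, theta_in):               # k = 0 .. K-1
        y = [yi + t_out * f + t_in * b for yi, f, b in zip(y, x_f, x_b)]
        x_f, x_b = mat_vec(forward, x_f), mat_vec(backward, x_b)
    return y

W = [[0, 1], [1, 0]]
y = diffusion_conv(W, [1.0, 0.0], theta_out=[1.0, 0.5], theta_in=[0.0, 1.0])
```

A full layer would repeat this for every input feature p and output feature q and then apply the activation function.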

In terms of temporal relationships, a GRU is combined with diffusion convolution to build the diffusion convolutional gated recurrent unit (DCGRU). Its essential principle is to replace the matrix multiplications in the GRU with diffusion convolution. Building on the LSTM principle and combining the GRU with diffusion convolution, the DCGRU can be expressed as Equation (16):

[Equation (16), image BDA0003883599360000139]

★G denotes diffusion convolution on graph G. r^(t) is the reset gate at time t, Θ_r★G is the parameter tensor of the reset gate, X^(t) is the input at time t, H^(t-1) is the output at time t-1, and b_r is the bias vector of the reset gate. u^(t) is the update gate at time t, Θ_u★G is its parameter tensor, and b_u is its bias vector. C^(t) is the hidden state of the next cell at time t, Θ_C★G is its parameter tensor, and b_C is its bias vector. H^(t) is the output at time t.
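For reference, the symbol definitions above match the DCGRU as it is usually written in the DCRNN literature. The block below gives that standard form as an assumption about what Equation (16) contains; it is not a transcription of the formula image. Here ⊙ is element-wise multiplication and [·, ·] is concatenation along the feature dimension.

```latex
\begin{aligned}
r^{(t)} &= \sigma\big(\Theta_r \star_G [X^{(t)}, H^{(t-1)}] + b_r\big) \\
u^{(t)} &= \sigma\big(\Theta_u \star_G [X^{(t)}, H^{(t-1)}] + b_u\big) \\
C^{(t)} &= \tanh\big(\Theta_C \star_G [X^{(t)}, (r^{(t)} \odot H^{(t-1)})] + b_C\big) \\
H^{(t)} &= u^{(t)} \odot H^{(t-1)} + (1 - u^{(t)}) \odot C^{(t)}
\end{aligned}
```

Under this form, the update gate u^(t) decides how much of the previous output H^(t-1) is carried forward and how much is replaced by the new candidate C^(t).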

Spatio-temporal modeling makes full use of bidirectional propagation to capture the spatio-temporal correlation of the noise sequences. However, existing work relies on distance-based spatial correlation, and spatial features built from physical distance cannot fully represent the correlation between the noise sequences of different stations, so a better method is needed. Improving the prediction accuracy of the spatial relationship requires examining the spatial relationship itself more closely.

Step 2.4. Extract the traffic-flow features at the barrier gate for the current time in real time, and use Kalman filtering to adjust the DCRNN noise prediction dynamically.

Kalman filtering uses the estimate of the previous state to predict the current state, then corrects that prediction with the observation of the current state to obtain a new estimate closer to the true value. In this method, the normalized traffic-flow features and the noise sequence go through a prior estimate and a prediction of the covariance matrix. The Kalman gain for noise and drift is then computed and used to update the covariance matrix, and the predicted noise value for the current time is corrected accordingly.
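The correction step can be illustrated with a scalar Kalman update. This is a deliberately simplified sketch: the text does not specify the state vector or how traffic-flow features map to an observation, so here `x_pred` stands for the DCRNN prediction, `z` for an observation derived from current data, and `p_pred` and `r` for assumed prediction and observation variances. All names are hypothetical.

```python
def kalman_correct(x_pred, p_pred, z, r):
    """Correct a predicted value with one observation (scalar Kalman update)."""
    gain = p_pred / (p_pred + r)            # Kalman gain
    x_new = x_pred + gain * (z - x_pred)    # corrected estimate
    p_new = (1.0 - gain) * p_pred           # updated error variance
    return x_new, p_new, gain

# Prediction of 60 dB with variance 4, observation of 64 dB with variance 4.
x_new, p_new, gain = kalman_correct(60.0, 4.0, 64.0, 4.0)
```

When prediction and observation are equally uncertain, the gain is 0.5 and the corrected value lies halfway between them; a noisier observation (larger `r`) pulls the result back toward the prediction.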

Step 3. Train the noise prediction network model and output the trained model. The specific process is:

Extract the historical data of each monitoring station and use the improved DTW algorithm to reconstruct the spatio-temporal relationships of the data, then train the noise prediction network model DTW-DCRNN. 70% of the data set is used as the training set, 10% as the validation set, and the final 20% as the test set. No training results are recorded during the first 60 epochs. From epoch 60 to epoch 100, the training results are recorded and the model parameters are output every 10 epochs; after epoch 100, this is done every 5 epochs. Finally, training is stopped early based on the measured validation loss, so that the model is captured just before it begins to overfit.
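The split and the recording schedule can be written down as two small helpers. The sketch assumes a chronological split (training data first, test data last) and 1-based epoch numbers, with epoch 60 itself not recorded; the text leaves those boundary cases open, and both function names are hypothetical.

```python
def split_indices(n):
    """Chronological 70/10/20 split of n samples as (start, end) index pairs."""
    n_train = n * 7 // 10
    n_val = n // 10
    return (0, n_train), (n_train, n_train + n_val), (n_train + n_val, n)

def should_record(epoch):
    """Return True when results are recorded and parameters saved at this epoch."""
    if epoch <= 60:
        return False              # nothing recorded during the first 60 epochs
    if epoch <= 100:
        return epoch % 10 == 0    # every 10 epochs up to epoch 100
    return epoch % 5 == 0         # every 5 epochs afterwards

train, val, test = split_indices(1000)
recorded = [e for e in range(1, 121) if should_record(e)]
```

Early stopping would then monitor the validation loss at the recorded epochs and keep the parameters from the best one.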

Step 4. Collect monitoring-station data in real time and predict noise in real time with the trained model.

The invention first builds, for the stations at their respective locations, a similarity distance matrix based on the improved DTW algorithm, abandoning the traditional approach of building the matrix from physical distance. The similarity distance matrix is converted into an adjacency matrix, and the spatio-temporal data set based on similarity distance is fed into the DCRNN model, giving the DTW-DCRNN model based on similarity-distance spatial relationships. Finally, the real-time traffic-flow data are combined with a Kalman filter to update the DCRNN prediction output dynamically.

Building the adjacency matrix from the similarity distance matrix to represent spatial relationships compensates for the limitations of the original physical-distance method. The predictions adjusted by the Kalman filter are also closer to the true values and have a compensating character, and the 3σ noise processing effectively prevents large deviations in the predictions during model training.

The following experiments were carried out to demonstrate the feasibility and advantages of the invention.

Experiment 1: Comparison of the DTW algorithm before and after improvement

Figure 7(a) shows an example of the adjacency matrix based on the original DTW algorithm. Compared with the original method, the improved DTW algorithm, shown in Figure 7(b), adds a connection between monitoring station 1 and monitoring station 3.

The experiments use mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE) as evaluation metrics. The original and the improved DTW algorithms were each combined with the DCRNN model for pre-training, and the prediction results were averaged over twelve time steps; the metrics are listed in Table 1. On the pre-trained models, MAE decreases by 0.06 dB (a 2.8% improvement), MAPE decreases by 0.11 percentage points (a 1.9% improvement), and RMSE increases by 0.02 dB (a 0.5% degradation). The improved method thus has lower MAE and MAPE than the original but a slightly higher RMSE. In other words, it predicts outliers and anomalous values less accurately than the original method, but overall the improved DTW algorithm predicts more accurately than the original DTW algorithm.

Table 1. Comparison of the DTW algorithm before and after improvement

[Table 1, image BDA0003883599360000151]
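For reference, the three evaluation metrics follow their standard definitions, sketched below with illustrative function names. The numbers in the example are made up to show the arithmetic and are not taken from the experiments.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error, in the units of the data (dB here)."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error; penalizes large deviations more than MAE."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

y_true = [50.0, 60.0]   # illustrative values only
y_pred = [55.0, 57.0]
scores = (mae(y_true, y_pred), mape(y_true, y_pred), rmse(y_true, y_pred))
```

Because RMSE squares each error, a few large misses on outliers can raise RMSE even when MAE and MAPE fall, which is the pattern reported for the improved DTW algorithm.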

Figure 8 plots the metrics by time step to compare the improved and original DTW algorithms further. The horizontal axis is the time step. The right-hand vertical axis is MAPE: from the fourth time step onward, the improved DTW algorithm has a smaller MAPE, so the new method performs better on this metric. The left-hand vertical axis gives RMSE and MAE in decibels (dB): from the sixth time step onward the improved DTW algorithm has a smaller RMSE, and from the fourth time step onward it has a smaller MAE, so the new method also performs better on these metrics at the later time steps.

In summary, the improved DTW algorithm improves overall prediction accuracy without increasing computation time and uncovers deeper sequence correlations, so it can be applied to graph convolution to improve prediction accuracy.

Experiment 2: Comparison of the proposed prediction model with other models

To further verify the prediction performance of the method, RMSE, MAE, and MAPE were used to evaluate the following methods and models: the historical average (HA) model, the vector autoregression (VAR) model, the spatio-temporal graph convolutional network (STGCN), the diffusion convolutional recurrent neural network (DCRNN), DTW-STGCN, and DTW-DCRNN(N), where DTW-DCRNN(N) is the variant without Kalman filtering or dynamic adjustment using traffic-flow data. All experiments predict 12 time steps, and the average over the 12 steps is used to measure long-term prediction performance. According to Table 2, the method of the invention outperforms the other prediction methods.

HA: Because the noise is strongly periodic and has many abrupt points, this model's predictions are inaccurate, although it trains quickly.

STGCN: Because the noise sequence is partly random, the accuracy of its noise predictions is low.

DCRNN: This model outperforms STGCN. Time-series noise data depend strongly on time, which indirectly shows that a GRU has an advantage over a CNN for time-series modeling.

VAR: Because the noise sequence is strongly periodic, the vector autoregression model predicts better than STGCN, showing that neural networks do not perform better in every case.

DTW-STGCN: Applying the DTW algorithm to the STGCN model gives DTW-STGCN, which predicts better than the plain STGCN model. This indirectly confirms that it is worthwhile to consider spatial relationships from the perspective of time-series similarity.

DTW-DCRNN(N): Compared with DTW-DCRNN(N), the method of the invention adds Kalman filtering, which corrects and adjusts the data, and it performs best. Relative to the original DCRNN model, the method improves RMSE and MAE by 11.1% and 5%, respectively, and MAPE by 5.26%.

表2基线模型预测结果Table 2 Baseline model prediction results

Figure BDA0003883599360000161

In addition, the experiments show that the method of the present invention performs well in long-term prediction; the specific data are given in Table 3. In the first time step the prediction accuracy of the method of the present invention is lower than that of DCRNN, but over the following 11 time steps its performance is more stable and its accuracy is higher.

Table 3 Analysis of long-term prediction results

Figure BDA0003883599360000162

To make the prediction trend easier to see, the table above is visualized as figures. Figure 9 shows the mean absolute error (MAE) and root mean square error (RMSE) curves. The right-hand ordinate is the RMSE, in decibels (dB); the figure shows that the predictions for the 2nd, 3rd and 4th time steps are more accurate. The left-hand ordinate is the MAE, in decibels (dB); excluding the first time step, the errors of both methods grow as the time step increases, but the error of the DTW-DCRNN method of the present invention is smaller and its accuracy higher. Figure 10 shows the mean absolute percentage error (MAPE) curve; its trend is similar to that of the MAE, and the MAPE of the DTW-DCRNN model of the present invention is smaller. On all three metrics, the method of the present invention performs better from the second time step onward (inclusive).

In summary, the improved method of the present invention improves the overall prediction accuracy and, compared with the original method, has an advantage in long-term prediction of noise time series. It also offers a new idea and approach for graph neural networks in spatio-temporal prediction and related research.

Of course, the above description does not limit the present invention, and the present invention is not limited to the above examples. Changes, modifications, additions or substitutions made by those skilled in the art within the essential scope of the present invention also fall within the protection scope of the present invention.

Claims (4)

1. A DTW-DCRNN-based noise prediction method for a chemical industry park, characterized in that an improved dynamic time warping (DTW) algorithm is adopted and the spatial relationship is reconstructed through the graph structure of the monitoring stations, so as to achieve long-term spatio-temporal prediction of noise in the chemical industry park, the method specifically comprising the following steps:
step 1, collecting and processing monitoring station information and vehicle information data in the chemical industry park;
step 2, introducing a penalty coefficient to improve the original dynamic time warping algorithm, introducing the improved algorithm into a diffusion convolutional recurrent neural network, and constructing a noise prediction network model in combination with Kalman filtering;
step 3, training the constructed noise prediction network model and outputting the trained model;
step 4, acquiring monitoring station information in real time and predicting noise in real time based on the trained model.
2. The DTW-DCRNN-based chemical industry park noise prediction method according to claim 1, wherein in step 1, a road gate and a plurality of monitoring stations are arranged in the chemical industry park to collect data in real time, and the position data of each monitoring station is recorded synchronously; sensors are arranged at the road gate and at each monitoring station; the road gate collects information data on vehicles entering and leaving, including the traffic flow characteristics of license plate number, entry and exit time, and vehicle type; the monitoring stations collect noise and natural environment data in the chemical industry park in real time;
the station position data, the noise and natural environment data, and the entering and leaving vehicle information data are transmitted to the gateway device and then exchanged with the database server; when noise prediction is carried out, the data are extracted in real time and processed using the 3σ criterion.
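As an illustration of the 3σ criterion mentioned in claim 2, the following minimal sketch flags readings that lie more than three standard deviations from the mean. The claim does not say what happens to flagged readings; dropping them here is an assumption for this example, and the function name is hypothetical.

```python
import numpy as np

def three_sigma_filter(values):
    """Return (kept_values, keep_mask) under the 3-sigma criterion.

    A reading is kept when |x - mean| <= 3 * std. Dropping the flagged
    readings is an assumption; interpolation would be another option.
    """
    x = np.asarray(values, dtype=float)
    mu = x.mean()
    sigma = x.std()
    keep = np.abs(x - mu) <= 3.0 * sigma
    return x[keep], keep

if __name__ == "__main__":
    readings = [60.0] * 20 + [120.0]   # one implausible spike, in dB
    kept, keep = three_sigma_filter(readings)
    print(len(kept), keep[-1])
```

Because a single extreme value inflates the standard deviation itself, the criterion only detects an isolated spike when the window holds enough ordinary readings, which is why the example uses 21 points.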
3. The DTW-DCRNN-based chemical industry park noise prediction method according to claim 1, wherein the specific process of step 2 is as follows:
step 2.1, introducing a penalty coefficient to improve the original dynamic time warping algorithm, and calculating the sequence similarity distance;
the principle of the improved dynamic time warping algorithm is as follows: the optimal path length and the number of common subsequences are obtained to calculate a penalty coefficient, and the similarity distance between stations is calculated in combination with the penalty coefficient; the specific process is as follows:
step 2.1.1, calculating a distance matrix;
two noise sequences S and T are assumed, as in equation (1):
Figure FDA0003883599350000011
the length of sequence S is o and the length of sequence T is m, with $o, m \in \mathbb{Z}^{+}$; an $m \times o$ distance matrix d[s][t] is constructed from the noise sequences, in which the element at position $(s_i, t_j)$ is the distance between the i-th sequence point $s_i$ ($i = 1, 2, \ldots, o$) and the j-th sequence point $t_j$ ($j = 1, 2, \ldots, m$), namely the Euclidean distance $dis(s_i, t_j) = (s_i - t_j)^2$; the matrix distance calculation criterion is shown in equation (2):
Figure FDA0003883599350000021
step 2.1.2, solving a matrix optimal path;
the optimal path is the path with the minimum accumulated distance that satisfies the following boundary conditions: starting from point d[o][m] at the upper right corner of the matrix, the minimum of the three points to the lower left is selected as the next node, until point d[0][0] at the lower left corner is reached; continuity and monotonicity are guaranteed, that is, no point may be skipped or omitted in matching, a point may only be aligned or matched with adjacent points, and the time order may not change;
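Steps 2.1.1 and 2.1.2 can be sketched as follows. Because equation (2) is not legible in this text, the sketch assumes the standard DTW recurrence (each cell adds its local distance to the smallest of its three predecessors) and uses the squared difference stated in step 2.1.1 as the local distance. The function name and the 0-based, o×m matrix layout are choices made for this example.

```python
import numpy as np

def dtw_path(s, t):
    """Classic DTW: return (accumulated distance, optimal warping path).

    Assumes the standard recurrence
    acc[i, j] = cost[i, j] + min(acc[i-1, j], acc[i, j-1], acc[i-1, j-1]).
    """
    s = np.asarray(s, dtype=float)
    t = np.asarray(t, dtype=float)
    o, m = len(s), len(t)
    cost = (s[:, None] - t[None, :]) ** 2      # dis(s_i, t_j) = (s_i - t_j)^2
    acc = np.full((o, m), np.inf)
    acc[0, 0] = cost[0, 0]
    for i in range(o):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = min(
                acc[i - 1, j] if i > 0 else np.inf,
                acc[i, j - 1] if j > 0 else np.inf,
                acc[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
            )
            acc[i, j] = cost[i, j] + best
    # Backtrack from the far corner to the origin, always stepping to the
    # cheapest of the three neighbouring predecessors.
    i, j = o - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1, j], i - 1, j))
        if j > 0:
            candidates.append((acc[i, j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return float(acc[-1, -1]), path

if __name__ == "__main__":
    dist, path = dtw_path([1, 2, 3], [1, 2, 2, 3])
    print(dist, path)
```

The length of the returned path corresponds to the optimal path sequence length that step 2.1.3 uses.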
step 2.1.3, calculating the number of common subsequences and the optimal path sequence length, then obtaining the weight and from it the penalty coefficient;
wherein the optimal path sequence length is the length of the optimal path of the matrix finally obtained in step 2.1.2;
the number of common subsequences is calculated as follows: first, an empty list record is constructed and the sequences s1 and s2 are traversed in a loop; when a common sequence appears, the record list records the length ls1 or ls2 of the common sequence and its position g, and the common subsequence is extracted from sequence s1 and stored in the s_sum array; after the traversal, the length of s_sum is the number of common subsequences; when counting the common subsequences, the threshold is set to 1, so only subsequences with a length greater than 1 are counted as common subsequences, and the final total gives the required number of common noise subsequences;
the weight w is calculated as in equation (3):
Figure FDA0003883599350000022
wherein subseq is the length of the common noise subsequence, and seq is the sequence length of the optimal path of the matrix;
the penalty coefficient $\alpha$ is calculated as shown in equation (4), wherein x represents the number of common subsequences and $w_i$ represents the weight of the i-th sequence point,
Figure FDA0003883599350000023
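The common-subsequence counting described in step 2.1.3 leaves some details open, so the following is only one possible reading: scan s1 from left to right, and at each position take the longest contiguous run that also occurs somewhere in s2, keeping it when it is longer than the threshold of 1. The names `record` and `s_sum` follow the claim; the greedy scan, exact-equality matching and the function name are assumptions. Real noise readings would likely need rounding or quantizing before exact matching is meaningful.

```python
def common_subsequences(s1, s2, threshold=1):
    """Greedy search for contiguous runs of s1 that also occur in s2.

    Returns (s_sum, record): the extracted runs and their
    (position g in s1, length) entries. Runs must be longer than
    `threshold` to count.
    """
    record = []
    s_sum = []
    i = 0
    while i < len(s1):
        best = 0
        for j in range(len(s2)):
            k = 0
            while i + k < len(s1) and j + k < len(s2) and s1[i + k] == s2[j + k]:
                k += 1
            best = max(best, k)
        if best > threshold:
            record.append((i, best))
            s_sum.append(list(s1[i:i + best]))
            i += best            # continue after the matched run
        else:
            i += 1
    return s_sum, record

if __name__ == "__main__":
    runs, rec = common_subsequences([1, 2, 3, 9, 4, 5], [7, 1, 2, 3, 8, 4, 5])
    print(len(runs), runs, rec)
```

The count `len(s_sum)` plays the role of x in equation (4); the weight and penalty formulas themselves are not reproduced here because equations (3) and (4) are not legible in this text.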
step 2.1.4, the optimal path is expressed as $(r_1, r_2, \ldots, r_{seq})$, where $r_i$ ($i = 1, 2, \ldots, seq$) is the value of the i-th sequence point in the optimal path sequence; finally, the sum of the values on the optimal path calculated by the original DTW is multiplied by the penalty coefficient to obtain the improved sequence similarity distance $dis_{DTW}$:
Figure FDA0003883599350000024
step 2.2, the noise sequence similarity distances between all monitoring stations are assembled into a matrix, from which an adjacency matrix is calculated and the graph topology is constructed; the specific process is as follows:
step 2.2.1, according to the noise sequence similarity distances of the monitoring stations, the similarity distance matrix M between stations is calculated, the matrix M being as follows:
Figure FDA0003883599350000031
wherein a is the number of monitoring stations;
step 2.2.2, calculating the standard deviation $\lambda$ of the similarity distance matrix M, where $M_c$ is the value of the c-th element of the matrix, N is the number of elements in the matrix M, and $\mu_M$ is the matrix mean; the calculation process is as follows:
Figure FDA0003883599350000032
step 2.2.3, the similarity between monitoring stations is analyzed according to the similarity distance matrix M, and the adjacency matrix $M_d$ is constructed using the standard deviation $\lambda$ of all non-infinite values in the similarity distance matrix; each element of the adjacency matrix $M_d$,
Figure FDA0003883599350000033
is calculated as follows:
Figure FDA0003883599350000034
wherein c represents the index of the matrix element; the larger the similarity distance, the smaller the element
Figure FDA0003883599350000035
The threshold is set to 0.1; if
Figure FDA0003883599350000036
then the similarity between the two stations is too low and there is no mutual influence between them, so the two stations are regarded as non-adjacent and the corresponding weight in the adjacency matrix is set to zero;
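Step 2.2 can be sketched as follows. The element formula itself is not legible in this text, so the sketch assumes the thresholded Gaussian kernel commonly used with DCRNN, exp(-(d/λ)²), which matches the stated properties: it uses the standard deviation λ of the non-infinite distances, it decreases as the distance grows, and values below the 0.1 threshold become zero. The function name is hypothetical.

```python
import numpy as np

def build_adjacency(dist, threshold=0.1):
    """Turn a similarity-distance matrix into a weighted adjacency matrix.

    Assumed kernel: exp(-(d / lam)^2), where lam is the standard deviation
    of all finite distances; weights below `threshold` are set to zero.
    """
    dist = np.asarray(dist, dtype=float)
    lam = dist[np.isfinite(dist)].std()
    # lam == 0 (all distances equal) would need separate handling
    adj = np.exp(-np.square(dist / lam))   # infinite distance maps to weight 0
    adj[adj < threshold] = 0.0
    return adj

if __name__ == "__main__":
    inf = np.inf
    m = [[0.0, 1.0, 10.0],
         [1.0, 0.0, inf],
         [10.0, inf, 0.0]]
    print(build_adjacency(m))
```

In this example the stations at distance 1 keep a weight close to 1, while the pair at distance 10 and the pair with infinite distance fall below the threshold and are treated as non-adjacent.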
step 2.3, a graph convolutional network (GCN) is introduced to construct the graph structure from the adjacency matrix, which is input into the diffusion convolutional recurrent neural network for noise prediction to obtain a preliminary prediction result; the specific process is as follows:
the propagation rule between GCN layers is shown in equation (12), wherein L denotes the L-th network layer:
Figure FDA0003883599350000037
wherein σ is a nonlinear activation function; $W^{L}$ is the trainable weight matrix;
Figure FDA0003883599350000038
is the degree matrix, whose elements are given by:
Figure FDA0003883599350000039
I denotes the I-th row of the adjacency matrix A and J denotes the J-th column of the adjacency matrix A;
Figure FDA00038835993500000310
retains the feature information of the station itself; H is the feature extracted by the current layer, and for the input layer H = X;
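A single GCN layer of the kind described above can be sketched as follows. Equation (12) is not legible in this text, so the sketch assumes the widely used symmetric-normalized propagation rule σ(D̃^(-1/2) Ã D̃^(-1/2) H W) with Ã = A + I (the self-loop that keeps each station's own features) and ReLU as the activation. These choices and the function name are assumptions.

```python
import numpy as np

def gcn_layer(adj, h, weight):
    """One graph-convolution layer under the assumed propagation rule.

    adj:    (N, N) adjacency matrix A
    h:      (N, F_in) features of the current layer (H = X at the input)
    weight: (F_in, F_out) trainable weight matrix
    """
    a_tilde = adj + np.eye(adj.shape[0])        # add self-loops
    deg = a_tilde.sum(axis=1)                   # diagonal of the degree matrix
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    a_hat = a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_hat @ h @ weight, 0.0)  # ReLU activation

if __name__ == "__main__":
    a = np.array([[0.0, 1.0], [1.0, 0.0]])
    x = np.array([[1.0], [3.0]])
    w = np.array([[1.0]])
    print(gcn_layer(a, x, w))
```

With two connected stations and an identity weight, each output is the average of the two input features, which shows how the layer mixes a station's own signal with its neighbour's.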
on the basis of the GCN, the spatio-temporal relationship of the noise sequence is modeled using a diffusion convolutional recurrent neural network;
the stationary distribution of the diffusion process is represented as a weighted combination of infinite random walks on the graph; the diffusion process is given by equation (13) and can be calculated in closed form:
Figure FDA0003883599350000041
wherein W is the node similarity matrix,
Figure FDA0003883599350000042
is the inverse of the out-degree diagonal matrix, $\beta \in [0, 1]$ represents the restart probability, k is the number of diffusion steps, and ε represents the probability of diffusion from a node; the diffusion process in the DCRNN model is bidirectional;
thus, for the spatial relationship, the bidirectional diffusion convolution operation on the graph signal feature matrix
Figure FDA0003883599350000043
and the filter $f_\theta$ is defined as equation (14):
Figure FDA0003883599350000044
where θ is the filter parameter, G represents the graph, p represents the p-th feature dimension, P represents the total number of feature dimensions, K represents the finite K-step truncation of the diffusion process, $\star_G$ represents the diffusion convolution on graph G, $X_{:,p}$ represents the p-th feature of all nodes, $\theta_{k,1}$ represents the convolution kernel parameter for the out-degree calculation, $\theta_{k,2}$ represents the convolution kernel parameter for the in-degree calculation, and
Figure FDA0003883599350000045
and
Figure FDA0003883599350000046
represent the transition matrices of the diffusion and reverse diffusion processes, respectively; after the convolution operation is defined, a diffusion convolution layer with a mapping relation is constructed as shown in equation (15):
Figure FDA0003883599350000047
wherein q represents the q-th feature, Q represents the total number of mapped features, $H_{:,q}$ represents the result of the diffusion convolution for the q-th feature over all nodes, the feature matrix
Figure FDA0003883599350000048
(the features extracted by the current layer) is the input,
Figure FDA0003883599350000049
is the output,
Figure FDA00038835993500000410
represents the filters, Θ is the parameter tensor, and σ is the activation function;
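The bidirectional diffusion convolution layer described above can be sketched as follows. Equations (14) and (15) are not legible in this text, so the sketch assumes the usual DCRNN form: a sum over K diffusion steps of the forward transition matrix (inverse out-degree times W) and the backward transition matrix (inverse in-degree times the transpose of W), each with its own kernel, followed by ReLU. The parameter layout `(K, 2, P, Q)` and the function name are choices made for this example.

```python
import numpy as np

def diffusion_conv_layer(x, w, theta):
    """Bidirectional diffusion convolution layer (assumed DCRNN form).

    x:     (N, P) graph signal, P input features per node
    w:     (N, N) weighted adjacency matrix
    theta: (K, 2, P, Q) kernels; index 0 = forward, 1 = backward diffusion
    """
    out_deg = w.sum(axis=1)
    in_deg = w.sum(axis=0)
    # Transition matrices; isolated nodes keep an all-zero row.
    p_fwd = w / np.where(out_deg > 0, out_deg, 1.0)[:, None]
    p_bwd = w.T / np.where(in_deg > 0, in_deg, 1.0)[:, None]
    out = np.zeros((w.shape[0], theta.shape[3]))
    x_fwd = np.array(x, dtype=float)
    x_bwd = np.array(x, dtype=float)
    for k in range(theta.shape[0]):
        out += x_fwd @ theta[k, 0] + x_bwd @ theta[k, 1]
        x_fwd = p_fwd @ x_fwd    # one more forward diffusion step
        x_bwd = p_bwd @ x_bwd    # one more backward diffusion step
    return np.maximum(out, 0.0)  # ReLU activation

if __name__ == "__main__":
    w = np.array([[0.0, 1.0], [0.0, 0.0]])   # single directed edge 0 -> 1
    x = np.array([[1.0], [2.0]])
    theta = np.ones((2, 2, 1, 1))
    print(diffusion_conv_layer(x, w, theta))
```

Iterating the matrix-vector product instead of forming matrix powers keeps the cost linear in K, which matters once the number of stations grows.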
for the temporal relationship, a diffusion convolutional gated recurrent unit (DCGRU) is constructed by combining the GRU with diffusion convolution, expressed as equation (16):
Figure FDA00038835993500000411
wherein $\star_G$ represents the diffusion convolution on graph G; $r^{(t)}$ represents the reset gate at time t, $\Theta_r$ in $\Theta_r \star_G$ represents the parameter tensor of the reset gate, $X^{(t)}$ represents the input at time t, $H^{(t-1)}$ represents the output at time t-1, and $b_r$ represents the bias vector of the reset gate; $u^{(t)}$ represents the update gate at time t, $\Theta_u$ in $\Theta_u \star_G$ represents the parameter tensor of the update gate, and $b_u$ represents the bias vector of the update gate; $C^{(t)}$ represents the hidden state for the next cell at time t, $\Theta_C$ in $\Theta_C \star_G$ represents the parameter tensor of the hidden state, and $b_C$ represents the bias vector of the hidden state; $H^{(t)}$ represents the output at time t;
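One DCGRU step can be sketched as follows. Equation (16) is not legible in this text, so the sketch assumes the standard DCGRU form, in which the matrix multiplications of a GRU are replaced by diffusion convolutions over the concatenated input and hidden state. The helper names, the `params` dictionary keys and the tensor layouts are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dconv(x, supports, theta):
    """Linear diffusion convolution over a list of transition matrices.

    theta has shape (K, len(supports), in_dim, out_dim).
    """
    out = 0.0
    for s_idx, p in enumerate(supports):
        xk = x
        for k in range(theta.shape[0]):
            out = out + xk @ theta[k, s_idx]
            xk = p @ xk
    return out

def dcgru_cell(x_t, h_prev, supports, params):
    """One DCGRU step: returns the new hidden state H(t)."""
    xh = np.concatenate([x_t, h_prev], axis=1)
    r = sigmoid(dconv(xh, supports, params["theta_r"]) + params["b_r"])  # reset gate
    u = sigmoid(dconv(xh, supports, params["theta_u"]) + params["b_u"])  # update gate
    xrh = np.concatenate([x_t, r * h_prev], axis=1)
    c = np.tanh(dconv(xrh, supports, params["theta_c"]) + params["b_c"])
    return u * h_prev + (1.0 - u) * c

if __name__ == "__main__":
    n, in_dim, hid, k = 3, 1, 2, 2
    supports = [np.eye(n)]   # placeholder transition matrix
    shape = (k, len(supports), in_dim + hid, hid)
    params = {name: np.zeros(shape) for name in ("theta_r", "theta_u", "theta_c")}
    params.update({name: np.zeros(hid) for name in ("b_r", "b_u", "b_c")})
    h = dcgru_cell(np.ones((n, in_dim)), np.ones((n, hid)), supports, params)
    print(h)
```

With all parameters at zero the gates sit at 0.5 and the candidate state is zero, so the cell simply halves the previous hidden state, which is a convenient sanity check before training.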
step 2.4, the traffic flow characteristics at the road gate corresponding to the current time are extracted in real time, and the DCRNN noise prediction is dynamically adjusted using the Kalman filtering method;
Kalman filtering uses the estimate of the previous state to predict the current state; the predicted value obtained in the prediction stage is then corrected using the observation of the current state, giving a new estimate that is closer to the true value; specifically, using the normalized traffic flow characteristics and the noise sequence, the Kalman gains for noise and drift are calculated through prior estimation and covariance matrix prediction, the covariance matrix is updated with the Kalman gain correction, and the noise value at the current moment is then corrected and predicted.
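The predict-and-correct cycle described in step 2.4 can be illustrated with a scalar Kalman filter. This is only a generic sketch: the model prediction is treated as the prior estimate and a current observation corrects it. How the normalized traffic flow characteristics enter the filter is not specified in enough detail here, so that coupling is left out, and the parameters `q` and `r` (process and measurement noise variances) are assumptions.

```python
def kalman_correct(pred, obs, p_prev, q=1e-2, r=1.0):
    """One scalar Kalman step.

    pred:   prior estimate (for example, the network's noise prediction, dB)
    obs:    observation of the current state (dB)
    p_prev: error covariance from the previous step
    Returns (corrected estimate, updated covariance, Kalman gain).
    """
    p_prior = p_prev + q                  # covariance prediction
    gain = p_prior / (p_prior + r)        # Kalman gain
    est = pred + gain * (obs - pred)      # correct the prior with the observation
    p_post = (1.0 - gain) * p_prior       # covariance update
    return est, p_post, gain

if __name__ == "__main__":
    estimate, cov, gain = kalman_correct(60.0, 64.0, p_prev=1.0, q=0.0, r=1.0)
    print(estimate, cov, gain)
```

A larger `r` makes the filter trust the model prediction more, while a larger `q` shifts weight toward the observation.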
4. The DTW-DCRNN-based chemical industry park noise prediction method according to claim 1, wherein the specific process of step 3 is as follows:
historical data of each monitoring station are extracted and the spatio-temporal relationship of the data is reconstructed with the improved DTW algorithm; the noise prediction network model DTW-DCRNN is trained, with 70% of the data set selected as the training set, 10% as the validation set, and the remaining 20% as the test set; in addition, training results are not recorded during the first 60 epochs; from epoch 60 to epoch 100, the training results are recorded and the model parameters output once every 10 epochs; after epoch 100, the training results are recorded and the model parameters output once every 5 epochs; finally, training is stopped early according to the measured validation loss, so that the model is captured when it is about to overfit.
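The data split and recording schedule in claim 4 can be sketched as follows. The chronological (unshuffled) split, the patience-based early-stopping rule and all function names are assumptions for this example; the claim only states the 70/10/20 proportions, the epoch thresholds, and that training stops early based on validation loss.

```python
def split_indices(n):
    """Chronological 70% / 10% / 20% split of n samples (assumed unshuffled)."""
    n_train = n * 70 // 100
    n_val = n * 10 // 100
    return (range(0, n_train),
            range(n_train, n_train + n_val),
            range(n_train + n_val, n))

def should_record(epoch):
    """Recording schedule: nothing before epoch 60, every 10 epochs
    from 60 to 100, every 5 epochs after 100."""
    if epoch < 60:
        return False
    if epoch <= 100:
        return epoch % 10 == 0
    return epoch % 5 == 0

def early_stop_epoch(val_losses, patience=10):
    """Return the epoch at which training would stop, or None.

    Stops once the validation loss has not improved for `patience`
    consecutive epochs (patience is a hypothetical parameter).
    """
    best = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return None

if __name__ == "__main__":
    train, val, test = split_indices(1000)
    print(len(train), len(val), len(test))
    print([e for e in range(120) if should_record(e)])
```

Splitting by position rather than at random keeps future readings out of the training set, which is the usual precaution for time-series forecasting.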
CN202211238200.1A 2022-10-11 2022-10-11 A Noise Prediction Method for Chemical Industry Park Based on DTW-DCRNN Pending CN115659609A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211238200.1A CN115659609A (en) 2022-10-11 2022-10-11 A Noise Prediction Method for Chemical Industry Park Based on DTW-DCRNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211238200.1A CN115659609A (en) 2022-10-11 2022-10-11 A Noise Prediction Method for Chemical Industry Park Based on DTW-DCRNN

Publications (1)

Publication Number Publication Date
CN115659609A true CN115659609A (en) 2023-01-31

Family

ID=84987374

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211238200.1A Pending CN115659609A (en) 2022-10-11 2022-10-11 A Noise Prediction Method for Chemical Industry Park Based on DTW-DCRNN

Country Status (1)

Country Link
CN (1) CN115659609A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination