CN114428937A - Time sequence data prediction method based on space-time diagram neural network - Google Patents

Time sequence data prediction method based on space-time diagram neural network

Info

Publication number
CN114428937A
CN114428937A (application CN202111508244.7A)
Authority
CN
China
Prior art keywords
time
series data
matrix
space
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111508244.7A
Other languages
Chinese (zh)
Inventor
谢非
杨嘉乐
张瑞
凌旭
李群召
刘畅
郑鹏飞
夏光圣
章悦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Normal University
Original Assignee
Nanjing Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Normal University filed Critical Nanjing Normal University
Priority to CN202111508244.7A
Publication of CN114428937A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Algebra (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a time-series data prediction method based on a space-time graph neural network, comprising the following steps: collecting road traffic data and generating time-series data; generating input features from the spatial features of the sensors; capturing temporal features to generate a time similarity matrix; mapping that matrix to the whole network to generate a global space-time correlation adjacency matrix, and from it a combined space-time correlation adjacency matrix; feeding the input features and the combined space-time correlation adjacency matrix into a graph neural network for feature extraction; training the model with a Huber loss function to predict the traffic flow distribution; and summarizing the traffic flow distribution. The method can update and predict the collected traffic flow data in real time, capture the latent temporal and spatial relations among different traffic flow data, and obtain global information about the road traffic network, for use in monitoring and managing an intelligent transportation system.

Description

Time sequence data prediction method based on space-time diagram neural network
Technical Field
The invention relates to the technical fields of deep learning, spatio-temporal prediction, and graph neural networks, and in particular to a time-series data prediction method based on a space-time graph neural network.
Background
Spatio-temporal prediction is a major focus of current artificial-intelligence research. It is widely applied to spatio-temporal data in daily life and production, such as weather forecasting, flood forecasting, and stock prediction, and in particular to traffic prediction in the field of intelligent transportation systems (ITS).
In recent years, with the rapid rise of graph neural networks, their application to traffic prediction has made considerable progress. Networks such as STGCN, ASTGCN, Graph WaveNet, and STSGCN use graph neural networks to build spatial and temporal models and capture the temporal and spatial features of the corresponding time-series data. During feature extraction, these networks use separate spatial and temporal adjacency matrices to represent the temporal and spatial relations between nodes, then perform graph convolution, and finally output a prediction. However, the existing methods and models have limitations: (1) increasingly complex road traffic structures make spatial feature extraction difficult; (2) generating the temporal and spatial adjacency matrices independently to capture node features ignores the joint temporal-spatial relations between nodes; (3) traffic flow data are complex and variable, making it very difficult to capture their global characteristics synchronously and dynamically.
At present, because spatio-temporal prediction involves large data volumes and demands high accuracy and real-time performance, existing network models and adjacency-matrix feature-extraction methods are not well suited to the task. A new technical solution is needed to solve these problems.
Disclosure of Invention
Purpose of the invention: to address the difficulty existing deep-learning networks have in predicting time-series data quickly and accurately, a time-series data prediction method based on a space-time graph neural network is provided, solving the problems of large data volume and high real-time and accuracy requirements in spatio-temporal prediction tasks.
Technical scheme: to achieve the above purpose, the invention provides a time-series data prediction method based on a space-time graph neural network, comprising the following steps:
S1: sensors collect road traffic flow data; the data are preprocessed to generate the corresponding time-series data;
S2: spatial features of the sensors are extracted from their different geographic positions to generate input features, and a time-similarity algorithm captures latent temporal features between different time series to generate a time similarity matrix;
S3: the time similarity matrix is mapped to the whole network to generate a global space-time correlation adjacency matrix, from which a combined space-time correlation adjacency matrix is further generated;
S4: the input time-series data are truncated to different lengths by setting different time steps; the input features and the combined space-time correlation adjacency matrix are fed into the corresponding stacked graph convolution layers for feature extraction, and the model is trained by back-propagation with a Huber loss function;
S5: the trained model predicts the time-series data of the next time period, the traffic flow distribution of the next period is output from the prediction, and the traffic flow distribution over the whole day is summarized.
Further, the step S1 is specifically:
A1: sensors are installed at different road junctions to count the number of vehicles passing that road segment within a time period, and the data are stored; each stored file contains the collection start time, the collection end time, and the vehicle-count statistics;
A2: the raw data are preprocessed to generate a time-series data vector set V = (v_1, v_2, …, v_n), where v_f (f = 1, 2, …, n) is the vehicle count collected by the f-th sensor over the month, and n is the total number of sensors used.
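As a minimal sketch of step A2, the per-sensor count sequences can be stacked into the vector set V; the list-of-lists input layout here is an assumption, not the patent's file format:

```python
import numpy as np

def build_vector_set(raw_counts):
    """Stack per-sensor vehicle-count sequences into the vector set
    V = (v_1, v_2, ..., v_n): one row per sensor, one column per interval."""
    return np.asarray(raw_counts, dtype=float)

# Toy example: two sensors, three counting intervals each.
V = build_vector_set([[10, 12, 9], [11, 13, 8]])
```

Each row of `V` then plays the role of one time series v_f in the steps that follow.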
Further, the step S2 is specifically:
B1: each sensor is numbered consistently with the road on which it is located, and each number is unique; the input features X of the sensors are then generated from these numbers;
B2: for any two time series v_i and v_j in the vector set V = (v_1, v_2, …, v_n), their time similarity is calculated;
B3: the two time series with the highest similarity are selected according to the calculated similarity results;
B4: denote any one time series in V as v_a, where a is its coordinate in the vector set V; when i = a, the time series v_b with the highest similarity to v_a is obtained, and a time similarity matrix M_sim is constructed by setting the element at coordinate (a, b) of M_sim to 1; traversing the whole vector set completes the construction of M_sim.
Further, in step B2 the time similarity is calculated by formulas (1), (2) and (3):

\bar{v}_i = \frac{1}{M} \sum_{m=1}^{M} v_{(i,m)}    (1)

\bar{v}_j = \frac{1}{M} \sum_{m=1}^{M} v_{(j,m)}    (2)

where m is the index of a time interval, M is the total number of time intervals (m ranges from 1 to M), v_{(i,m)} is the vehicle count in the m-th interval of the i-th time-series vector, v_{(j,m)} is the vehicle count in the m-th interval of the j-th time-series vector, \bar{v}_i is the mean vehicle count of the i-th vector, and \bar{v}_j is the mean vehicle count of the j-th vector;

\mathrm{Correlation}(v_i, v_j) = \frac{\sum_{m=1}^{M} (v_{(i,m)} - \bar{v}_i)(v_{(j,m)} - \bar{v}_j)}{\sqrt{\sum_{m=1}^{M} (v_{(i,m)} - \bar{v}_i)^2} \sqrt{\sum_{m=1}^{M} (v_{(j,m)} - \bar{v}_j)^2}}    (3)

where Correlation(v_i, v_j) denotes the time similarity between v_i and v_j.
Further, in step B3 the two time series with the highest similarity are selected by formula (4):

\max \{ \mathrm{Correlation}(v_i, v_j) \}    (4)

where max denotes taking the maximum value.
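Steps B2–B4 can be sketched as follows: formula (3) is the Pearson-style correlation over the M intervals, formula (4) picks each series' best match, and M_sim records the match with a 1. The exclusion of a series matching itself is an assumption made so that argmax returns a distinct partner:

```python
import numpy as np

def correlation(v_i, v_j):
    """Time similarity of formulas (1)-(3): centred cross-products
    normalised by the two standard deviations."""
    vi_bar, vj_bar = v_i.mean(), v_j.mean()
    num = np.sum((v_i - vi_bar) * (v_j - vj_bar))
    den = np.sqrt(np.sum((v_i - vi_bar) ** 2) * np.sum((v_j - vj_bar) ** 2))
    return num / den

def similarity_matrix(V):
    """Step B4: for each series v_a, set M_sim[a, b] = 1 where v_b is
    the most similar other series per formula (4)."""
    n = len(V)
    M_sim = np.zeros((n, n))
    for a in range(n):
        scores = [correlation(V[a], V[b]) if b != a else -np.inf
                  for b in range(n)]
        M_sim[a, int(np.argmax(scores))] = 1
    return M_sim

# Toy example: series 0 and 1 are perfectly correlated, series 2 is reversed.
V = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [4., 3., 2., 1.]])
M_sim = similarity_matrix(V)
```

With this data, rows 0 and 1 mark each other as most similar.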
Further, the step S3 is specifically:
C1: sensors that are geographically adjacent are marked 1 in the matrix and non-adjacent sensors are marked 0, giving the corresponding adjacency matrix A;
C2: an identity matrix M_con of the same size as A is used to further represent the latent temporal and spatial relation of each sensor to itself;
C3: the adjacency matrix A, the identity matrix M_con, and the time similarity matrix M_sim are fused to generate the global space-time correlation adjacency matrix A_cor;
C4: two global space-time correlation adjacency matrices A_cor are used to generate the combined space-time correlation adjacency matrix A_cor·group, further expanding the range over which spatio-temporal features of the time-series data are extracted.
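The patent does not spell out the fusion operator of step C3 or the exact combination of step C4, so the sketch below makes two assumptions: A_cor is taken as the elementwise union of A, M_con, and M_sim, and A_cor·group as a 2×2 block tiling of two A_cor copies. Both are illustrative readings of the description, not the patent's precise formulas:

```python
import numpy as np

def global_correlation(A, M_sim):
    """Assumed fusion for A_cor: union of geographic adjacency A,
    the identity M_con (self-relation), and the similarity M_sim."""
    M_con = np.eye(A.shape[0])
    return np.clip(A + M_con + M_sim, 0, 1)

def combined_correlation(A_cor):
    """Assumed construction of A_cor_group from two copies of A_cor,
    arranged as a 2x2 block matrix."""
    return np.block([[A_cor, A_cor], [A_cor, A_cor]])

A = np.array([[0., 1.], [1., 0.]])        # two adjacent sensors
M_sim = np.array([[0., 1.], [1., 0.]])    # each is the other's best match
A_cor = global_correlation(A, M_sim)
A_cor_group = combined_correlation(A_cor)
```

The block form doubles the receptive field of one graph convolution, which is consistent with "expanding the space-time feature extraction range".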
Further, the step S4 is specifically:
D1: one hour is divided evenly into 12 time intervals of 5 minutes each;
D2: by setting the time step, the time-series data in the input features X are truncated by formula (5) to lengths of 12, 9, and 6 time intervals in turn, generating time-series slices of 3 different lengths:

h^{(l)}_{input} = h^{(l-1)}_{output}[:, 0 : T - l \times (K+1), :, :]    (5)

where h denotes the state of a graph convolution layer, l denotes the layer number (up to l = 3 layers), input denotes an input, output denotes an output, h^{(l)}_{input} is the input of the l-th graph convolution layer, h^{(l-1)}_{output} is the output of the (l-1)-th graph convolution layer, T is the total number of time intervals (T = 12), K is the size of the combined space-time correlation adjacency matrix A_cor·group (K = 4), d is the index of a time-series slice (d = 0, 1, 2), C is the number of extracted features (C = 3), R denotes the set of real numbers, [:] denotes taking all elements in a dimension, and [:, 0 : T - l \times (K+1), :, :] denotes taking elements 0 through T - l \times (K+1) in dimension 1;
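The truncation of formula (5) can be sketched as a slice along the time axis. The 3-D layout (nodes, time, features) is an assumption for simplicity, and the sketch follows the formula literally even though the worked example in the text mentions slices of 12, 9 and 6 intervals:

```python
import numpy as np

def truncate(h_prev, l, T=12, K=4):
    """Formula (5) as a time-axis slice: layer l keeps the first
    T - l*(K + 1) of the T intervals of the previous layer's output."""
    return h_prev[:, 0:T - l * (K + 1), :]

# Toy example: 5 sensor nodes, T = 12 intervals, 3 features.
h0 = np.zeros((5, 12, 3))
h1 = truncate(h0, l=1)
```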
D3: graph convolution layers are stacked according to the different time-series slices; the truncated time-series segments and the combined space-time correlation adjacency matrix A_cor·group are fed into the corresponding stacked graph convolution layers for feature extraction, the operation of each layer being given by formula (6):

STCGNN(h^{(l-1)}) = h^{(l)} = \sigma(A_{cor \cdot group} \, h^{(l-1)} W + b)    (6)

where W is the weight learning parameter of the graph neural network, b is its bias parameter, \sigma denotes the ReLU activation function, STCGNN denotes the space-time correlation graph neural network, and STCGNN(h^{(l-1)}) denotes the state of the (l-1)-th graph convolution layer of the network;
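A single layer of formula (6) is a standard graph convolution. The sketch below uses NumPy with assumed 2-D shapes (h_prev as nodes × features, W as features × output features); the patent's actual tensors carry extra slice and time dimensions:

```python
import numpy as np

def gcn_layer(A_cor_group, h_prev, W, b):
    """One layer of formula (6): h_l = ReLU(A @ h_(l-1) @ W + b),
    where A is the combined space-time correlation adjacency matrix."""
    return np.maximum(A_cor_group @ h_prev @ W + b, 0.0)

# Toy example: identity adjacency and identity weights reduce to ReLU(h).
A = np.eye(2)
h = np.array([[1., -1.], [2., 0.]])
out = gcn_layer(A, h, np.eye(2), 0.0)
```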
D4: feature extraction is performed on the time-series segments by the stacked graph convolution layers, and the outputs of the last layer are concatenated to form new time-series data u;
D5: the temporal and spatial features of u are further aggregated by formula (7) to obtain the predicted time-series data matrix Y:

Y = h^{out}_{agg} = \max(u)    (7)

where h_{agg} denotes the maximum-aggregation operation, out denotes the output, and max denotes taking the maximum value;
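The max-aggregation of formula (7) can be sketched as an elementwise maximum over the concatenated layer outputs u; the axis of aggregation (the stacking axis) is an assumption:

```python
import numpy as np

def aggregate(u):
    """Formula (7): elementwise max over the stacked outputs,
    yielding the predicted time-series matrix Y."""
    return np.max(u, axis=0)

# Toy example: two stacked 2x2 outputs.
u = np.array([[[1., 2.], [3., 4.]],
              [[2., 1.], [0., 5.]]])
Y = aggregate(u)
```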
D6: combining the mean squared error and the mean absolute error gives the Huber loss function shown in formula (8):

L_\delta(Y, \hat{Y}) = \begin{cases} \frac{1}{2}(Y - \hat{Y})^2, & |Y - \hat{Y}| \le \delta \\ \delta |Y - \hat{Y}| - \frac{1}{2}\delta^2, & |Y - \hat{Y}| > \delta \end{cases}    (8)

where Y denotes the real time-series data, \hat{Y} denotes the predicted time-series data, \delta is the threshold for judging the error, and L_\delta(Y, \hat{Y}) denotes the Huber loss between Y and \hat{Y};
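Formula (8) is the standard Huber loss; a minimal sketch, averaging over all entries (the averaging is an assumption, as the patent leaves the reduction unstated):

```python
import numpy as np

def huber(y_true, y_pred, delta=1.0):
    """Huber loss of formula (8): quadratic for residuals within
    delta, linear beyond it, averaged over all entries."""
    r = np.abs(y_true - y_pred)
    quadratic = 0.5 * r ** 2
    linear = delta * r - 0.5 * delta ** 2
    return np.mean(np.where(r <= delta, quadratic, linear))
```

Small residuals are thus penalized like a mean squared error, while large residuals (e.g. traffic-count outliers) grow only linearly, which is the robustness property the description relies on.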
D7: based on the result computed by formula (8), the error is back-propagated along the direction of steepest descent, the weight learning parameter W and bias parameter b in formula (6) are updated, and the model is saved.
In the field of spatio-temporal prediction of time-series data, the invention provides a prediction method based on a space-time graph neural network, so that traffic-flow time series can be predicted with high real-time performance and high accuracy; the method can be widely applied to the monitoring and management of intelligent transportation systems.
The method constructs the space-time correlation graph neural network STCGNN, which not only captures the latent correlation between time and space but also extracts the global features of the time-series data. The method can process a large amount of data in a short time with a small computational load. Experimental results on a traffic data set demonstrate the superiority and effectiveness of the method, which can be widely applied to the monitoring and management of intelligent transportation systems.
The method independently designs the time-similarity algorithm, which can capture the global features of the time-series data quickly and accurately and is highly robust to strong fluctuations and outliers. At the same time, the space-time correlation graph neural network STCGNN and the combined space-time correlation adjacency matrix A_cor·group are also designed; these can extract the fused spatio-temporal features of the sensor nodes and dynamically and efficiently capture the global spatio-temporal features of the road traffic network.
The method can update and predict the collected traffic flow data in real time, capture the latent temporal and spatial relations among different traffic flow data, and obtain the global information of the road traffic network for the monitoring and management of intelligent transportation systems. The method has low time consumption, small computational complexity, high real-time performance, and high accuracy, and is also applicable to other spatio-temporal prediction tasks such as weather forecasting and stock prediction.
Beneficial effects: compared with the prior art, the invention designs a time-similarity algorithm to calculate the temporal correlation of time-series data and constructs a new combined space-time correlation adjacency matrix A_cor·group. On a traffic flow data set it far outperforms existing methods in mean absolute error, mean absolute percentage error, and root mean square error, and it can dynamically capture the features of the time-series data and track their changes in real time, so the method is robust.
Drawings
FIG. 1 is a schematic workflow diagram of the time-series data prediction method based on a space-time graph neural network provided by an embodiment of the invention;
FIG. 2 shows the node spatio-temporal relationship graph and the global space-time correlation matrix A_cor provided by an embodiment of the invention;
FIG. 3 shows the structure of the combined space-time correlation matrix A_cor·group provided by an embodiment of the invention;
FIG. 4 is a schematic diagram of the time-series data truncation method provided by an embodiment of the invention;
FIG. 5 is a structural diagram of the space-time correlation graph neural network provided by an embodiment of the invention;
FIG. 6 compares the real traffic flow data and the predicted traffic flow data provided by an embodiment of the invention.
Detailed Description
The present invention is further illustrated by the following figures and specific examples, which are to be understood as illustrative only and not as limiting the scope of the invention, which is to be given the full breadth of the appended claims and any and all equivalent modifications thereof which may occur to those skilled in the art upon reading the present specification.
The invention provides a time-series data prediction method based on a space-time graph neural network, whose basic principle is as follows: after the sensors collect traffic flow data, a deep-learning network based on space-time graph convolution is used for prediction, capturing the latent correlation between time and space and obtaining the global spatio-temporal features of the time-series data; the input features X and the combined space-time correlation adjacency matrix A_cor·group are fed into the space-time correlation graph neural network STCGNN for feature extraction, and the prediction result is finally output.
Based on the above, the present embodiment applies the time series data prediction method to the traffic flow prediction of the intelligent transportation system, and with reference to fig. 1, the method includes the following steps:
Step 1: sensors collect road traffic flow data; the data are preprocessed to generate the corresponding time-series data;
Step 2: spatial features of the sensors are extracted from their geographic positions to generate input features [see Chang Wei, Research on traffic flow prediction methods based on spatio-temporal graph neural networks [D], Zhejiang University, 2020: 21-22], and a time-similarity calculation method captures latent temporal features between different time series, generating a time similarity matrix;
Step 3: the time similarity matrix is mapped to the whole network to generate a global space-time correlation adjacency matrix, from which a combined space-time correlation adjacency matrix is further generated;
Step 4: the input time-series data are truncated to different lengths by setting different time steps; the input features and the combined space-time correlation adjacency matrix are fed into the corresponding stacked graph convolution layers for feature extraction, and the model is trained by back-propagation with a Huber loss function [see Shu Botian, The p-Huber loss function and its robustness [D], Zhejiang Normal University, 2021: 11];
Step 5: the trained model predicts the time-series data of the next hour, the traffic flow distribution of the next hour is output from the prediction, and the traffic flow distribution over the whole day is summarized.
Step 1 in this embodiment specifically includes the following processes:
Step 1.1: sensors are installed at different road junctions to count the number of vehicles passing that road segment within one month, and the data are saved as txt text files;
Step 1.2: each stored file contains the collection start time, the collection end time, and the vehicle-count statistics;
Step 1.3: the raw data are preprocessed to generate the time-series data vector set V = (v_1, v_2, …, v_n), where v_f (f = 1, 2, …, n) is the vehicle count collected by the f-th sensor over the month, and n is the total number of sensors used.
In this embodiment, the step 2 specifically includes the following steps:
Step 2.1: each sensor is numbered consistently with the road on which it is located, and each number is unique; the input features X are then generated from these numbers.
Step 2.2: for any two time series v_i and v_j in the vector set V = (v_1, v_2, …, v_n), the time similarity is calculated by formulas (1), (2) and (3):
\bar{v}_i = \frac{1}{M} \sum_{m=1}^{M} v_{(i,m)}    (1)

\bar{v}_j = \frac{1}{M} \sum_{m=1}^{M} v_{(j,m)}    (2)

where m is the index of a time interval, M is the total number of time intervals (m ranges from 1 to M), v_{(i,m)} is the vehicle count in the m-th interval of the i-th time-series vector, v_{(j,m)} is the vehicle count in the m-th interval of the j-th time-series vector, \bar{v}_i is the mean vehicle count of the i-th vector, and \bar{v}_j is the mean vehicle count of the j-th vector;

\mathrm{Correlation}(v_i, v_j) = \frac{\sum_{m=1}^{M} (v_{(i,m)} - \bar{v}_i)(v_{(j,m)} - \bar{v}_j)}{\sqrt{\sum_{m=1}^{M} (v_{(i,m)} - \bar{v}_i)^2} \sqrt{\sum_{m=1}^{M} (v_{(j,m)} - \bar{v}_j)^2}}    (3)

where Correlation(v_i, v_j) denotes the time similarity between v_i and v_j;
Step 2.3: according to the time-similarity results of formula (3), the two time series with the highest similarity are selected by formula (4):

\max \{ \mathrm{Correlation}(v_i, v_j) \}    (4)

where max denotes taking the maximum value;
Step 2.4: when i = a, the time series v_b with the highest similarity to v_a is obtained from formulas (3) and (4); according to the coordinates a and b of v_a and v_b in the vector set V, the element at coordinate (a, b) of the matrix is set to 1; traversing all the time-series data yields a new matrix, called the time similarity matrix M_sim.
Step 3 in this embodiment specifically includes the following processes:
Step 3.1: sensors that are geographically adjacent are marked 1 in the matrix and non-adjacent sensors are marked 0, giving the corresponding adjacency matrix A [see Chang Wei, op. cit.].
Step 3.2: an identity matrix M_con of the same size as A is used to further represent the latent temporal and spatial relation of each sensor to itself;
Step 3.3: the adjacency matrix A, the identity matrix M_con, and the time similarity matrix M_sim are fused to generate the global space-time correlation adjacency matrix A_cor shown in FIG. 2;
Step 3.4: two global space-time correlation adjacency matrices A_cor are used to generate the combined space-time correlation adjacency matrix A_cor·group shown in FIG. 3, further expanding the range over which spatio-temporal features of the time-series data are extracted.
step 4 in this embodiment specifically includes the following processes:
Step 4.1: one hour is divided evenly into 12 time intervals of 5 minutes each;
Step 4.2: by setting the time step, the time-series data in the input features X are truncated by formula (5) to lengths of 12, 9, and 6 time intervals in turn, generating time-series slices of 3 different lengths, as shown in FIG. 4:

h^{(l)}_{input} = h^{(l-1)}_{output}[:, 0 : T - l \times (K+1), :, :]    (5)

where h denotes the state of a graph convolution layer, l denotes the layer number (up to l = 3 layers), input denotes an input, output denotes an output, h^{(l)}_{input} is the input of the l-th graph convolution layer, h^{(l-1)}_{output} is the output of the (l-1)-th graph convolution layer, T is the total number of time intervals (T = 12), K is the size of the combined space-time correlation adjacency matrix A_cor·group (K = 4), d is the index of a time-series slice (d = 0, 1, 2), C is the number of extracted features (C = 3), R denotes the set of real numbers, [:] denotes taking all elements in a dimension, and [:, 0 : T - l \times (K+1), :, :] denotes taking elements 0 through T - l \times (K+1) in dimension 1;
Step 4.3: graph convolution layers are stacked according to the different time-series slices [see Chang Wei, Research on traffic flow prediction methods based on spatio-temporal graph neural networks [D], Zhejiang University, 2020: 21-22]; the truncated time-series segments and the combined space-time correlation adjacency matrix A_cor·group are fed into the corresponding stacked graph convolution layers for feature extraction, the operation of each layer being given by formula (6):

STCGNN(h^{(l-1)}) = h^{(l)} = \sigma(A_{cor \cdot group} \, h^{(l-1)} W + b)    (6)

where W is the weight learning parameter of the graph neural network, b is its bias parameter, \sigma denotes the ReLU activation function, and STCGNN denotes the space-time correlation graph neural network, whose structure is shown in FIG. 5; STCGNN(h^{(l-1)}) denotes the state of the (l-1)-th graph convolution layer of the network;
Step 4.4: feature extraction is performed on the time-series segments by the stacked graph convolution layers, and the outputs of the last layer are concatenated to form new time-series data u;
Step 4.5: the temporal and spatial features of u are further aggregated by formula (7) to obtain the predicted time-series data matrix Y:

Y = h^{out}_{agg} = \max(u)    (7)

where h_{agg} denotes the maximum-aggregation operation, out denotes the output, and max denotes taking the maximum value;
Step 4.6: combining the mean squared error and the mean absolute error gives the Huber loss function [see Shu Botian, The p-Huber loss function and its robustness [D], Zhejiang Normal University, 2021: 11] shown in formula (8):

L_\delta(Y, \hat{Y}) = \begin{cases} \frac{1}{2}(Y - \hat{Y})^2, & |Y - \hat{Y}| \le \delta \\ \delta |Y - \hat{Y}| - \frac{1}{2}\delta^2, & |Y - \hat{Y}| > \delta \end{cases}    (8)

where Y denotes the real time-series data, \hat{Y} denotes the predicted time-series data, \delta is the threshold for judging the error, and L_\delta(Y, \hat{Y}) denotes the Huber loss between Y and \hat{Y};
Step 4.7: based on the result computed by formula (8), the error is back-propagated along the direction of steepest descent, the values of the weight learning parameter W and bias parameter b in formula (6) are updated, and the model is saved.
Step 5 in this embodiment specifically includes the following processes:
step 5.1: predicting time series data of the next hour by using the obtained trained model;
step 5.2: according to the predicted time series data, the traffic flow distribution situation of the next hour is output, the traffic flow distribution situation of the whole day is finally summarized, and a comparison graph of the real traffic flow data and the predicted traffic flow data is shown in fig. 6.
It can be seen that, on the traffic flow data set, the method of this embodiment achieves a mean absolute error of 15.66, a mean absolute percentage error of 14.69%, and a root mean square error of 27.06, figures far ahead of the other methods, verifying the effectiveness of the method of the invention.
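The three reported figures are the standard regression error metrics; a minimal sketch of how such metrics are computed (toy data, not the experiment's data set):

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error."""
    return np.mean(np.abs(y - y_hat))

def mape(y, y_hat):
    """Mean absolute percentage error, in percent."""
    return np.mean(np.abs((y - y_hat) / y)) * 100

def rmse(y, y_hat):
    """Root mean square error."""
    return np.sqrt(np.mean((y - y_hat) ** 2))

y = np.array([10.0, 20.0])       # true counts (toy values)
y_hat = np.array([12.0, 18.0])   # predicted counts (toy values)
```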
The embodiment also provides a time series data prediction system based on the space-time diagram neural network, which comprises a network interface, a memory and a processor; the network interface is used for receiving and sending signals in the process of receiving and sending information with other external network elements; a memory for storing computer program instructions executable on the processor; a processor for executing the steps of the prediction method when executing the computer program instructions.
The present embodiment also provides a computer storage medium storing a computer program that when executed by a processor can implement the method described above. The computer-readable medium may be considered tangible and non-transitory. Non-limiting examples of a non-transitory tangible computer-readable medium include a non-volatile memory circuit (e.g., a flash memory circuit, an erasable programmable read-only memory circuit, or a mask read-only memory circuit), a volatile memory circuit (e.g., a static random access memory circuit or a dynamic random access memory circuit), a magnetic storage medium (e.g., an analog or digital tape or hard drive), and an optical storage medium (e.g., a CD, DVD, or blu-ray disc), among others. The computer program includes processor-executable instructions stored on at least one non-transitory tangible computer-readable medium. The computer program may also comprise or rely on stored data. The computer programs may include a basic input/output system (BIOS) that interacts with the hardware of the special purpose computer, a device driver that interacts with specific devices of the special purpose computer, one or more operating systems, user applications, background services, background applications, and the like.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Claims (7)

1. A time series data prediction method based on a space-time diagram neural network is characterized by comprising the following steps:
S1: collecting road traffic flow data by sensors, preprocessing the data, and generating corresponding time-series data;
S2: extracting spatial features of the sensors according to their different geographic positions to generate input features, and capturing potential temporal features between different time-series data by using a time similarity algorithm to generate a time similarity matrix;
S3: mapping the time similarity matrix globally to generate a global spatio-temporal correlation adjacency matrix, and further generating a combined spatio-temporal correlation adjacency matrix;
S4: intercepting the input time-series data at different time step lengths, feeding the input features and the combined spatio-temporal correlation adjacency matrix into the corresponding stacked graph convolution layers for feature extraction, and training the model by back propagation with a Huber loss function;
S5: predicting the time-series data of the next time period with the trained model, outputting the traffic flow distribution of the next time period according to the predicted time-series data, and summarizing the traffic flow distribution over the whole day.
2. The method for predicting time-series data based on a space-time diagram neural network according to claim 1, wherein the step S1 specifically comprises:
A1: arranging sensors at different road junctions to count the number of vehicles passing the road section within each time period and storing the data, the stored files comprising the acquisition start time, the acquisition end time and the counted number of vehicles;
A2: preprocessing the raw data to generate a time-series data vector set V = (v_1, v_2, …, v_n), where v_f (f = 1, 2, …, n) is the number of vehicles collected by the f-th sensor during the month, and n is the total number of sensors used.
3. The method for predicting time-series data based on a space-time diagram neural network according to claim 2, wherein the step S2 specifically comprises:
B1: numbering each sensor consistently with the number of the road on which it is located, each sensor number being unique, and further generating the input features X of the sensors from the numbers;
B2: for any two pieces of time-series data v_i and v_j in the time-series data vector set V = (v_1, v_2, …, v_n), calculating the time similarity between the two;
B3: screening out the two pieces of time-series data with the highest similarity according to the calculated time similarity results;
B4: denoting any piece of time-series data in the time-series data vector set V as v_a, where a is the index of that piece in V; when i = a, obtaining the time-series data v_b with the highest similarity to v_a, constructing a time similarity matrix M_sim, setting the element at position (a, b) of M_sim to 1, and traversing the whole time-series data vector set to complete the construction of M_sim.
4. The method for predicting time-series data based on a space-time diagram neural network according to claim 3, wherein in step B2 the time similarity is calculated according to formula (1), formula (2) and formula (3):
\bar{v}_i = \frac{1}{M}\sum_{m=1}^{M} v_{(i,m)}  (1)

\bar{v}_j = \frac{1}{M}\sum_{m=1}^{M} v_{(j,m)}  (2)

wherein m denotes the index of a time interval, M denotes the total number of time intervals, m takes values from 1 to M, v_{(i,m)} is the number of vehicles counted in the m-th time interval of the i-th time-series data vector, v_{(j,m)} is the number of vehicles counted in the m-th time interval of the j-th time-series data vector, \bar{v}_i denotes the average number of vehicles of the i-th time-series data vector, and \bar{v}_j denotes the average number of vehicles of the j-th time-series data vector;

\mathrm{Correlation}(v_i,v_j) = \frac{\sum_{m=1}^{M}\left(v_{(i,m)}-\bar{v}_i\right)\left(v_{(j,m)}-\bar{v}_j\right)}{\sqrt{\sum_{m=1}^{M}\left(v_{(i,m)}-\bar{v}_i\right)^2}\,\sqrt{\sum_{m=1}^{M}\left(v_{(j,m)}-\bar{v}_j\right)^2}}  (3)

wherein Correlation(v_i, v_j) denotes the time similarity between v_i and v_j.
5. The method according to claim 3, wherein in step B3 the two pieces of time-series data with the highest similarity are screened by formula (4):

max{Correlation(v_i, v_j)}  (4)

wherein max denotes taking the maximum value.
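The screening of claims 3 to 5 (formulas (1) to (4)) amounts to computing pairwise Pearson correlations and, for each series, marking its most similar partner in the binary matrix M_sim. A minimal NumPy sketch (the function name and the row-per-sensor array layout are assumptions, not part of the claims):

```python
import numpy as np

def time_similarity_matrix(V):
    """V: (n, M) array, row f = vehicle counts of sensor f over M intervals.
    Returns the binary time-similarity matrix M_sim of claim 3."""
    n = V.shape[0]
    C = np.corrcoef(V)            # pairwise Pearson correlation, formulas (1)-(3)
    np.fill_diagonal(C, -np.inf)  # exclude self-similarity before taking the max
    M_sim = np.zeros((n, n))
    for a in range(n):
        b = int(np.argmax(C[a]))  # formula (4): most similar partner of v_a
        M_sim[a, b] = 1
    return M_sim

# hypothetical counts for three sensors over four intervals
V = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],    # perfectly correlated with row 0
              [4., 3., 2., 1.]])   # anti-correlated with row 0
M_sim = time_similarity_matrix(V)
```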
6. The method for predicting time-series data based on a space-time diagram neural network according to claim 1, wherein the step S3 specifically comprises:
C1: representing sensors that are geographically adjacent by 1 in the matrix and sensors that are not adjacent by 0, thereby obtaining the corresponding adjacency matrix A;
C2: using an identity matrix M_con of the same size as the adjacency matrix A to further represent the potential temporal and spatial relationships between the sensors;
C3: fusing the adjacency matrix A, the identity matrix M_con and the time similarity matrix M_sim to generate a global spatio-temporal correlation adjacency matrix A_cor;
C4: using two global spatio-temporal correlation adjacency matrices A_cor to generate a combined spatio-temporal correlation adjacency matrix A_cor·group.
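Steps C1 to C4 leave the exact fusion and combination rules open; one plausible reading, sketched below under explicitly flagged assumptions (element-wise OR for the fusion in C3, a block-diagonal arrangement of two copies for C4), is:

```python
import numpy as np

def fuse_adjacency(A, M_sim):
    """Fuse geographic adjacency A, identity M_con, and time-similarity M_sim
    into a global spatio-temporal adjacency A_cor (steps C1-C3).
    The element-wise OR fusion is an assumption; the claim only says 'fuse'."""
    M_con = np.eye(A.shape[0])
    return np.clip(A + M_con + M_sim, 0, 1)

def group_adjacency(A_cor):
    """Step C4: combine two copies of A_cor into A_cor_group.
    A block-diagonal arrangement is assumed here."""
    n = A_cor.shape[0]
    A_group = np.zeros((2 * n, 2 * n))
    A_group[:n, :n] = A_cor
    A_group[n:, n:] = A_cor
    return A_group

A = np.array([[0., 1.], [1., 0.]])       # two mutually adjacent sensors
M_sim = np.array([[0., 1.], [1., 0.]])   # each is the other's most similar series
A_cor = fuse_adjacency(A, M_sim)
A_group = group_adjacency(A_cor)
```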
7. The method for predicting time-series data based on a space-time diagram neural network according to claim 1, wherein the step S4 specifically comprises:
D1: dividing one hour evenly into 12 time intervals, taking 5 minutes as one time interval;
D2: by setting the time step length, intercepting the time-series data in the input features X according to formula (5) with time-interval lengths of 12, 9 and 6 in turn, generating time-series data slices of 3 different durations:
h_{input}^{(l)} = h_{output}^{(l-1)}[:,\,0:T-l\times(K+1),\,:,\,:]  (5)

wherein h denotes a graph convolution layer, l denotes the layer number, input denotes input and output denotes output, h_{input}^{(l)} denotes the input of the l-th graph convolution layer, h_{output}^{(l-1)} denotes the output of the (l-1)-th graph convolution layer, T is the total number of time intervals and T = 12, K is the size of the combined spatio-temporal correlation adjacency matrix A_cor·group and K = 4, d is the number of time-series data slices and d = 0, 1, 2, C is the number of extracted features and C = 3, R denotes the set of real numbers, [:] denotes taking all elements along a dimension, and [:, 0:T-l×(K+1), :, :] denotes taking elements 0 to T-l×(K+1) along the 1st dimension;
D3: stacking graph convolution layers for the different time-series data slices, and feeding each intercepted time-series data segment together with the combined spatio-temporal correlation adjacency matrix A_cor·group into the corresponding stacked graph convolution layers for feature extraction, the operation of each graph convolution layer being expressed by formula (6):
\mathrm{STCGNN}(h^{(l-1)}) = h^{(l)} = \sigma\left(A_{cor\cdot group}\, h^{(l-1)} W + b\right)  (6)

wherein W is the weight learning parameter of the graph neural network, b is the bias parameter of the graph neural network, σ denotes the ReLU activation function, STCGNN denotes the spatio-temporal correlation graph neural network, and STCGNN(h^{(l-1)}) denotes the state of the (l-1)-th graph convolution layer of the spatio-temporal correlation graph neural network;
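A single layer of formula (6) reduces to two matrix products followed by a ReLU. A minimal NumPy sketch (the shapes, the placeholder identity adjacency, and the function name are assumptions made for illustration):

```python
import numpy as np

def stcgnn_layer(A_group, h, W, b):
    """One graph-convolution layer per formula (6):
    h_l = ReLU(A_cor_group @ h_{l-1} @ W + b)."""
    z = A_group @ h @ W + b
    return np.maximum(z, 0)  # ReLU activation (sigma in formula (6))

n, C_in, C_out = 4, 3, 8
A_group = np.eye(n)              # placeholder adjacency for the sketch
h = np.random.randn(n, C_in)     # stand-in node features
W = np.random.randn(C_in, C_out) # weight learning parameter
b = np.zeros(C_out)              # bias parameter
h_next = stcgnn_layer(A_group, h, W, b)
```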
D4: performing feature extraction on the time-series data slices with the stacked graph convolution layers, and concatenating the outputs of the last graph convolution layer to form new time-series data u;
D5: further aggregating the temporal and spatial features of the time-series data u according to formula (7) to obtain the predicted time-series data matrix Y:

Y = h_{agg}(u) = \max(u_{out})  (7)

wherein h_agg denotes the maximum aggregation operation, out denotes the output, and max denotes the max-value operation;
D6: combining the mean square error and the mean absolute error to obtain the Huber loss function shown in formula (8):
L_{\delta}(Y,\hat{Y}) = \begin{cases} \frac{1}{2}(Y-\hat{Y})^2, & |Y-\hat{Y}| \le \delta \\ \delta\,|Y-\hat{Y}| - \frac{1}{2}\delta^2, & |Y-\hat{Y}| > \delta \end{cases}  (8)

wherein Y denotes the real time-series data, \hat{Y} denotes the predicted time-series data, δ is the threshold for judging the error, and L_{\delta}(Y,\hat{Y}) denotes the Huber loss between Y and \hat{Y};
D7: propagating the error calculated by formula (8) backwards along the direction of minimum gradient, updating the values of the weight learning parameter W and the bias parameter b in formula (6), and saving the model.
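The piecewise Huber loss of formula (8), quadratic for small errors (mean-square behaviour) and linear for large ones (mean-absolute behaviour), can be sketched directly (a NumPy illustration; the function name and sample values are assumptions):

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    """Huber loss of formula (8): 0.5*e^2 when |e| <= delta,
    delta*|e| - 0.5*delta^2 otherwise, averaged over all elements."""
    err = np.abs(y_true - y_pred)
    quad = 0.5 * err ** 2
    lin = delta * err - 0.5 * delta ** 2
    return np.mean(np.where(err <= delta, quad, lin))

# one small error (quadratic branch) and one large error (linear branch)
loss = huber_loss(np.array([0.0, 0.0]), np.array([0.5, 3.0]), delta=1.0)
```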
CN202111508244.7A 2021-12-10 2021-12-10 Time sequence data prediction method based on space-time diagram neural network Pending CN114428937A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111508244.7A CN114428937A (en) 2021-12-10 2021-12-10 Time sequence data prediction method based on space-time diagram neural network

Publications (1)

Publication Number Publication Date
CN114428937A true CN114428937A (en) 2022-05-03

Family

ID=81312061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111508244.7A Pending CN114428937A (en) 2021-12-10 2021-12-10 Time sequence data prediction method based on space-time diagram neural network

Country Status (1)

Country Link
CN (1) CN114428937A (en)


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115114331A (en) * 2022-06-15 2022-09-27 国电南瑞科技股份有限公司 Power grid data flow connection method based on deep learning and parallel computing
CN115376318A (en) * 2022-08-22 2022-11-22 重庆邮电大学 Traffic data compensation method based on multi-attribute fusion neural network
CN115376318B (en) * 2022-08-22 2023-12-29 中交投资(湖北)运营管理有限公司 Traffic data compensation method based on multi-attribute fusion neural network
CN115755219A (en) * 2022-10-18 2023-03-07 长江水利委员会水文局 Flood forecast error real-time correction method and system based on STGCN
CN115755219B (en) * 2022-10-18 2024-04-02 长江水利委员会水文局 STGCN-based flood forecast error real-time correction method and system
CN116473514A (en) * 2023-03-29 2023-07-25 西安电子科技大学广州研究院 Parkinson's disease detection based on plantar pressure adaptive directed space-time graph neural network
CN116473514B (en) * 2023-03-29 2024-02-23 西安电子科技大学广州研究院 Parkinson disease detection method based on plantar pressure self-adaptive directed space-time graph neural network
CN117091799A (en) * 2023-10-17 2023-11-21 湖南一特医疗股份有限公司 Intelligent three-dimensional monitoring method and system for oxygen supply safety of medical center
CN117091799B (en) * 2023-10-17 2024-01-02 湖南一特医疗股份有限公司 Intelligent three-dimensional monitoring method and system for oxygen supply safety of medical center

Similar Documents

Publication Publication Date Title
CN114428937A (en) Time sequence data prediction method based on space-time diagram neural network
Hyandye et al. A Markovian and cellular automata land-use change predictive model of the Usangu Catchment
CN109492830B (en) Mobile pollution source emission concentration prediction method based on time-space deep learning
Muñoz et al. Comparison of statistical methods commonly used in predictive modelling
CN113313303A (en) Urban area road network traffic flow prediction method and system based on hybrid deep learning model
WO2022142042A1 (en) Abnormal data detection method and apparatus, computer device and storage medium
WO2022077767A1 (en) Traffic flow prediction method and apparatus, computer device, and readable storage medium
JP2012194967A5 (en)
CN111047078B (en) Traffic characteristic prediction method, system and storage medium
CN112419710A (en) Traffic congestion data prediction method, traffic congestion data prediction device, computer equipment and storage medium
Corchado et al. A topology-preserving system for environmental models forecasting
CN111368887B (en) Training method of thunderstorm weather prediction model and thunderstorm weather prediction method
US11620683B2 (en) Utilizing machine-learning models to create target audiences with customized auto-tunable reach and accuracy
Biard et al. Automated detection of weather fronts using a deep learning neural network
US20230140289A1 (en) Traffic accident prediction systems and methods
CN109460509B (en) User interest point evaluation method, device, computer equipment and storage medium
CN116341841B (en) Runoff forecast error correction method, apparatus, device, medium and program product
US20200050218A1 (en) Computer-implemented method and system for evaluating uncertainty in trajectory prediction
CN106168976A (en) A kind of specific user's method for digging based on NB Algorithm and system
Rahman et al. A deep learning approach for network-wide dynamic traffic prediction during hurricane evacuation
KR20140111822A (en) Error correction method for global climate model using non-stationary quantile mapping
Zjavka Multi-site post-processing of numerical forecasts using a polynomial network substitution for the general differential equation based on operational calculus
CN117975710A (en) Traffic flow prediction method, device, equipment and storage medium
CN113326877A (en) Model training method, data processing method, device, apparatus, storage medium, and program
CN106874286B (en) Method and device for screening user characteristics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination