CN113537626B

CN113537626B - Method for predicting neural network combined time sequence by aggregating information difference

Info

Publication number: CN113537626B
Application number: CN202110886769.8A
Authority: CN
Inventors: 高超; 刘浩; 李向华; 王震; 朱培灿; 李学龙
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2021-08-03
Filing date: 2021-08-03
Publication date: 2023-05-05
Anticipated expiration: 2041-08-03
Also published as: CN113537626A

Abstract

The invention discloses a time series prediction method combined with a neural network that aggregates information differences. It belongs to the fields of deep learning and intelligent transportation and captures the temporal and spatial features of subway passenger flow data simultaneously. Through the gating mechanism of the Gated Recurrent Unit model, the method captures the passenger flow information at the current moment while preserving the trend of the historical information as far as possible, so that the temporal dependence among passenger flow observations is fully considered. Tests on the real Shanghai subway network show that the method effectively counteracts the influence that the smoothing filter defined in the Fourier domain of the graph neural network has on the prediction result, thereby addressing the low prediction accuracy caused by over-smoothed predictions of peak passenger flow.

Description

Method for predicting neural network combined time sequence by aggregating information difference

Technical Field

The invention relates to the fields of deep learning and intelligent transportation, and in particular to a time series prediction method combined with a neural network that aggregates information differences.

Background

Artificial intelligence technology is developing rapidly, and the widespread deployment of inexpensive traffic sensors has produced explosive growth in traffic data, bringing about the era of traffic big data and intelligent transportation. An intelligent transportation system aims to establish a complete traffic information service and traffic management and control system that can relieve traffic congestion. For example, the recently popularized ETC uses automatic vehicle identification to charge vehicles automatically, which speeds up passage through expressway toll gates and effectively relieves holiday congestion. An intelligent transportation system can also monitor and predict road traffic in real time and plan the traffic state of the road network, further improving traffic efficiency and reducing traffic accidents.

With rapid economic development, accelerating urban modernization and steady population growth, urban traffic congestion is becoming increasingly serious. Subways are an important component of urban rail transit, play a major role in relieving traffic congestion, and are developing rapidly worldwide. Subway passenger flow prediction is an important research direction in intelligent transportation systems: using historical passenger flow data to predict the future passenger flow of subway stations provides planning suggestions for subway operation planners and improves the operating efficiency of the subway network. As the number of subway lines in a city grows, the line structure becomes more complicated, and the safety and efficiency of subway operation face serious challenges. Passenger flow is one of the main factors affecting subway operation scheduling, and accurate passenger flow prediction is an important means of meeting these challenges. It helps managers formulate reasonable operation plans, relieves congestion and improves subway operating efficiency. The basic scale of a newly built subway station, the length of its platforms, the crowd evacuation capacity of the platforms and the like can also be determined from passenger flow predictions, so that hazards are eliminated at the source and safety problems at the station are avoided. Accurate passenger flow prediction can also help travelers plan an optimal route, avoid peak passenger flow, improve travel efficiency and ensure travel safety.

In addition, the passenger flow prediction method that combines a neural network aggregating information differences with time series prediction can be applied not only to subway station passenger flow but also to road traffic flow prediction. Accurate traffic flow prediction has important reference value for planning, designing, managing and controlling traffic facilities. For example, by predicting road traffic flow, a manager can relieve congestion on peak road sections by controlling signal lights or dispatching traffic police. It also provides reference information for future road design, for example how many lanes a road should have, where the number of lanes can be reduced, where it needs to be increased, and which roads should be one-way, so that a newly planned road relieves or even solves road congestion to the greatest possible extent.

Traditional traffic flow prediction methods fall into two major categories: statistical methods and traditional machine learning methods. Statistical methods, such as ARIMA and its variants, VAR and Kalman filtering, have a mathematical basis and strong interpretability. However, traffic data is highly nonlinear and dynamic and does not satisfy the linearity and stationarity assumptions of these methods, so they perform poorly in practice. Traditional machine learning methods, such as the support vector machine and K-nearest neighbors, can model nonlinear relationships and extract complex relationships in traffic data. In the era of big data, however, these methods require features to be extracted manually, which consumes a great deal of labor when the data set is large, and accuracy cannot be guaranteed.

Disclosure of Invention

In view of the above problems, the invention aims to provide a time series prediction method combined with a neural network that aggregates information differences, which improves the ability to capture the temporal and spatial features of traffic data, and in particular the ability to extract spatial features. In the original graph convolutional network model, because of the smoothing filter defined in the Fourier domain, the predicted peaks are smoother than the real peak data when the model is used to predict passenger flow, and the peak features are lost. Compared with a recurrent neural network model, the gated recurrent unit model has a simple structure, fewer parameters and a shorter training time. Finally, the method is applied to the Shanghai subway network and successfully predicts the passenger flow of each subway station, which is of great significance for relieving traffic congestion and improving subway operating efficiency, and provides reference information that helps travelers improve travel efficiency.

In order to achieve the purpose of the invention, the invention is realized by the following technical scheme: a method for combining neural networks aggregating information differences with time series prediction, comprising the steps of:

Step one: input the data. First input the adjacency matrix of the subway network $G = (V, E)$, then input the historical passenger flow data $X$ of each station. Passenger flow is regarded as an attribute of the station and can be expressed as $X \in \mathbb{R}^{N \times F}$;

Step two: enhance the weight of the node itself. Based on the method of maximizing the aggregated information difference, an identity matrix multiplied by $\alpha$ is added to the diagonal of the normalized Laplacian matrix, which increases the weight of the node itself when spatial features are extracted;

Step three: extract the spatial features from the historical passenger flow data of each station input in step one and the preprocessed adjacency matrix of the subway network;

Step four: compute the reset gate $r_t$. The reset gate $r_t$ in the Gated Recurrent Unit model mainly controls whether the value of the candidate state $c_t$ depends on the previous state $h_{t-1}$;

Step five: compute the update gate $u_t$. The update gate in the Gated Recurrent Unit model mainly controls how much information the current state $h_t$ retains from the previous state $h_{t-1}$ and how much information it accepts from the candidate state $c_t$;

Step six: compute the candidate state $c_t$. With the reset gate $r_t$ obtained in step four, the information of the candidate state $c_t$ can be computed;

Step seven: update the cell state $h_t$, and finally output the prediction result $[P_{t+1}, P_{t+2}, \ldots, P_{t+T}]$ through the fully connected layer, predicting the passenger flow data of the next $T$ time steps from the historical passenger flow data of $T$ time steps.

A further improvement is that: in step one, the subway network $G$ has $V = \{v_i \mid i \in [1, N]\}$ and $E = \{e_{ij} = (v_i, v_j) \mid i, j \in [1, N], i \neq j\}$, where $e_{ij} \in E$ takes the value 1 if there is an edge between $v_i$ and $v_j$. $V$ denotes the set of subway stations, $N$ the number of stations, $E$ the physical edges between stations, $F$ the number of node attribute features, and $X_t$ the passenger flow values of all stations at time $t$.
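The graph definition above can be made concrete with a small sketch. The station count, the edge list and the passenger counts below are made-up toy values, `build_adjacency` is a hypothetical helper name, and the treatment of edges as undirected is an assumption that the source does not state explicitly.

```python
import numpy as np

def build_adjacency(num_stations, edges):
    """Return the 0/1 adjacency matrix A of the station graph G = (V, E)."""
    A = np.zeros((num_stations, num_stations), dtype=float)
    for i, j in edges:
        if i == j:
            continue  # the definition requires i != j, so self-loops are skipped
        A[i, j] = 1.0
        A[j, i] = 1.0  # assumption: a physical connection is undirected
    return A

# Hypothetical toy network: four stations on one line, 0-1-2-3
edges = [(0, 1), (1, 2), (2, 3)]
A = build_adjacency(4, edges)

# Attribute matrix X with N = 4 rows (stations) and F = 3 columns (features).
# The values are invented passenger counts used only for illustration.
X = np.array([[120.0, 150.0,  90.0],
              [300.0, 340.0, 280.0],
              [210.0, 260.0, 190.0],
              [ 80.0,  95.0,  70.0]])

print(A)
print(X.shape)
```

A dense matrix is enough for a network with a few hundred stations; a sparse representation would only matter for much larger graphs.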

A further improvement is that: in step one, the normalized Laplacian matrix is computed from the input of step one as

$$\hat{A} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$$

where $\tilde{A} = A + I_N$ is the adjacency matrix $A$ plus the identity matrix $I_N$, and $\tilde{D}$ is the degree matrix of $\tilde{A}$.
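A minimal NumPy sketch of this normalization follows. It assumes the symmetric form $\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$ that is commonly used for graph convolutional networks and that matches the definitions of $\tilde{A}$ and $\tilde{D}$ given here; the function name and the three-station toy graph are illustrative only.

```python
import numpy as np

def normalized_adjacency(A):
    """Compute D~^{-1/2} (A + I_N) D~^{-1/2} for an adjacency matrix A."""
    N = A.shape[0]
    A_tilde = A + np.eye(N)            # adjacency matrix plus identity (self-loops)
    degree = A_tilde.sum(axis=1)       # diagonal entries of the degree matrix D~
    d_inv_sqrt = 1.0 / np.sqrt(degree) # safe: every degree is at least 1
    # Scaling rows and columns is equivalent to multiplying by the diagonal matrices
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

# Toy path graph with three stations: 0 - 1 - 2
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
A_hat = normalized_adjacency(A)
print(A_hat)
```

Because every node receives a self-loop, no degree is zero and the inverse square root is always defined.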

A further improvement is that: in step two, the weight of the node itself in spatial feature extraction is enhanced by adding the identity matrix multiplied by $\alpha$, expressed by the following formula

$$\hat{A}_{\alpha} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} + \alpha I_N$$

The final prediction result is best when $\alpha$ is 0.7. The degree matrix is a diagonal matrix whose diagonal elements are the degrees of the vertices, $I_N$ is the identity matrix, $\tilde{A} = A + I_N$ is the adjacency matrix $A$ plus the identity matrix $I_N$, $\tilde{D}$ is the degree matrix of $\tilde{A}$, and the coefficient $\alpha$ takes a value between 0 and 1.
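The effect of the $\alpha$ term can be checked numerically. The sketch below assumes the enhanced matrix is the symmetrically normalized adjacency plus $\alpha I_N$, as the description of step two suggests; `enhanced_adjacency` and `self_weight_share` are hypothetical helper names, and the toy graph is invented.

```python
import numpy as np

def enhanced_adjacency(A, alpha=0.7):
    """Normalized adjacency with alpha * I_N added on the diagonal."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is expected to lie between 0 and 1")
    N = A.shape[0]
    A_tilde = A + np.eye(N)
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return A_hat + alpha * np.eye(N)

def self_weight_share(M):
    """Fraction of each row's total aggregation weight that sits on the node itself."""
    return np.diag(M) / M.sum(axis=1)

A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
before = enhanced_adjacency(A, alpha=0.0)  # plain normalized adjacency
after = enhanced_adjacency(A, alpha=0.7)   # the value reported as best in the text

print(self_weight_share(before))
print(self_weight_share(after))
```

Only the diagonal changes, so each station's own passenger flow contributes a larger share of the aggregated signal while the neighbor weights stay the same.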

A further improvement is that: in step three, spatial features are extracted from the historical passenger flow data of each station and the preprocessed adjacency matrix of the subway network, expressed by the following formula

$$H^{(l+1)} = \sigma\left( \left( \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} + \alpha I_N \right) H^{(l)} W^{(l)} \right)$$

From the formula, $H^{(0)} = X$ when $l = 0$; $W^{(l)}$ denotes the learnable parameters of layer $l$, and $H^{(l+1)}$ denotes the output of layer $l$, which also serves as the input of the next layer. The two-layer neural network is expressed by the following formula

$$X' = \sigma\left( \hat{A}_{\alpha} \, \sigma\left( \hat{A}_{\alpha} X W^{(0)} \right) W^{(1)} \right), \qquad \hat{A}_{\alpha} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} + \alpha I_N$$

where $W^{(0)}$ is the learnable parameter matrix of the first layer, $W^{(1)}$ is the learnable parameter matrix of the second layer, $X'$ is the time series data with spatial features output by the two-layer graph neural network model, $H^{(l)}$ denotes the input of layer $l$ of the graph neural network (the output of the preceding layer), $\sigma(\cdot)$ denotes a nonlinear activation function, $\tilde{A} = A + I_N$ is the adjacency matrix $A$ plus the identity matrix $I_N$, $\tilde{D}$ is the degree matrix of $\tilde{A}$, and $X$ is the historical passenger flow data matrix of each station input in step one.
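The two-layer spatial feature extraction can be sketched as a plain forward pass. This is an illustration under stated assumptions, not the patented implementation: the propagation matrix is taken to be the normalized adjacency plus $\alpha I_N$, ReLU stands in for the unspecified activation $\sigma(\cdot)$, and the weights are random rather than trained. All function names and layer sizes are hypothetical.

```python
import numpy as np

def enhanced_adjacency(A, alpha=0.7):
    """Normalized adjacency D~^{-1/2} (A + I) D~^{-1/2} plus alpha * I."""
    N = A.shape[0]
    A_tilde = A + np.eye(N)
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :] + alpha * np.eye(N)

def relu(z):
    # Assumption: the source only says sigma is a nonlinear activation
    return np.maximum(z, 0.0)

def gcn_two_layer(X, A, W0, W1, alpha=0.7, sigma=relu):
    """Forward pass X' = sigma(A_alpha sigma(A_alpha X W0) W1)."""
    A_alpha = enhanced_adjacency(A, alpha)
    H1 = sigma(A_alpha @ X @ W0)       # first layer: H^(1)
    return sigma(A_alpha @ H1 @ W1)    # second layer: X'

rng = np.random.default_rng(0)
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
X = rng.random((4, 3))           # N = 4 stations, F = 3 features (toy values)
W0 = rng.normal(size=(3, 8))     # untrained first-layer weights
W1 = rng.normal(size=(8, 2))     # untrained second-layer weights
X_prime = gcn_two_layer(X, A, W0, W1)
print(X_prime.shape)
```

Each row of `X_prime` mixes a station's own features with those of its neighbors, which is the sense in which the output carries spatial information.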

A further improvement is that: in step four, the reset gate $r_t$ is computed by the following formula

$$r_t = \sigma(W_r [X'_t, h_{t-1}] + b_r)$$

where $W_r$ and $b_r$ are the parameter matrix and bias learned during training, the subscript $r$ indicates that they are the parameters and bias used to compute $r_t$, and $X'_t$ denotes the time series data with spatial features at time $t$ obtained from the graph neural network model.
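The bracket $[X'_t, h_{t-1}]$ denotes concatenation of the two inputs. A minimal sketch of the reset gate, assuming $\sigma$ is the logistic sigmoid (the usual choice for gates) and using a row-vector convention in which the concatenated input multiplies $W_r$ from the left; the shapes and random values are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reset_gate(x_t, h_prev, W_r, b_r):
    """r_t = sigmoid([X'_t, h_{t-1}] W_r + b_r), one row per station."""
    concat = np.concatenate([x_t, h_prev], axis=-1)  # [X'_t, h_{t-1}]
    return sigmoid(concat @ W_r + b_r)

rng = np.random.default_rng(0)
num_stations, d_in, d_hidden = 6, 3, 4               # hypothetical sizes
x_t = rng.normal(size=(num_stations, d_in))          # GCN output at time t
h_prev = rng.normal(size=(num_stations, d_hidden))   # previous state h_{t-1}
W_r = rng.normal(size=(d_in + d_hidden, d_hidden))
b_r = np.zeros(d_hidden)

r_t = reset_gate(x_t, h_prev, W_r, b_r)
print(r_t.shape)
```

Every entry of `r_t` lies strictly between 0 and 1, so multiplying it elementwise with $h_{t-1}$ scales how much of the previous state reaches the candidate state.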

A further improvement is that: in step five, the update gate $u_t$ is computed by the following formula

$$u_t = \sigma(W_u [X'_t, h_{t-1}] + b_u)$$

where $W_u$ and $b_u$ are the parameter matrix and bias learned during training, and the subscript $u$ indicates that they are the parameters and bias used to compute $u_t$.

A further improvement is that: in step six, the candidate state $c_t$ is computed from the reset gate $r_t$ by the following formula

$$c_t = \tanh(W_c [X'_t, (r_t * h_{t-1})] + b_c)$$

where $W_c$ and $b_c$ are the parameter matrix and bias learned during training, and the subscript $c$ indicates that they are the parameter matrix and bias used to compute $c_t$.

A further improvement is that: in step seven, the update gate $u_t$ of step five and the candidate state $c_t$ of step six control how the cell state $h_t$ at the current time relates to the cell state $h_{t-1}$ at the previous time, and the cell state is updated by the following formula

$$h_t = u_t * h_{t-1} + (1 - u_t) * c_t$$

where $u_t$ and $c_t$ are the results computed in step five and step six.
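Steps four to seven together form one gated recurrent unit update. The sketch below chains them and then maps the last state to $T$ outputs through a dense layer. It is an untrained illustration: the parameter dictionary, the layer sizes, the zero initial state and the idea of reading the prediction from the final state only are assumptions, not details confirmed by the source.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, p):
    """One update of the cell state following steps four to seven."""
    xh = np.concatenate([x_t, h_prev], axis=-1)          # [X'_t, h_{t-1}]
    r_t = sigmoid(xh @ p["W_r"] + p["b_r"])              # step four: reset gate
    u_t = sigmoid(xh @ p["W_u"] + p["b_u"])              # step five: update gate
    xrh = np.concatenate([x_t, r_t * h_prev], axis=-1)   # [X'_t, r_t * h_{t-1}]
    c_t = np.tanh(xrh @ p["W_c"] + p["b_c"])             # step six: candidate state
    return u_t * h_prev + (1.0 - u_t) * c_t              # step seven: new state h_t

def init_params(d_in, d_h, rng):
    """Hypothetical random initialization; real weights would be trained."""
    shape = (d_in + d_h, d_h)
    return {name: rng.normal(scale=0.1, size=shape) for name in ("W_r", "W_u", "W_c")} | {
        name: np.zeros(d_h) for name in ("b_r", "b_u", "b_c")}

def predict(x_seq, p, W_out, b_out):
    """Unroll over the sequence, then apply a fully connected output layer."""
    h = np.zeros((x_seq.shape[1], p["b_r"].shape[0]))    # assumed zero initial state
    for x_t in x_seq:
        h = gru_step(x_t, h, p)
    return h @ W_out + b_out                             # shape (N, T)

rng = np.random.default_rng(0)
params = init_params(d_in=3, d_h=4, rng=rng)
x_seq = rng.normal(size=(5, 6, 3))    # 5 time steps, 6 stations, 3 spatial features
W_out = rng.normal(size=(4, 2))       # maps the hidden state to T = 2 future steps
b_out = np.zeros(2)
preds = predict(x_seq, params, W_out, b_out)
print(preds.shape)
```

When $u_t$ is close to 1 the state is carried over almost unchanged, and when it is close to 0 the state is replaced by the candidate, which is how the unit balances historical and current passenger flow information.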

The beneficial effects of the invention are as follows. The invention combines a graph convolutional network model with enhanced self-node weights and a time series prediction model, so that it exploits both the temporal and the spatial features of traffic data. In addition, for spatial feature extraction, the graph neural network model is optimized by the method of maximizing the aggregated information difference, which improves the extraction of spatial features. The method is suited to the problem of predicting subway station passenger flow. In comparison tests against other methods on real-world networks, the results show that the scheme outperforms the other methods, achieves higher accuracy, and can effectively predict the passenger flow over a future period of time.

Drawings

FIG. 1 is a flow chart of the present invention.

FIG. 2 is a detailed illustration of the present invention.

FIG. 3 shows the prediction results of the present invention on the Shanghai subway network data.

FIG. 4 shows the scale of the real network dataset.

FIG. 5 compares the passenger flow prediction results of multiple methods on a real traffic network.

Detailed Description

The present invention will be further described in detail with reference to the following examples, which are only for the purpose of illustrating the invention and are not to be construed as limiting the scope of the invention.

Example 1

According to FIGS. 1-3, this embodiment provides a time series prediction method combined with a neural network that aggregates information differences, comprising the following steps:

Step four: compute the reset gate $r_t$. The reset gate $r_t$ in the Gated Recurrent Unit model mainly controls whether the value of the candidate state $c_t$ depends on the previous state $h_{t-1}$;

In step one, the subway network $G$ has $V = \{v_i \mid i \in [1, N]\}$ and $E = \{e_{ij} = (v_i, v_j) \mid i, j \in [1, N], i \neq j\}$, where $e_{ij} \in E$ takes the value 1 if there is an edge between $v_i$ and $v_j$. $V$ denotes the set of subway stations, $N$ the number of stations, $E$ the physical edges between stations, $F$ the number of node attribute features, and $X_t$ the passenger flow values of all stations at time $t$.

In step one, the normalized Laplacian matrix is computed from the input of step one as

$$\hat{A} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$$

where $\tilde{A} = A + I_N$ is the adjacency matrix $A$ plus the identity matrix $I_N$, and $\tilde{D}$ is the degree matrix of $\tilde{A}$.

In step two, the weight of the node itself in spatial feature extraction is enhanced by adding the identity matrix multiplied by $\alpha$, expressed by the following formula

$$\hat{A}_{\alpha} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} + \alpha I_N$$

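In step three, spatial features are extracted from the historical passenger flow data of each station and the preprocessed adjacency matrix of the subway network, expressed by the following formula

$$X' = \sigma\left( \hat{A}_{\alpha} \, \sigma\left( \hat{A}_{\alpha} X W^{(0)} \right) W^{(1)} \right), \qquad \hat{A}_{\alpha} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} + \alpha I_N$$

where $\tilde{A} = A + I_N$ is the adjacency matrix $A$ plus the identity matrix $I_N$, $\tilde{D}$ is the degree matrix of $\tilde{A}$, $W^{(0)}$ and $W^{(1)}$ are the learnable parameter matrices of the two layers, and $X$ is the historical passenger flow data matrix of each station input in step one.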

In step four, the reset gate $r_t$ is computed by the following formula

$$r_t = \sigma(W_r [X'_t, h_{t-1}] + b_r)$$

In step five, the update gate $u_t$ is computed by the following formula

$$u_t = \sigma(W_u [X'_t, h_{t-1}] + b_u)$$

In step six, the candidate state $c_t$ is computed from the reset gate $r_t$ by the following formula

$$c_t = \tanh(W_c [X'_t, (r_t * h_{t-1})] + b_c)$$

In step seven, the update gate $u_t$ of step five and the candidate state $c_t$ of step six control how the cell state $h_t$ at the current time relates to the cell state $h_{t-1}$ at the previous time, and the cell state is updated by the following formula

$$h_t = u_t * h_{t-1} + (1 - u_t) * c_t$$

In use, the EST-GCN model consists of two parts: a graph convolutional network with enhanced self-node weights and a gated recurrent unit. First, the subway network topology at different times and the passenger flow data of each station are extracted as the input data of the ES-GCN model. Then, based on the method of maximizing the aggregated information difference, the model resolves the influence that the smoothing filter defined in the Fourier domain of the original graph convolutional network has on the prediction result. That smoothing filter makes the predicted peaks smoother than the real peak data, so the prediction loses the peak features; resolving this problem allows the spatial features of the traffic data to be extracted better. The output of the graph convolutional network model with enhanced self-node weights is time series data carrying spatial feature information. To obtain the temporal features, this time series data is then fed into the gated recurrent unit model, which captures dynamic changes through the information passed between units. The gated recurrent unit model is driven by two gates: the reset gate $r_t$ mainly controls whether the value of the candidate state $c_t$ depends on the previous state $h_{t-1}$, and the update gate $u_t$ mainly controls how much information the current state $h_t$ retains from the previous state $h_{t-1}$ and how much it accepts from the candidate state $c_t$. Through these two gating mechanisms, the gated recurrent unit model captures the passenger flow information at the current moment while retaining as much historical passenger flow information as possible. Finally, the prediction result is obtained from the fully connected layer.

Example two

According to FIG. 3, this embodiment provides a time series prediction method combined with a neural network that aggregates information differences. Regions a to d of FIG. 3 compare the real and predicted passenger flow of each station at different time granularities, starting from 8:00 a.m. The visual results show that the real passenger flow of a small number of stations differs slightly from the prediction, but for more than about 80% of stations the real and predicted values essentially coincide at every time granularity. Regions e to h of FIG. 3 compare the real and predicted passenger flow of station $v_3$ from different starting times and at different time granularities. The visual results show that, at a time granularity of 15 min, the real and predicted passenger flow of station $v_3$ differ considerably at first, but the gap narrows over time, because the prediction ability of the model improves as the training data increases. Overall, because increasing the weight of the node itself on the diagonal of the adjacency matrix counteracts the influence that the smoothing filter in the Fourier domain of the graph convolutional neural network has on spatial feature extraction, the model predicts peaks well, can detect the beginning and end of peak periods, and produces predictions whose pattern follows the trend of the actual passenger flow at rail transit stations.

According to FIG. 4, this embodiment provides a time series prediction method combined with a neural network that aggregates information differences. FIG. 4 lists the real subway network data used for testing: the network contains 285 subway stations in total, the historical passenger flow data spans April 6, 2015 to April 26, 2015, 20 days in total, and the records are spaced 15 minutes apart.
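Passenger flow records sampled at a fixed interval are usually turned into supervised samples with a sliding window before training. The sketch below shows one way to do this on a toy array; the window lengths, the array contents and the helper name `make_windows` are assumptions, and the source does not describe its own preprocessing at this level of detail.

```python
import numpy as np

def make_windows(series, n_in, n_out):
    """Split a (time_steps, stations) array into input and target windows.

    Each sample uses n_in consecutive steps as history and the following
    n_out steps as the prediction target.
    """
    num_steps = series.shape[0]
    num_samples = num_steps - n_in - n_out + 1
    if num_samples <= 0:
        raise ValueError("series is too short for the requested window sizes")
    inputs = np.stack([series[s:s + n_in] for s in range(num_samples)])
    targets = np.stack([series[s + n_in:s + n_in + n_out] for s in range(num_samples)])
    return inputs, targets

# Toy data: 10 time steps for 3 stations (values are just a running counter)
series = np.arange(10 * 3, dtype=float).reshape(10, 3)
inputs, targets = make_windows(series, n_in=4, n_out=2)
print(inputs.shape, targets.shape)
```

With real data the first axis would hold the 15-minute records in chronological order and the second axis one column per station.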

According to FIG. 5, this embodiment provides a time series prediction method combined with a neural network that aggregates information differences. FIG. 5 compares the passenger flow prediction results of several comparison methods and the method of the invention on multiple real networks. Prediction quality is evaluated by mean absolute error, root mean square error, coefficient of determination, explained variance score and accuracy. The bolded item in each row marks the method that performs best on the data set of that row. The results show that the EST-GCN method of the invention predicts passenger flow better than the other methods.
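The five evaluation measures can be computed as in the sketch below. The first four follow their standard definitions. The source does not define "accuracy", so the version here, one minus the ratio of the Frobenius norm of the error to the Frobenius norm of the true values, is an assumption borrowed from common practice in traffic forecasting work. The sample arrays are invented.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return MAE, RMSE, R^2, explained variance and an assumed accuracy."""
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    explained_var = 1.0 - np.var(err) / np.var(y_true)
    # Assumption: accuracy = 1 - ||Y - Y_hat||_F / ||Y||_F (not stated in the source)
    accuracy = 1.0 - np.linalg.norm(err) / np.linalg.norm(y_true)
    return {"mae": mae, "rmse": rmse, "r2": r2,
            "explained_variance": explained_var, "accuracy": accuracy}

y_true = np.array([[1.0, 2.0], [3.0, 4.0]])
y_pred = y_true + 1.0   # every prediction is too high by exactly one passenger
scores = evaluate(y_true, y_pred)
print(scores)
```

In this toy case the explained variance is 1 while the coefficient of determination is only 0.2, because explained variance ignores a constant bias in the predictions and the coefficient of determination does not.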

The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and that the above embodiments and descriptions are merely illustrative of the principles of the present invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined in the appended claims. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims

1. A method for combining neural networks aggregating information differences with time series prediction, comprising the steps of:

Step one: input the data. First input the adjacency matrix of the subway network $G = (V, E)$, then input the historical passenger flow data $X$ of each station. Passenger flow is regarded as an attribute of the station and can be expressed as $X \in \mathbb{R}^{N \times F}$, where $X$ is a matrix of $N$ rows and $F$ columns, $N$ is the number of stations, and $F$ is the number of node attribute features;

wherein the subway network $G$ has $V = \{v_i \mid i \in [1, N]\}$ and $E = \{e_{ij} = (v_i, v_j) \mid i, j \in [1, N], i \neq j\}$, where $e_{ij} \in E$ takes the value 1 when there is an edge between $v_i$ and $v_j$, $V$ denotes the set of subway stations, $N$ the number of stations, and $E$ the physical edges between stations;

Step four: compute the reset gate $r_t$. The reset gate $r_t$ in the Gated Recurrent Unit model mainly controls whether the value of the candidate state $c_t$ depends on the previous state $h_{t-1}$;

Step five: compute the update gate $u_t$. The update gate in the Gated Recurrent Unit model mainly controls how much information the current state $h_t$ retains from the previous state $h_{t-1}$, that is, the update gate helps the model decide how much past information to pass to the future;

Step six: compute the candidate state $c_t$. With the reset gate $r_t$ obtained in step four, the information of the candidate state $c_t$ can be computed;

Step seven: update the cell state $h_t$, and finally output the prediction result $[P_{t+1}, P_{t+2}, \ldots, P_{t+T}]$ through the fully connected layer, predicting the passenger flow data of the next $T$ time steps from the historical passenger flow data of $T$ time steps, wherein $P_{t+1}$ denotes the predicted passenger flow data of the first time step and $P_{t+T}$ denotes the predicted passenger flow data of the $T$-th time step.

2. The method for combining neural networks aggregating information differences with time series prediction according to claim 1, wherein: in step one, the subway network $G$ has $V = \{v_i \mid i \in [1, N]\}$ and $E = \{e_{ij} = (v_i, v_j) \mid i, j \in [1, N], i \neq j\}$, where $e_{ij} \in E$ takes the value 1 if there is an edge between $v_i$ and $v_j$, $V$ denotes the set of subway stations, $N$ the number of stations, $E$ the physical edges between stations, $F$ the number of node attribute features, and $X_t$ the passenger flow values of all stations at time $t$.

3. The method for combining neural networks aggregating information differences with time series prediction according to claim 1, wherein: in step one, the normalized Laplacian matrix is computed from the input of step one as

$$\hat{A} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$$

where $\tilde{A} = A + I_N$ is the adjacency matrix $A$ plus the identity matrix $I_N$, and $\tilde{D}$ is the degree matrix of $\tilde{A}$.

4. The method for combining neural networks aggregating information differences with time series prediction according to claim 1, wherein: in step two, the weight of the node itself in spatial feature extraction is enhanced by adding the identity matrix multiplied by $\alpha$, expressed by the following formula

$$\hat{A}_{\alpha} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} + \alpha I_N$$

where $\tilde{A} = A + I_N$ is the adjacency matrix $A$ plus the identity matrix $I_N$, $\tilde{D}$ is the degree matrix of $\tilde{A}$, and the coefficient $\alpha$ takes a value between 0 and 1.

5. The method for combining neural networks aggregating information differences with time series prediction according to claim 1, wherein: in step three, spatial features are extracted from the historical passenger flow data of each station and the preprocessed adjacency matrix of the subway network, expressed by the following formula

$$X' = \sigma\left( \hat{A}_{\alpha} \, \sigma\left( \hat{A}_{\alpha} X W^{(0)} \right) W^{(1)} \right), \qquad \hat{A}_{\alpha} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} + \alpha I_N$$

where $\tilde{A} = A + I_N$ is the adjacency matrix $A$ plus the identity matrix $I_N$, and $\tilde{D}$ is the degree matrix of $\tilde{A}$.

6. The method for combining neural networks aggregating information differences with time series prediction as recited in claim 5, wherein: in step four, the reset gate $r_t$ is computed by the following formula

$$r_t = \sigma(W_r [X'_t, h_{t-1}] + b_r)$$

7. The method for combining neural networks aggregating information differences with time series prediction as recited in claim 5, wherein: in step five, the update gate $u_t$ is computed by the following formula

$$u_t = \sigma(W_u [X'_t, h_{t-1}] + b_u)$$

8. The method for combining neural networks aggregating information differences with time series prediction as recited in claim 5, wherein: in step six, the candidate state $c_t$ is computed from the reset gate $r_t$ by the following formula

$$c_t = \tanh(W_c [X'_t, (r_t * h_{t-1})] + b_c)$$

where $W_c$ and $b_c$ are the parameter matrix and bias learned during training, the subscript $c$ indicates that they are the parameter matrix and bias used to compute $c_t$, and $\tanh$ denotes the hyperbolic tangent activation function.

9. The method for combining neural networks aggregating information differences with time series prediction according to claim 1, wherein: in step seven, the update gate $u_t$ of step five and the candidate state $c_t$ of step six control how the cell state $h_t$ at the current time relates to the cell state $h_{t-1}$ at the previous time, and the cell state is updated by the following formula

$$h_t = u_t * h_{t-1} + (1 - u_t) * c_t$$

where $u_t$ and $c_t$ are the results computed in step five and step six.