CN113537626A

CN113537626A - Neural network combined time sequence prediction method for aggregating information difference

Info

Publication number: CN113537626A
Application number: CN202110886769.8A
Authority: CN
Inventors: 高超; 刘浩; 李向华; 王震; 朱培灿; 李学龙
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2021-08-03
Filing date: 2021-08-03
Publication date: 2021-10-22
Anticipated expiration: 2041-08-03
Also published as: CN113537626B

Abstract

The invention discloses a time-series prediction method that combines a neural network with aggregated information differences. It belongs to the fields of deep learning and intelligent transportation and captures the temporal and spatial characteristics of subway passenger flow data simultaneously. Through the gating mechanism of the Gated Recurrent Unit (GRU) model, the method captures the passenger flow information at the current moment while retaining the trend of historical information as far as possible, so that the temporal dependence among passenger flow observations is fully considered. The method was tested on a real traffic network, the Shanghai subway network. It effectively counteracts the influence on the prediction result of the smoothing filter that the graph neural network defines in the Fourier domain, and thereby addresses the low prediction accuracy that arises when predictions of peak passenger flow data are over-smoothed.

Description

Neural network combined time sequence prediction method for aggregating information difference

Technical Field

The invention relates to the fields of deep learning and intelligent transportation, and in particular to a time-series prediction method that combines a neural network with aggregated information differences.

Background

Artificial intelligence technology is developing rapidly. At the same time, the widespread deployment of inexpensive traffic sensors has caused traffic data to grow explosively, and society has entered the era of traffic big data and intelligent transportation. An intelligent transportation system aims to establish a complete traffic information service and traffic management and control system that can relieve traffic congestion. For example, the recently popularized ETC (electronic toll collection) uses automatic vehicle identification to collect tolls automatically, which speeds up passage through highway toll gates and effectively relieves holiday congestion. An intelligent transportation system can also monitor and predict road traffic in real time and plan the traffic state of the road network, which improves vehicle throughput and reduces traffic accidents.

With rapid economic development, the accelerating pace of urban modernization, and steady population growth, urban traffic congestion is becoming increasingly serious. Subways, as an important component of urban rail transit, play a major role in relieving congestion and are developing rapidly around the world. Within intelligent transportation systems, subway passenger flow prediction is an important research direction: predicting the future passenger flow of subway stations from historical passenger flow data gives subway operation planners a sound basis for planning and improves the operating efficiency of the subway network. As the number of subway lines in a city grows, the line structure becomes more complex, and the safety and efficiency of subway operation face severe challenges. Passenger flow is one of the main factors affecting subway operation scheduling, so accurate passenger flow prediction is an important means of meeting these challenges. It helps managers make reasonable operation plans, relieves congestion, and improves subway operating efficiency. Passenger flow forecasts can also be used to determine the basic scale of a newly built station, the platform length, the evacuation capacity of the platform, and so on, which eliminates hazards at the source and avoids safety problems at the station. Accurate passenger flow prediction also helps travelers plan an optimal route, avoid peak passenger flow, improve travel efficiency, and travel safely.

Accurate traffic flow prediction has important reference value for the planning, design, management, and control of traffic facilities. For example, by predicting road traffic flow, a manager can relieve congestion on peak road sections by controlling signal lights or deploying additional traffic police. It also provides reference information for future road design, such as how many lanes a road should have, which sections can have fewer lanes, which sections need more lanes, and which roads should be one-way, so that newly planned roads alleviate or even solve road congestion to the greatest possible extent.

Traditional traffic flow prediction methods fall into two main categories: statistical methods and traditional machine learning methods. Statistical methods include ARIMA and its variants, VAR, and Kalman filtering. These methods have a mathematical basis and strong interpretability; however, traffic data are highly nonlinear and dynamic and therefore violate the linearity and stationarity assumptions of these methods, so they perform poorly in practice. Traditional machine learning methods include support vector machines and K-nearest neighbors. These methods can model nonlinear relationships and extract complex relationships in traffic data, but in the era of big data they require features to be extracted manually; when the data set is huge this consumes a great deal of labor, and accuracy cannot be guaranteed. The invention therefore provides a time-series prediction method that combines a neural network with aggregated information differences, to solve these problems in the prior art.

Disclosure of Invention

In view of the above problems, the object of the invention is to provide a time-series prediction method that combines a neural network with aggregated information differences and improves the ability to capture the temporal and spatial features of traffic data, in particular the extraction of spatial features. In the original graph convolutional network model, because of the smoothing filter defined in the Fourier domain, predictions of peak passenger flow are smoother than the real peak data and lack the characteristics of the peak; the invention counteracts the influence of this filter on the prediction result. To obtain the temporal features, the invention uses the gated recurrent unit model: through its gating mechanism, the passenger flow information at the current moment is captured while the trend of historical passenger flow information is retained as far as possible, and the final prediction result is obtained by fusing this information. Compared with a recurrent neural network model, the gated recurrent unit model selected by the invention has a simple structure, fewer parameters, and a shorter training time. Finally, the method is applied to the Shanghai subway network and successfully predicts the passenger flow of each subway station, which is of great significance for relieving traffic congestion and improving subway operating efficiency, and provides travelers with reference information for improving travel efficiency.

To achieve this object, the invention adopts the following technical scheme: a time-series prediction method combining a neural network with aggregated information differences, comprising the following steps:

Step one: input the data. First input the adjacency matrix of the subway network $G = (V, E)$, then input the historical passenger flow data $X$ of each station. The passenger flow is regarded as an attribute of the station and is expressed as $X \in \mathbb{R}^{N \times F}$;

Step two: enhance the weight of each node itself. Based on the method of maximizing the aggregated information difference, after the Laplacian matrix is normalized, $\alpha$ times the identity matrix is added to the diagonal of the normalized Laplacian matrix, which increases the weight each node gives to itself during spatial feature extraction;

Step three: extract the spatial features from the historical passenger flow data of each station input in step one and the preprocessed adjacency matrix of the subway network;

Step four: calculate the reset gate $r_t$. The reset gate $r_t$ of the Gated Recurrent Unit model mainly controls whether the value of the candidate state $c_t$ depends on the previous state $h_{t-1}$;

Step five: calculate the update gate $u_t$. The update gate of the Gated Recurrent Unit model mainly controls how much information the current state $h_t$ keeps from the previous state $h_{t-1}$ and how much it receives from the candidate state $c_t$;

Step six: calculate the candidate state $c_t$. With the reset gate $r_t$ obtained in step four, the candidate state $c_t$ can be calculated;

Step seven: update the cell state $h_t$. Finally, the prediction result $[P_{t+1}, P_{t+2}, \ldots, P_{t+T}]$ is output through the fully connected layer, so that the historical passenger flow data of $T$ time steps is used to predict the passenger flow data of the next $T$ time steps.

A further improvement is that, in step one, the subway network $G$ has $V = \{v_i \mid i \in [1, N]\}$ and $E = \{e_{ij} = (v_i, v_j) \mid i, j \in [1, N],\ i \neq j\}$, where $e_{ij}$ takes the value 1 if there is an edge between $v_i$ and $v_j$. $V$ denotes the set of subway stations, $N$ the number of stations, $E$ the physical connecting edges between stations, $F$ the number of features of the node attributes, and $X_t$ the passenger flow values of all stations at time $t$.
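By way of illustration only, the inputs of step one can be sketched as follows. The station indices, edge list, and passenger flow numbers are made up, 0-based indices are used instead of the 1-based indices of the text, and the network is assumed to be undirected; none of this is taken from the Shanghai data set.

```python
def build_adjacency(num_stations, edges):
    """Return the N x N adjacency matrix A, with A[i][j] = 1 when stations
    i and j are physically connected (assumed undirected, no self-loops)."""
    A = [[0] * num_stations for _ in range(num_stations)]
    for i, j in edges:
        if i == j:
            continue  # the definition of E requires i != j
        A[i][j] = 1
        A[j][i] = 1
    return A

# Hypothetical 4-station network: a line 0-1-2-3 plus a link 1-3.
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
A = build_adjacency(4, edges)

# Attribute matrix X with N = 4 stations and F = 3 features
# (made-up passenger counts at three consecutive time steps).
X = [
    [120.0, 135.0, 150.0],
    [300.0, 320.0, 310.0],
    [80.0, 90.0, 85.0],
    [210.0, 205.0, 230.0],
]
```

Each row of `X` holds the attributes of one station, so `X` has the shape N x F described above.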

A further improvement is that, in step one, the Laplacian matrix is calculated from the input of step one as

$$\hat{L} = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}$$

In the formula, $\tilde{A} = A + I_N$ denotes the adjacency matrix $A$ plus the identity matrix $I_N$, and $\tilde{D}$ denotes the degree matrix of $\tilde{A}$.
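The normalization described here can be sketched in a few lines. This is a minimal illustration that assumes the symmetric normalization commonly used in graph convolutional networks (self-loops added, then scaling by the inverse square root of the degrees); the function name `normalize_adjacency` and the 3-station example are hypothetical.

```python
import math

def normalize_adjacency(A):
    """Add self-loops to A and scale entry (i, j) by 1 / sqrt(d_i * d_j),
    where d_i is the degree of node i in the self-loop graph."""
    n = len(A)
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    degrees = [sum(row) for row in A_tilde]  # always >= 1 because of the self-loop
    inv_sqrt = [1.0 / math.sqrt(d) for d in degrees]
    return [[inv_sqrt[i] * A_tilde[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]

# Hypothetical 3-station line 0-1-2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
L_hat = normalize_adjacency(A)
```

For this small line graph the degrees with self-loops are 2, 3, and 2, so the diagonal entries of `L_hat` are 1/2, 1/3, and 1/2.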

A further improvement is that, in step two, the weight each node gives to itself during spatial feature extraction is enhanced by adding $\alpha$ times the identity matrix, expressed by the following formula

$$\hat{A} = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2} + \alpha I_N$$

The final prediction result is optimal when $\alpha = 0.7$. The degree matrix is a diagonal matrix whose diagonal elements are the degrees of the vertices, $I_N$ is the identity matrix, $\tilde{A}$ denotes the adjacency matrix $A$ plus the identity matrix $I_N$, and $\tilde{D}$ denotes the degree matrix of $\tilde{A}$. The coefficient $\alpha$ ranges from 0 to 1.
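As a minimal sketch of this step, the following adds $\alpha$ to every diagonal entry of an already normalized matrix. The function name `enhance_self_weight` and the example matrix (the normalized form of a hypothetical 3-station line) are assumptions made for illustration; only the default $\alpha = 0.7$ and the range 0 to 1 come from the text.

```python
def enhance_self_weight(L_hat, alpha=0.7):
    """Return L_hat + alpha * I, which raises each node's own weight
    relative to the weights of its neighbours."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is expected to lie between 0 and 1")
    n = len(L_hat)
    return [[L_hat[i][j] + (alpha if i == j else 0.0) for j in range(n)]
            for i in range(n)]

s = 1.0 / 6.0 ** 0.5  # off-diagonal value for the 3-station line 0-1-2
L_hat = [[0.5, s, 0.0],
         [s, 1.0 / 3.0, s],
         [0.0, s, 0.5]]
A_hat = enhance_self_weight(L_hat, alpha=0.7)
```

Only the diagonal changes: neighbour weights stay as they were, so a station's own passenger flow contributes more to its aggregated feature than before.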

A further improvement is that, in step three, the spatial features are extracted from the historical passenger flow data of each station and the preprocessed adjacency matrix of the subway network, expressed by the following formula

$$H^{(l+1)} = \sigma\left(\hat{A}\,H^{(l)}\,W^{(l)}\right)$$

In this formula, $H^{(0)} = X$ when $l = 0$, $W^{(l)}$ denotes the learnable parameters of the $l$-th layer, and $H^{(l+1)}$ denotes the output of the $l$-th layer, which serves as the input of the next layer. The two-layer graph neural network is expressed by the following formula

$$X' = \sigma\left(\hat{A}\,\sigma\left(\hat{A}\,X\,W^{(0)}\right)W^{(1)}\right)$$

In these formulas, $\hat{A} = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2} + \alpha I_N$ is the preprocessed adjacency matrix, $W^{(0)}$ is the learnable parameter matrix of the first layer of the neural network, $W^{(1)}$ is the learnable parameter matrix of the second layer, $X'$ is the time-series data with spatial features output by the two-layer neural network model, $H^{(l)}$ denotes the feature matrix at layer $l$ of the graph neural network, $\sigma(\cdot)$ denotes the nonlinear activation function, $\tilde{A}$ denotes the adjacency matrix $A$ plus the identity matrix $I_N$, $\tilde{D}$ denotes the degree matrix of $\tilde{A}$, and $X$ denotes the historical passenger flow data matrix of each station input in step one.
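The layer-by-layer propagation can be illustrated with a small sketch. The text does not state which nonlinear activation is used, so ReLU is assumed here purely for illustration; the helper names and all numeric values are hypothetical.

```python
def matmul(P, Q):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def relu(v):
    return max(0.0, v)

def gcn_layer(A_hat, H, W, activation=relu):
    """One graph convolution layer: activation(A_hat @ H @ W)."""
    Z = matmul(matmul(A_hat, H), W)
    return [[activation(v) for v in row] for row in Z]

def two_layer_gcn(A_hat, X, W0, W1):
    """Stack two layers; the first layer's output feeds the second."""
    return gcn_layer(A_hat, gcn_layer(A_hat, X, W0), W1)

# Two stations, one input feature, two hidden units, one output feature.
A_hat = [[1.2, 0.5], [0.5, 1.2]]   # made-up preprocessed adjacency matrix
X = [[1.0], [2.0]]
W0 = [[1.0, -1.0]]
W1 = [[1.0], [1.0]]
X_prime = two_layer_gcn(A_hat, X, W0, W1)
```

With these values the result is approximately `[[4.09], [4.58]]`: each station's output mixes its own flow with that of its neighbour, weighted by the rows of `A_hat`.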

A further improvement is that, in step four, the reset gate $r_t$ is calculated by the following formula

$$r_t = \sigma\left(W_r\,[X'_t,\ h_{t-1}] + b_r\right)$$

In the formula, $W_r$ and $b_r$ are the parameter matrix and bias learned during training, the subscript $r$ indicating that they are the parameters and bias used to calculate $r_t$, and $X'_t$ denotes the time-series data with spatial features at time $t$ obtained from the graph neural network model.
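A gate of this form can be sketched for a single station as follows. The bracket is read as vector concatenation and $\sigma$ as the logistic sigmoid, both of which are assumptions consistent with the usual gated recurrent unit; the weights and inputs are made up.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gate(W, b, x_t, h_prev):
    """Compute sigmoid(W [x_t, h_prev] + b); W has one row per hidden unit."""
    z = x_t + h_prev  # list concatenation stands in for [X'_t, h_{t-1}]
    return [sigmoid(sum(w * v for w, v in zip(row, z)) + bias)
            for row, bias in zip(W, b)]

x_t = [0.5, -1.0]    # made-up spatial features X'_t of one station
h_prev = [0.2, 0.0]  # made-up previous state h_{t-1}
W_r = [[0.1, 0.2, 0.3, 0.4],
       [0.0, 0.0, 0.0, 0.0]]
b_r = [0.0, 0.0]
r_t = gate(W_r, b_r, x_t, h_prev)
```

Every component of `r_t` lies strictly between 0 and 1. The update gate of step five has the same form with its own parameters, so the same helper could be reused with a second weight matrix and bias.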

A further improvement is that, in step five, the update gate $u_t$ is calculated by the following formula

$$u_t = \sigma\left(W_u\,[X'_t,\ h_{t-1}] + b_u\right)$$

In the formula, $W_u$ and $b_u$ are the parameter matrix and bias learned during training, the subscript $u$ indicating that they are the parameters and bias used to calculate $u_t$.

A further improvement is that, in step six, the candidate state $c_t$ is calculated from the reset gate $r_t$ by the following formula

$$c_t = \tanh\left(W_c\,[X'_t,\ (r_t * h_{t-1})] + b_c\right)$$

In the formula, $W_c$ and $b_c$ are the parameter matrix and bias learned during training, the subscript $c$ indicating that they are the parameter matrix and bias used to calculate $c_t$.
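The role of the reset gate in this formula can be made concrete with a small sketch. The `*` is read as an element-wise product, which is an assumption based on the usual gated recurrent unit; the weights and inputs are made up.

```python
import math

def candidate_state(W_c, b_c, x_t, r_t, h_prev):
    """Compute tanh(W_c [x_t, r_t * h_prev] + b_c) for one station."""
    gated = [r * h for r, h in zip(r_t, h_prev)]  # element-wise r_t * h_{t-1}
    z = x_t + gated                               # concatenation
    return [math.tanh(sum(w * v for w, v in zip(row, z)) + bias)
            for row, bias in zip(W_c, b_c)]

x_t = [1.0]
h_prev = [2.0]
W_c = [[0.5, 0.5]]
b_c = [0.0]

c_closed = candidate_state(W_c, b_c, x_t, [0.0], h_prev)  # history ignored
c_open = candidate_state(W_c, b_c, x_t, [1.0], h_prev)    # history fully used
```

With the reset gate at 0 the candidate state depends only on the current input (`tanh(0.5)`), and with the gate at 1 the previous state contributes in full (`tanh(1.5)`).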

A further improvement is that, in step seven, the cell state $h_t$ at the current moment is controlled by the two gates from steps four and five and is related to the cell state at the previous moment; the cell state is updated by the following formula

$$h_t = u_t * h_{t-1} + (1 - u_t) * c_t$$

In the formula, $u_t$ and $c_t$ are the results calculated in steps five and six.
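The update rule is a per-component interpolation between the previous state and the candidate state, which the following sketch makes explicit. The values are made up, and the element-wise reading of `*` is an assumption.

```python
def update_state(u_t, h_prev, c_t):
    """Compute h_t = u_t * h_{t-1} + (1 - u_t) * c_t, element-wise."""
    return [u * h + (1.0 - u) * c for u, h, c in zip(u_t, h_prev, c_t)]

h_prev = [0.8, 0.8, 0.8]
c_t = [-0.4, -0.4, -0.4]
u_t = [1.0, 0.0, 0.25]  # keep everything, keep nothing, keep a quarter
h_t = update_state(u_t, h_prev, c_t)
```

Here `h_t` is approximately `[0.8, -0.4, -0.1]`: an update gate of 1 keeps the previous state unchanged, a gate of 0 replaces it with the candidate state, and intermediate values blend the two.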

The beneficial effects of the invention are as follows. By combining the graph convolutional network model with enhanced self-node weight and the time-series prediction model, the method exploits both the temporal and the spatial features of traffic data. For spatial feature extraction, the method optimizes the graph neural network model by maximizing the aggregated information difference, which improves the extraction of spatial features. The method is also suited to predicting subway station passenger flow: comparative tests of the scheme of the invention against other methods on a real-world network show that the scheme outperforms the comparison methods, so the method has higher accuracy and can effectively predict passenger flow over a future period.

Drawings

FIG. 1 is a flow chart of the present invention.

FIG. 2 is a detailed illustration of the present invention.

FIG. 3 shows the prediction results of the present invention on the Shanghai subway network data.

FIG. 4 shows the size of the real network data set.

FIG. 5 compares the passenger flow prediction performance of multiple methods on a real traffic network.

Detailed Description

For a further understanding of the invention, it is described in detail below with reference to examples, which serve only to explain the invention and are not to be construed as limiting its scope.

Example one

As shown in FIGS. 1 to 3, this embodiment provides a time-series prediction method combining a neural network with aggregated information differences, which includes the following steps:

In step one, the subway network $G$ has $V = \{v_i \mid i \in [1, N]\}$ and $E = \{e_{ij} = (v_i, v_j) \mid i, j \in [1, N],\ i \neq j\}$, where $e_{ij}$ takes the value 1 if there is an edge between $v_i$ and $v_j$. $V$ denotes the set of subway stations, $N$ the number of stations, $E$ the physical connecting edges between stations, $F$ the number of features of the node attributes, and $X_t$ the passenger flow values of all stations at time $t$.

In step one, the Laplacian matrix is calculated from the input of step one as

$$\hat{L} = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}$$

In the formula, $\tilde{A} = A + I_N$ denotes the adjacency matrix $A$ plus the identity matrix $I_N$, and $\tilde{D}$ denotes the degree matrix of $\tilde{A}$.

In step two, the weight each node gives to itself during spatial feature extraction is enhanced by adding $\alpha$ times the identity matrix, expressed by the following formula

$$\hat{A} = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2} + \alpha I_N$$

In step three, the spatial features are extracted from the historical passenger flow data of each station and the preprocessed adjacency matrix of the subway network, expressed by the following formula

$$H^{(l+1)} = \sigma\left(\hat{A}\,H^{(l)}\,W^{(l)}\right)$$

In the formula, $\hat{A} = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2} + \alpha I_N$ is the preprocessed adjacency matrix, $\tilde{A}$ denotes the adjacency matrix $A$ plus the identity matrix $I_N$, and $\tilde{D}$ denotes the degree matrix of $\tilde{A}$.

In step four, the reset gate $r_t$ is calculated by the following formula

$$r_t = \sigma\left(W_r\,[X'_t,\ h_{t-1}] + b_r\right)$$

In step five, the update gate $u_t$ is calculated by the following formula

$$u_t = \sigma\left(W_u\,[X'_t,\ h_{t-1}] + b_u\right)$$

In step six, the candidate state $c_t$ is calculated from the reset gate $r_t$ by the following formula

$$c_t = \tanh\left(W_c\,[X'_t,\ (r_t * h_{t-1})] + b_c\right)$$

In step seven, the cell state $h_t$ at the current moment is controlled by the two gates from steps four and five and is related to the cell state at the previous moment; the cell state is updated by the following formula

$$h_t = u_t * h_{t-1} + (1 - u_t) * c_t$$

When the EST-GCN model is used, that is, a graph convolutional network with enhanced self-node weight combined with a gated recurrent unit, the subway network topology at different moments and the passenger flow data of each station are first extracted as the input data of the EST-GCN model. Then, based on the method of maximizing the aggregated information difference, the model counteracts the influence on the prediction result of the smoothing filter defined in the Fourier domain in the original graph convolutional network. That smoothing filter makes the model's predictions of peak data smoother than the real peak data, so that the prediction lacks the characteristics of the peak; solving this problem allows the spatial features of traffic data to be extracted better. The output of the graph convolutional network model with enhanced self-node weight is time-series data carrying spatial feature information. To obtain the temporal features, this time-series data is then input into the gated recurrent unit model, which captures dynamic changes by passing information between units. The gated recurrent unit model uses two gate mechanisms: the reset gate $r_t$ mainly controls whether the value of the candidate state $c_t$ depends on the previous state $h_{t-1}$, and the update gate $u_t$ mainly controls how much information the current state $h_t$ keeps from the previous state $h_{t-1}$ and how much it receives from the candidate state $c_t$. Through these two gate mechanisms, the gated recurrent unit model captures the passenger flow information at the current moment while retaining as much historical passenger flow information as possible, and the final prediction result is obtained from the fully connected layer.
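The temporal half of this pipeline can be sketched for a single station as follows: a gated recurrent unit is run over a short sequence of spatial feature vectors, and a fully connected layer maps the final state to $T$ forecasts. This is an untrained sketch under stated assumptions: the weights are random, the initial state is zero, the input sequence is made up, and the names `gru_step`, `predict`, and `random_params` are hypothetical.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, b, z):
    """Compute W z + b for a weight matrix W (list of rows) and vector z."""
    return [sum(w * v for w, v in zip(row, z)) + bias for row, bias in zip(W, b)]

def gru_step(p, x_t, h_prev):
    """One gated recurrent unit update following steps four to seven."""
    z = x_t + h_prev                                  # [X'_t, h_{t-1}]
    r_t = [sigmoid(v) for v in affine(p["W_r"], p["b_r"], z)]
    u_t = [sigmoid(v) for v in affine(p["W_u"], p["b_u"], z)]
    z_c = x_t + [r * h for r, h in zip(r_t, h_prev)]  # [X'_t, r_t * h_{t-1}]
    c_t = [math.tanh(v) for v in affine(p["W_c"], p["b_c"], z_c)]
    return [u * h + (1.0 - u) * c for u, h, c in zip(u_t, h_prev, c_t)]

def predict(p, sequence, hidden_size):
    """Run the GRU over the sequence, then apply the fully connected layer."""
    h = [0.0] * hidden_size  # assumed zero initial state
    for x_t in sequence:
        h = gru_step(p, x_t, h)
    return affine(p["W_out"], p["b_out"], h), h

def random_params(input_size, hidden_size, horizon, seed=0):
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    p = {}
    for name in ("r", "u", "c"):
        p["W_" + name] = mat(hidden_size, input_size + hidden_size)
        p["b_" + name] = [0.0] * hidden_size
    p["W_out"] = mat(horizon, hidden_size)
    p["b_out"] = [0.0] * horizon
    return p

T = 4
sequence = [[0.1 * t, 0.2] for t in range(T)]  # made-up X'_t, two features each
params = random_params(input_size=2, hidden_size=3, horizon=T)
forecast, h_final = predict(params, sequence, hidden_size=3)
```

Because the weights are random, the values in `forecast` carry no meaning; the sketch only shows how $T$ historical steps flow through the gates into $T$ output values.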

Example two

As shown in FIG. 3, this embodiment provides a time-series prediction method combining a neural network with aggregated information differences. Regions a to d of FIG. 3 compare the real and predicted passenger flow of each station at different time granularities, starting from 8:00 a.m. The visualization shows that the actual and predicted passenger flows differ slightly at some stations, but at more than 80% of the stations the real passenger flow essentially agrees with the predicted passenger flow at every time granularity. Regions e to h of FIG. 3 compare the actual and predicted passenger flow of station $v_3$ at different time granularities, starting from different times. At the 15 min granularity, the actual and predicted passenger flow of station $v_3$ differ considerably at first, but the difference shrinks over time: as the training data increases, the prediction capability of the model improves. Overall, increasing the weight of each node on the diagonal of the adjacency matrix counteracts the influence of the Fourier-domain smoothing filter of the graph convolutional neural network on the spatial feature extraction result, so that the model also predicts peaks well, can detect the start and end of peak periods, and produces predictions whose pattern resembles the trend of the actual passenger flow at rail transit stations.

As shown in FIG. 4, this embodiment is tested on real subway network data. There are 285 subway stations in total; the historical passenger flow data spans April 6, 2015 to April 26, 2015, 20 days in total, with a sampling interval of 15 min. 80% of the data is used as the training set and the remaining 20% as the test set to verify the performance of the model.
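One way to prepare such data is sketched below: the series of one station is cut into (history, future) pairs, and the pairs are divided 80/20. The text does not say how the split was made, so a chronological split without shuffling is assumed here; the window lengths and the toy series are made up.

```python
def make_windows(series, history, horizon):
    """Slide over the series and pair each block of `history` past values
    with the `horizon` values that follow it."""
    samples = []
    for start in range(len(series) - history - horizon + 1):
        past = series[start:start + history]
        future = series[start + history:start + history + horizon]
        samples.append((past, future))
    return samples

def chronological_split(samples, train_ratio=0.8):
    """Keep the earliest samples for training and the latest for testing."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]

series = list(range(20))  # toy stand-in for one station's 15-min flow counts
samples = make_windows(series, history=4, horizon=2)
train, test = chronological_split(samples)
```

For this toy series there are 15 samples, of which the first 12 form the training set and the last 3 the test set.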

As shown in FIG. 5, this embodiment compares the passenger flow prediction performance of several methods on real networks. Prediction accuracy is evaluated by mean absolute error, root mean square error, coefficient of determination, and explained variance score. The bold entry in each row marks the method that performs best on the data set of that row. It can be seen that the EST-GCN method provided by the invention predicts passenger flow better than the other methods.
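The four evaluation measures can be computed as in the following sketch, which uses their standard definitions; the sample values are made up and are not results from the invention.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    return 1.0 - ss_res / (variance(y_true) * len(y_true))

def explained_variance(y_true, y_pred):
    """Explained variance score: 1 - Var(y - y_hat) / Var(y)."""
    errors = [a - b for a, b in zip(y_true, y_pred)]
    return 1.0 - variance(errors) / variance(y_true)

y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]
scores = {
    "MAE": mae(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
    "R2": r2_score(y_true, y_pred),
    "EVS": explained_variance(y_true, y_pred),
}
```

Lower values are better for the two error measures, while values closer to 1 are better for the coefficient of determination and the explained variance score.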

The foregoing illustrates and describes the principles, general features, and advantages of the present invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims

1. A method for aggregating neural networks of information differences in conjunction with time series predictions, comprising the steps of:

2. The method of claim 1, wherein, in step one, the subway network $G$ has $V = \{v_i \mid i \in [1, N]\}$ and $E = \{e_{ij} = (v_i, v_j) \mid i, j \in [1, N],\ i \neq j\}$, where $e_{ij}$ takes the value 1 if there is an edge between $v_i$ and $v_j$; $V$ denotes the set of subway stations, $N$ the number of stations, $E$ the physical connecting edges between stations, $F$ the number of features of the node attributes, and $X_t$ the passenger flow values of all stations at time $t$.

3. The method of claim 1, wherein, in step one, the Laplacian matrix is calculated from the input of step one as

$$\hat{L} = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}$$

In the formula, $\tilde{A} = A + I_N$ denotes the adjacency matrix $A$ plus the identity matrix $I_N$, and $\tilde{D}$ denotes the degree matrix of $\tilde{A}$.

4. The method of claim 1, wherein, in step two, the weight each node gives to itself during spatial feature extraction is enhanced by adding $\alpha$ times the identity matrix, expressed by the following formula

$$\hat{A} = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2} + \alpha I_N$$

In the formula, $\tilde{A}$ denotes the adjacency matrix $A$ plus the identity matrix $I_N$, and $\tilde{D}$ denotes the degree matrix of $\tilde{A}$. The coefficient $\alpha$ ranges from 0 to 1.

5. The method of claim 1, wherein, in step three, the spatial features are extracted from the historical passenger flow data of each station and the preprocessed adjacency matrix of the subway network, expressed by the following formula

$$H^{(l+1)} = \sigma\left(\hat{A}\,H^{(l)}\,W^{(l)}\right)$$

In the formula, $\hat{A} = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2} + \alpha I_N$ is the preprocessed adjacency matrix, $\tilde{A}$ denotes the adjacency matrix $A$ plus the identity matrix $I_N$, and $\tilde{D}$ denotes the degree matrix of $\tilde{A}$.

6. The method of claim 1, wherein, in step four, the reset gate $r_t$ is calculated by the following formula

$$r_t = \sigma\left(W_r\,[X'_t,\ h_{t-1}] + b_r\right)$$

7. The method of claim 1, wherein, in step five, the update gate $u_t$ is calculated by the following formula

$$u_t = \sigma\left(W_u\,[X'_t,\ h_{t-1}] + b_u\right)$$

8. The method of claim 1, wherein, in step six, the candidate state $c_t$ is calculated from the reset gate $r_t$ by the following formula

$$c_t = \tanh\left(W_c\,[X'_t,\ (r_t * h_{t-1})] + b_c\right)$$

9. The method of claim 1, wherein, in step seven, the cell state $h_t$ at the current moment is controlled by the two gates from steps four and five and is related to the cell state at the previous moment; the cell state is updated by the following formula

$$h_t = u_t * h_{t-1} + (1 - u_t) * c_t$$