CN111709553B - Subway flow prediction method based on tensor GRU neural network - Google Patents

Subway flow prediction method based on tensor GRU neural network

Info

Publication number
CN111709553B
CN111709553B (application CN202010419486.8A)
Authority
CN
China
Prior art keywords
tensor
order
data
network
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010419486.8A
Other languages
Chinese (zh)
Other versions
CN111709553A (en)
Inventor
洪科伟
程雨夏
吴卿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202010419486.8A priority Critical patent/CN111709553B/en
Publication of CN111709553A publication Critical patent/CN111709553A/en
Application granted granted Critical
Publication of CN111709553B publication Critical patent/CN111709553B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00: Administration; Management
    • G06Q 10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06Q 50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q 50/40: Business processes related to the transportation industry


Abstract

The invention discloses a subway flow prediction method based on a tensor GRU neural network. The raw data are counted, filled according to the output time period, divided into a training set and a test set, and fed into the network day by day in date order. The current input X_t and the output of the previous moment, H_{t-1}, pass through the operations of the GRU update gate and reset gate to generate the output H_t at the current moment. The H_t of the last moment is used for loss calculation and back-propagation against the normalized label, and in the final epoch the de-normalized H_t is taken as the final output of the network. The invention enables the network to process higher-order data and obtain higher-order results, so that the network can grasp the structural characteristics in the predicted data.

Description

Subway flow prediction method based on tensor GRU neural network
Technical Field
The invention relates to a subway flow prediction method, in particular to a subway flow prediction method based on a tensor GRU neural network.
Background
Every city's traffic-monitoring system aims to acquire the flow of each region accurately and in real time, and then to analyze the traffic between regions from the available flow data, so that the traffic condition of the whole city can be grasped and the decision-making capability of the city's "brain" enriched. The data generated in a city at every moment are varied: crowd density, crowd flow rate, even the gender of individuals. For the scenario of subway monitoring, a method is proposed herein that predicts subway traffic using a tensor GRU network (the GRU being an evolution of the recurrent neural network, RNN).
Subway traffic is a comprehensive measure of the number of entries and exits generated per unit time at each subway station along each subway line in the city. To predict the trend of subway traffic over a period of time, other attributes of the traffic often need to be considered, such as the position of the subway station and the time period concerned. Such information is naturally structured data. Traditional methods usually split the data of each dimension for separate feature extraction and only fuse the features at the end, which directly destroys the structural relationships within the data. The present method instead turns the matrix operations of the neural network into full tensor operations, uses a time-series network to accept inputs from different time periods according to the characteristics of the subway data, and finally accounts for the positional relationship of each element through the tensor distance, so that the structure of the data is used directly. The higher-order idea is also extended to the output end: the output is no longer limited to vector-level judgments, and by computing the indices of the elements within the tensor in the tensor distance, the output end achieves higher-order expansion and grasps the tensor's structural characteristics, which is of significance for predicting different results under different future development trends.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a subway flow prediction method based on a tensor GRU neural network.
The invention aims to solve the technical problem of how to apply tensor GRU to solve the problem of predicting the subway traffic under the scene of a high-order data set.
In order to solve the problem, the invention is realized by the following technical scheme:
the subway flow prediction method based on tensor GRU comprises the following steps:
1) Take the card-swiping records of each subway station in each time period over a certain span as the data set; clean and screen the data set; and use subway station, subway line, and holiday type as the dimensions of the respective data orders to form a higher-order tensor.
2) Before input to the GRU neural network, normalize each input according to its order, ensuring that the label data and the input data share the same set of means and variances. Perform z-score normalization on the input data along the dimension of the network output to obtain a mean tensor M and a standard-deviation tensor St; an N-order tensor of shape K_1 × K_2 × … × K_N is expanded order by order.
3) Feed the data processed in step 2) into the GRU neural network one time step X at a time. In each tensor-GRU operation unit, the hidden-layer tensor state is obtained by taking the Einstein product of the weight tensor W with X. The Einstein product can be written as

(A *_N B)_{j_1…j_M} = Σ_{k_1=1..K_1} … Σ_{k_N=1..K_N} A_{k_1…k_N} · B_{k_1…k_N, j_1…j_M}

where A and B denote an N-order tensor and an (N+M)-order tensor respectively, and K_n is the dimension of the n-th order of the tensor. Multiplying the elements with matching indices and summing in order from k_1 to k_N reduces the N shared orders, so the result is an M-order tensor. Through the operations of the update gate and reset gate in the tensor GRU, the final hidden-layer state H is obtained, serving either as an output or as the input to the next time step. After the defined number of time steps has been computed, the resulting tensor H is the final hidden-layer state.
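As an illustration, the Einstein product described above can be sketched with NumPy's tensordot; the shapes below are illustrative and not taken from the patent:

```python
import numpy as np

def einstein_product(A, B, N):
    """Contract the first N modes of A against the first N modes of B.

    A: N-order tensor of shape (K_1, ..., K_N)
    B: (N+M)-order tensor of shape (K_1, ..., K_N, J_1, ..., J_M)
    Returns an M-order tensor of shape (J_1, ..., J_M).
    """
    # Multiply matching-index elements and sum over the N shared modes.
    return np.tensordot(A, B, axes=(list(range(N)), list(range(N))))

# Example: A is 2-order (3x4), B is 3-order (3x4x5) -> result is 1-order (5,)
A = np.random.rand(3, 4)
B = np.random.rand(3, 4, 5)
C = einstein_product(A, B, N=2)
assert C.shape == (5,)
```

The same contraction can be written as `np.einsum('kl,klm->m', A, B)`; tensordot is used here because it generalizes to any N without building a subscript string.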
4) The obtained output H is used for a loss calculation against the label, which was normalized with the same statistics before input. The tensor distance is used as the loss function in order to grasp the structural characteristics of the higher-order tensor:

d_TD(X, Y) = Σ_{l,m} g_{lm} (x_l - y_l)(x_m - y_m)

where l and m are the indices of elements in the two tensors, x_l, y_l, x_m, y_m are the element values at the corresponding indices, and the metric coefficient

g_{lm} = exp(-‖p_l - p_m‖_2^2 / (2σ^2))

depends on the Euclidean distance between the positions p_l and p_m of the corresponding indices. The back-propagation mechanism updates the weight tensor W_i at each moment, where i denotes a time step of the output.
repeating steps 3) to 4) for different epochs until the epochs are over and then executing the next step.
5) After the epochs end, de-normalize the network output of step 3) according to M and St, and take the effective dimension of the de-normalized tensor as the network's final predicted value, thereby obtaining the final flow prediction.
Compared with the prior art, the invention has the following effects:
1. A neural network originally based on vector and matrix operations can operate on tensor structures. Raising its dimensionality lets the subway-flow application take higher-order inputs and perform higher-order data fusion, with a computing capacity that vectors cannot match.
2. With the tensor distance as the loss function, structural characteristics can be learned during training; the flow output becomes a scene prediction capable of multiple modes, widening the dimensional boundary of network prediction or regression.
Drawings
FIG. 1 is a block diagram of the interior of a GRU network;
fig. 2 is an overall flow chart of the present invention.
Detailed Description
The invention is further described with reference to the drawings and detailed description which follow:
1) Acquire and count the card-swiping records of each subway station in each time period of the month. With ten minutes as the time unit, one day is divided into 144 units, and the number of passengers entering and exiting in each unit is counted. Flow data of each station from the 1st to the 26th of the month are collected, and a five-order tensor X is constructed according to the subway line the station belongs to and the holiday status of the date, with X ∈ R^{26×144×80×3×2}, where 26 is the number of days of data, 144 the number of time periods per day, 80 the number of subway stations, 3 the number of subway lines, and 2 the two cases of workday and non-workday.
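The five-order data layout above can be sketched as follows; the helper `record_flow` and all counts are hypothetical, for illustration only:

```python
import numpy as np

DAYS, PERIODS, STATIONS, LINES, DAY_TYPES = 26, 144, 80, 3, 2

# Five-order flow tensor X in R^{26 x 144 x 80 x 3 x 2}:
# days x ten-minute periods x stations x lines x (workday / non-workday).
X = np.zeros((DAYS, PERIODS, STATIONS, LINES, DAY_TYPES))

def record_flow(X, day, minute_of_day, station, line, is_workday, count):
    """Accumulate one station's entry/exit count into the matching cell."""
    period = minute_of_day // 10            # 144 ten-minute units per day
    day_type = 0 if is_workday else 1
    X[day, period, station, line, day_type] += count
    return X

X = record_flow(X, day=0, minute_of_day=487, station=12, line=1,
                is_workday=True, count=35)
assert X.shape == (26, 144, 80, 3, 2)
```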
2) Days 1-25 are taken as training data and day 26 as test data. A batch size of 144 is chosen for training, which helps the network learn the pattern of each time period in random order. When predicting a given time period, the flow of the 50 minutes before it and the 40 minutes after it, plus its own ten minutes, 100 minutes in total, is taken as input; that is, each GRU time step receives flow features covering 100 minutes, i.e. 10 numbers.
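The 100-minute input window can be sketched like this; the index arithmetic (5 ten-minute periods before, the period itself, 4 after) is my reading of the description:

```python
import numpy as np

def input_window(series, t):
    """For period t, take the 5 periods before it, the period itself, and
    the 4 periods after it: 50 + 10 + 40 = 100 minutes, i.e. 10 features."""
    return series[t - 5 : t + 5]

flow = np.arange(144.0)      # one day's 144 ten-minute periods, one station
w = input_window(flow, t=50)
assert w.shape == (10,)
assert w[5] == 50.0          # the period's own ten minutes sits at index 5
```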
3) In a single training pass, a batch of 144 time periods is selected, and the 25 days of data from day 1 to day 25 are expanded and fed into the GRU day by day, so that the input tensor at each GRU time step lies in R^{80×3×2×10}.
4) Before the network operation, normalize the input and the label. Perform z-score normalization on the input data along the dimension of the network output: if the output dimension is R^80, then the three dimensions R^{3×2×10} of the input data are reduced to give a mean tensor M and a standard-deviation tensor St in R^80, which also makes the inverse normalization of the output data convenient. The distribution of the labels themselves is not considered; the labels are normalized with the mean and standard deviation obtained from the input.
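One plausible implementation of this per-output-dimension z-score normalization and its inverse; the axis layout (80, 3, 2, 10) follows the text, while `keepdims` broadcasting and the `eps` guard are implementation choices:

```python
import numpy as np

def zscore_by_output_dim(X, output_axis, eps=1e-8):
    """z-score normalize, keeping statistics only along the output axis.

    For an input of shape (80, 3, 2, 10) with output dimension R^80, the
    mean tensor M and std tensor St are obtained by reducing over every
    axis except the output one (kept as size-1 axes for broadcasting).
    """
    reduce_axes = tuple(a for a in range(X.ndim) if a != output_axis)
    M = X.mean(axis=reduce_axes, keepdims=True)
    St = X.std(axis=reduce_axes, keepdims=True) + eps
    return (X - M) / St, M, St

def denormalize(Y, M, St):
    # Inverse normalization applied to the network output after training.
    return Y * St + M

X = np.random.rand(80, 3, 2, 10)
Xn, M, St = zscore_by_output_dim(X, output_axis=0)
assert np.allclose(denormalize(Xn, M, St), X)
```

Normalizing the labels with the input's M and St, as the text specifies, then only requires calling `denormalize` once on the final output.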
5) The normalized data are fed in sequence, along the 25-day time dimension, as the inputs to the GRU cells according to the formulas:

Z_t = σ(U^(Z) *_N X_t + W^(Z) *_M S_{t-1})

R_t = σ(U^(R) *_N X_t + W^(R) *_M S_{t-1})

H_t = tanh(U^(H) *_N X_t + W^(H) *_M (S_{t-1} ⊙ R_t))

S_t = (1 - Z_t) ⊙ H_t + Z_t ⊙ S_{t-1}

where U and W denote the weight tensors multiplied with the input tensor and with the hidden-layer tensor of the previous moment, respectively; Z_t is the update-gate tensor obtained after calculation; R_t is the reset-gate tensor; and H_t is the candidate hidden-layer tensor obtained after the reset gate acts on the previous time step's state S_{t-1}. The state of the current time step, S_t, is then the sum of the part updated from the previous moment according to the update gate, Z_t ⊙ S_{t-1}, and the part not participating in the update, (1 - Z_t) ⊙ H_t. In forward propagation, the 4-order input tensor at time t takes the Einstein product with a (4+n)-order weight tensor, where the hyperparameter n denotes the order of the hidden-layer tensor; the data pass through the update gate and the reset gate, and the retained-weight values computed by each gate determine which element values of the previous and current states are kept or eliminated.
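The four gate equations can be sketched as a single tensor-GRU step; the shapes, the initialization, and the choice n = 1 (hidden state in R^80) are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ein(A, B):
    """Einstein product: contract all modes of A with the leading modes of B."""
    return np.tensordot(A, B, axes=(list(range(A.ndim)), list(range(A.ndim))))

def tensor_gru_step(X_t, S_prev, params):
    """One tensor-GRU time step following the four update equations."""
    U_z, W_z, U_r, W_r, U_h, W_h = params
    Z = sigmoid(ein(X_t, U_z) + ein(S_prev, W_z))       # update gate Z_t
    R = sigmoid(ein(X_t, U_r) + ein(S_prev, W_r))       # reset gate R_t
    H = np.tanh(ein(X_t, U_h) + ein(S_prev * R, W_h))   # candidate H_t
    return (1.0 - Z) * H + Z * S_prev                   # new state S_t

# Illustrative shapes: 4-order input (80, 3, 2, 10), 1-order hidden state (80,)
rng = np.random.default_rng(0)
in_shape, hid = (80, 3, 2, 10), 80
params = [rng.normal(0, 0.01, in_shape + (hid,)) if i % 2 == 0
          else rng.normal(0, 0.01, (hid, hid)) for i in range(6)]
S = np.zeros(hid)
S = tensor_gru_step(rng.normal(size=in_shape), S, params)
assert S.shape == (80,)
```

A higher-order hidden state (n > 1) only changes the trailing shape of the U and W tensors; `ein` as written already handles the general contraction.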
6) After forward propagation yields the network output Y, a tensor-based loss is computed between the normalized label value and the output:

L = d_TD(Y, Label) = Σ_{l,m} g_{lm} (y_l - label_l)(y_m - label_m)

which is used mainly to update the weight tensors in the gates of the network: U^(Z), W^(Z), U^(R), W^(R), U^(H), W^(H).
7) Repeat steps 4) to 6) until the network converges; de-normalize the final output according to M and St to obtain the final flow prediction.

Claims (1)

1. A subway flow prediction method based on a tensor GRU neural network, characterized by comprising the following steps:
1) taking the card-swiping records of each subway station in each time period over a certain span as a data set, cleaning and screening the data set, and using subway station, subway line, and holiday type as the dimensions of the respective data orders to form a higher-order tensor;
2) before input to the GRU neural network, normalizing each input according to its order, while ensuring that the label data and the input data share the same set of means and variances; performing z-score normalization on the input data along the dimension of the network output to obtain a mean tensor M and a standard-deviation tensor St; expanding an N-order tensor of shape K_1 × K_2 × … × K_N order by order;
3) feeding the data processed in step 2) into the GRU neural network one time step X at a time; in each tensor-GRU operation unit, obtaining the hidden-layer tensor state by taking the Einstein product of the weight tensor W with X, the Einstein product being

(A *_N B)_{j_1…j_M} = Σ_{k_1=1..K_1} … Σ_{k_N=1..K_N} A_{k_1…k_N} · B_{k_1…k_N, j_1…j_M}

where A and B denote an N-order tensor and an (N+M)-order tensor respectively, and K_n is the dimension of the n-th order of the tensor; multiplying the elements with matching indices and summing in order from k_1 to k_N reduces the N shared orders, so that the result is an M-order tensor; obtaining, through the operations of the update gate and reset gate in the tensor GRU, the final hidden-layer state H as an output or as the input to the next time step; after the defined time steps have been computed, the resulting tensor H being the final hidden-layer state;
4) performing a loss calculation between the obtained output H and the label normalized with the same statistics before input, the tensor distance being used as the loss function in order to grasp the structural characteristics of the higher-order tensor:

d_TD(X, Y) = Σ_{l,m} g_{lm} (x_l - y_l)(x_m - y_m)

where l and m are the indices of elements in the two tensors, x_l, y_l, x_m, y_m are the element values at the corresponding indices, and the metric coefficient

g_{lm} = exp(-‖p_l - p_m‖_2^2 / (2σ^2))

depends on the Euclidean distance between the positions of the corresponding indices; updating the weight tensor W_i at each moment through the back-propagation mechanism, i denoting a time step of the output;
repeating steps 3) to 4) for different epochs until the epochs are over and then executing the next step;
5) After the end of epoch, the result of step 3) is obtained according toMAndStand carrying out inverse normalization on the output of the network, and obtaining the effective dimension in the tensor after inverse normalization as the final predicted result value of the network, thus obtaining the final flow predicted value.
CN202010419486.8A 2020-05-18 2020-05-18 Subway flow prediction method based on tensor GRU neural network Active CN111709553B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010419486.8A CN111709553B (en) 2020-05-18 2020-05-18 Subway flow prediction method based on tensor GRU neural network


Publications (2)

Publication Number  Publication Date
CN111709553A (en)   2020-09-25
CN111709553B (en)   2023-05-23

Family

ID=72538039

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010419486.8A Active CN111709553B (en) 2020-05-18 2020-05-18 Subway flow prediction method based on tensor GRU neural network

Country Status (1)

Country Link
CN (1) CN111709553B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117726183A (en) * 2024-02-07 2024-03-19 天津生联智慧科技发展有限公司 Gas operation data prediction method based on space high-order convolution

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109492814A (en) * 2018-11-15 2019-03-19 中国科学院深圳先进技术研究院 A kind of Forecast of Urban Traffic Flow prediction technique, system and electronic equipment
CN110310479A (en) * 2019-06-20 2019-10-08 云南大学 A kind of Forecast of Urban Traffic Flow forecasting system and method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019084254A1 (en) * 2017-10-27 2019-05-02 Wave Computing, Inc. Tensor manipulation within a neural network
CN107798385B (en) * 2017-12-08 2020-03-17 电子科技大学 Sparse connection method of recurrent neural network based on block tensor decomposition
GB2574224B (en) * 2018-05-31 2022-06-29 Vivacity Labs Ltd Traffic management system
CN109325585A (en) * 2018-10-10 2019-02-12 电子科技大学 The shot and long term memory network partially connected method decomposed based on tensor ring
CN111046740B (en) * 2019-11-17 2023-05-19 杭州电子科技大学 Classification method for human action video based on full tensor cyclic neural network


Also Published As

Publication number Publication date
CN111709553A (en) 2020-09-25


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant