CN112529299B

CN112529299B - Short-term traffic flow prediction method based on an ARIMA and LSTM hybrid neural network

Info

Publication number: CN112529299B
Application number: CN202011460307.1A
Authority: CN
Inventors: 王炜; 周伟; 华雪东; 秦韶阳
Original assignee: Southeast University
Current assignee: Southeast University
Priority date: 2020-12-11
Filing date: 2020-12-11
Publication date: 2022-11-18
Anticipated expiration: 2040-12-11
Also published as: CN112529299A

Abstract

The invention discloses a short-term traffic flow prediction method based on an ARIMA and LSTM hybrid neural network. The method comprises: collecting original traffic flow data for each period from an initial period to period t; determining the difference order d, the autoregressive order p and the moving-average order q of an ARIMA model to obtain an ARIMA(p, d, q) model with calibrated parameters; training the ARIMA(p, d, q) model with the original traffic flow sequence from the initial period to period t to obtain the preliminary predicted flow $\hat{x}_t$ for period t; extracting the AR(p) partial data and the MA(q) partial data from the trained ARIMA(p, d, q) model; training an ARIMA-LSTM hybrid neural network model with the preliminary predicted flow $\hat{x}_t$, the AR(p) partial data and the MA(q) partial data as input and the true traffic flow value of period t as output to obtain a prediction model; and acquiring prediction input data for a prediction period and inputting it into the prediction model to obtain predicted traffic flow data for the prediction period. The predicted traffic flow data obtained in this way has high accuracy.

Description

Short-term traffic flow prediction method based on an ARIMA and LSTM hybrid neural network

Technical Field

The invention relates to the technical field of intelligent transportation, and in particular to a short-term traffic flow prediction method based on an ARIMA and LSTM hybrid neural network.

Background

In today's transportation systems, intelligent transportation systems (ITS) and advanced traffic management systems (ATMS) play a vital role for individual travelers and government agencies. They provide timely and effective road information for traffic management and route guidance, and can markedly relieve traffic congestion. Traffic flow is the most direct and basic information for ITS and ATMS, and accurate traffic flow predictions can serve as an important basis for formulating traffic management measures. Many scholars treat traffic flow as a time series: the traffic flow at time t is determined by a number of observations before time t. In addition to the early ARIMA model, machine learning and deep learning models have been widely applied to short-term traffic flow prediction in recent years, including KNN, SVR, ANN and LSTM models. As prediction models have become more complex, their nonlinear learning capability has been enhanced and prediction accuracy has improved markedly. However, as the learning ability of a single model approaches its limit and prediction accuracy can no longer be improved significantly, combined prediction methods have received extensive attention and are currently considered an effective way to improve prediction results.

Disclosure of Invention

In view of the above problems, the invention provides a short-term traffic flow prediction method based on an ARIMA and LSTM hybrid neural network, which combines the advantages of the linear ARIMA model and the nonlinear LSTM model, captures both the linear and nonlinear characteristics of traffic flow data, and improves the accuracy of short-term traffic flow prediction.

To achieve the object of the invention, the short-term traffic flow prediction method based on an ARIMA and LSTM hybrid neural network comprises the following steps:

S10, collecting original traffic flow data for each period from an initial period to period t; the original traffic flow data comprises the road traffic flow $x_t$, lane occupancy, average road-section speed, average road-section travel time, or a road congestion index;

S20, performing stationarity detection on the traffic flow sequence using a unit root test, and determining the difference order d of the ARIMA model;

S30, determining the autoregressive order p and the moving-average order q of the ARIMA model by minimizing the Bayesian information criterion (BIC) to obtain an ARIMA(p, d, q) model with calibrated parameters; training the ARIMA(p, d, q) model with the original traffic flow sequence from the initial period to period t, and inputting the original traffic flow data $x_{t-1}$ of period t-1 into the trained ARIMA(p, d, q) model to make a preliminary prediction of the traffic flow and obtain the preliminary predicted flow $\hat{x}_t$ for period t;

S40, extracting the AR(p) partial data and the MA(q) partial data from the trained ARIMA(p, d, q) model; the AR(p) partial data represents the historical actual data from time t-p to time t-1 in the ARIMA(p, d, q) model, and the MA(q) partial data represents the historical nonlinear components from time t-q to time t-1 in the ARIMA(p, d, q) model;

S60, training an ARIMA-LSTM hybrid neural network model with the preliminary predicted flow $\hat{x}_t$ of period t, the AR(p) partial data and the MA(q) partial data as input and the true traffic flow value of period t as output, to obtain a prediction model;

S70, acquiring prediction input data for a prediction period, and inputting the prediction input data into the prediction model to obtain predicted traffic flow data for the prediction period.

In one embodiment, step S60 is preceded by:

S50, constructing an ARIMA-LSTM hybrid neural network model from the ARIMA(p, d, q) model and the LSTM neural network module.

Specifically, the ARIMA-LSTM hybrid neural network model comprises an input layer, an output layer, an LSTM layer, a fully connected layer, a batch normalization layer, a merging layer and a multi-output layer.

In one embodiment, the short-term traffic flow prediction method based on an ARIMA and LSTM hybrid neural network further includes:

S80, calculating a first evaluation parameter and a second evaluation parameter of the prediction model, and evaluating the prediction model according to the first evaluation parameter and the second evaluation parameter.

Specifically, the first evaluation parameter is calculated as:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

and the second evaluation parameter is calculated as:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100\%$$

where $y_i$ is the true traffic flow value of period i, $\hat{y}_i$ is the predicted traffic flow of period i, n is the total number of samples, MAE denotes the first evaluation parameter, and MAPE denotes the second evaluation parameter.

In one embodiment, obtaining prediction input data for a prediction period comprises:

inputting the original traffic flow data of period m into the trained ARIMA(p, d, q) model to obtain the preliminary predicted flow of period m, reading the AR(p) partial data and the MA(q) partial data corresponding to period m from the ARIMA(p, d, q) model, and determining the prediction input data from the preliminary predicted flow of period m and the corresponding AR(p) partial data and MA(q) partial data; wherein period m is the prediction period.

In one embodiment, the ARIMA(p, d, q) model comprises:

$$\nabla^d \hat{x}_t = \varphi_1 \nabla^d x_{t-1} + \cdots + \varphi_p \nabla^d x_{t-p} + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$

where $\hat{x}_t$ is the preliminary predicted flow of period t; $\nabla^d = (1-B)^d$ is the d-th order difference operator, with B the backward shift operator; $x_{t-1}$ is the original traffic flow data of period t-1; $x_{t-p}$ is the original traffic flow data of period t-p; $\varepsilon_{t-1}$ is the prediction error of period t-1; $\varepsilon_{t-q}$ is the prediction error of period t-q; and $\varphi_1, \ldots, \varphi_p$ and $\theta_1, \ldots, \theta_q$ are regression coefficients, which are determined by optimization during model training.

In the short-term traffic flow prediction method based on an ARIMA and LSTM hybrid neural network described above, original traffic flow data is collected for each period from the initial period to period t; a unit root test is used to check the stationarity of the traffic flow sequence and determine the difference order d of the ARIMA model; the autoregressive order p and the moving-average order q of the ARIMA model are determined by minimizing the Bayesian information criterion (BIC) to obtain an ARIMA(p, d, q) model with calibrated parameters; the ARIMA(p, d, q) model is trained with the original traffic flow sequence from the initial period to period t-1, and the original traffic flow data $x_{t-1}$ of period t-1 is input into the trained ARIMA(p, d, q) model to make a preliminary prediction of the traffic flow and obtain the preliminary predicted flow $\hat{x}_t$ for period t; the AR(p) partial data and the MA(q) partial data are extracted from the trained ARIMA(p, d, q) model; and the ARIMA-LSTM hybrid neural network model is trained with the preliminary predicted flow $\hat{x}_t$, the AR(p) partial data and the MA(q) partial data as input and the true traffic flow value of period t as output to obtain a prediction model. Prediction input data for the prediction period is then acquired and input into the prediction model to obtain predicted traffic flow data for the prediction period. The predicted traffic flow data obtained in this way has high accuracy: the method combines the advantages of the linear ARIMA model and the nonlinear LSTM model and captures both the linear and nonlinear characteristics of traffic flow data, thereby improving the accuracy of short-term traffic flow prediction.

Drawings

FIG. 1 is a flowchart of a short-term traffic flow prediction method based on an ARIMA and LSTM hybrid neural network according to an embodiment;

FIG. 2 is a schematic view of traffic flow time series data of an embodiment;

FIG. 3 is a diagram illustrating the difference order d calibration process in the ARIMA model according to an embodiment;

FIG. 4 is a schematic diagram of an ARIMA-LSTM hybrid neural network in accordance with one embodiment;

FIG. 5 is a schematic diagram of the structure of an LSTM neural unit in one embodiment;

FIG. 6 is a diagram of the MAE of the predicted results in one embodiment;

FIG. 7 is a MAPE graph of predicted results in one embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

Referring to FIG. 1, FIG. 1 is a flowchart of a short-term traffic flow prediction method based on an ARIMA and LSTM hybrid neural network according to an embodiment, which includes the following steps:

S10, collecting original traffic flow data for each period from an initial period to period t; the original traffic flow data comprises the road traffic flow $x_t$, lane occupancy, average road-section speed, average road-section travel time, or a road congestion index.

The original traffic flow data is a traffic flow time series reflecting the change of the traffic state along with time, and can be in any form of road traffic flow, lane occupancy, road section average speed, road section average travel time, road congestion index and the like. The raw traffic flow data may be partitioned into a training set and a test set for subsequent validation or testing of each model using the test set.

In the above steps, each period from the initial period to the t period has a corresponding duration (e.g., 5 minutes or 10 minutes).

S20, performing stationarity detection on the traffic flow sequence using a unit root test, and determining the difference order d of the ARIMA model.

The unit root test may be, for example, the ADF (augmented Dickey-Fuller) unit root test. Specifically, the ADF unit root test can be used to check the stationarity of the traffic flow sequence, and the difference order d of the ARIMA model is determined by successive testing and successive differencing.
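The successive testing and differencing procedure can be sketched as a short loop. This is only an illustration: the names `difference`, `choose_d` and `max_d` are hypothetical, and the stationarity test is passed in as a callable so that any unit root test can be plugged in (for example a wrapper around the p-value returned by `statsmodels.tsa.stattools.adfuller`). The `toy_is_constant` check used here is a stand-in for demonstration and is not an ADF test.

```python
from typing import Callable, List, Sequence, Tuple


def difference(series: Sequence[float]) -> List[float]:
    """First-order difference: x_t - x_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


def choose_d(
    series: Sequence[float],
    is_stationary: Callable[[Sequence[float]], bool],
    max_d: int = 2,
) -> Tuple[int, List[float]]:
    """Test, then difference, until the series passes or max_d is reached."""
    current = list(series)
    for d in range(max_d + 1):
        if is_stationary(current):
            return d, current
        if d < max_d:
            current = difference(current)
    raise ValueError(f"series still non-stationary after {max_d} differences")


def toy_is_constant(series: Sequence[float]) -> bool:
    """Stand-in test (NOT an ADF test): treats a constant series as stationary."""
    return max(series) - min(series) < 1e-9


# A linear trend becomes constant after one difference, so d = 1 here.
d, stationary_series = choose_d([1.0, 2.0, 3.0, 4.0, 5.0], toy_is_constant)
```

Passing the test in as a parameter keeps the loop independent of any particular statistics library.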

S30, determining the autoregressive order p and the moving-average order q of the ARIMA model by minimizing the Bayesian information criterion (BIC) to obtain an ARIMA(p, d, q) model with calibrated parameters; training the ARIMA(p, d, q) model with the original traffic flow sequence from the initial period to period t, and inputting the original traffic flow data $x_{t-1}$ of period t-1 into the trained ARIMA(p, d, q) model to make a preliminary prediction of the traffic flow and obtain the preliminary predicted flow $\hat{x}_t$ for period t.

In one embodiment, the above step may also calculate the prediction error $\varepsilon_t$, determine the preliminary predicted flow $\hat{x}_t$ of period t as the linear component of the traffic flow in period t, and determine the prediction error $\varepsilon_t$ of period t as the nonlinear component, where $\varepsilon_t = x_t - \hat{x}_t$.

In this embodiment, the BIC may be used to determine the other two parameters of the ARIMA model, namely the autoregressive order p and the moving-average order q, giving an ARIMA(p, d, q) model with calibrated parameters. The original traffic flow sequence $x_t$ is then used to train the ARIMA(p, d, q) model, the trained model is used to make a preliminary prediction $\hat{x}_t$ of the traffic flow, and the prediction error $\varepsilon_t$ is calculated, where $\varepsilon_t = x_t - \hat{x}_t$.

Specifically, after the preliminary prediction, the prediction result of the ARIMA(p, d, q) model is analyzed: the ARIMA(p, d, q) prediction $\hat{x}_t$ is regarded as the linear component of the traffic flow, and the prediction error $\varepsilon_t$ as the nonlinear component.

S40, extracting the AR(p) partial data and the MA(q) partial data from the trained ARIMA(p, d, q) model; the AR(p) partial data represents the historical actual traffic flow $x_{t-p}, x_{t-p+1}, \ldots, x_{t-1}$ from time t-p to time t-1 in the ARIMA(p, d, q) model, and the MA(q) partial data represents the historical nonlinear components $\varepsilon_{t-q}, \varepsilon_{t-q+1}, \ldots, \varepsilon_{t-1}$ from time t-q to time t-1 in the ARIMA(p, d, q) model.
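As an illustration of step S40, the three inputs for one period t can be sliced out of the observed series, the ARIMA residuals and the ARIMA predictions. This is a minimal sketch: the function name `hybrid_inputs` and the convention that all three sequences share the same zero-based period index are assumptions made here, and the numbers are made up for the example.

```python
from typing import List, Sequence, Tuple


def hybrid_inputs(
    x: Sequence[float],
    eps: Sequence[float],
    x_hat: Sequence[float],
    t: int,
    p: int,
    q: int,
) -> Tuple[List[float], List[float], float]:
    """Return (AR(p) part, MA(q) part, linear component) for period t."""
    if t < max(p, q) or t >= len(x_hat):
        raise IndexError("t must leave room for p and q lags and index x_hat")
    ar_part = list(x[t - p:t])      # x_{t-p}, ..., x_{t-1}
    ma_part = list(eps[t - q:t])    # eps_{t-q}, ..., eps_{t-1}
    return ar_part, ma_part, x_hat[t]


# Made-up numbers, only to show the slicing.
x = [10.0, 11.0, 12.0, 13.0, 14.0]
eps = [0.1, 0.2, 0.3, 0.4, 0.5]
x_hat = [9.9, 10.8, 11.7, 12.6, 13.5]
ar_part, ma_part, linear = hybrid_inputs(x, eps, x_hat, t=4, p=3, q=2)
```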

S60, training an ARIMA-LSTM hybrid neural network model with the preliminary predicted flow $\hat{x}_t$ of period t, the AR(p) partial data and the MA(q) partial data as input and the true traffic flow value of period t as output, to obtain a prediction model.

Specifically, training the ARIMA-LSTM hybrid neural network model essentially establishes a mapping from input to output, so that the ARIMA-LSTM model predicts the true value and yields the final traffic flow prediction $y_t$. The true traffic flow value of period t can be collected in the field; in some cases it coincides with the original traffic flow data of period t. The ARIMA-LSTM hybrid neural network model is obtained by combining the ARIMA(p, d, q) model with an LSTM neural network module, and it further processes the results of ARIMA(p, d, q). It has three input parts: the AR(p) part of ARIMA(p, d, q) (a time series with p steps, namely the historical actual data $x_{t-p}, x_{t-p+1}, \ldots, x_{t-1}$ from time t-p to time t-1), the MA(q) part (a time series with q steps, namely the historical nonlinear components $\varepsilon_{t-q}, \varepsilon_{t-q+1}, \ldots, \varepsilon_{t-1}$ from time t-q to time t-1), and the prediction of ARIMA(p, d, q) (namely the linear component $\hat{x}_t$ at time t). In the ARIMA-LSTM network (the ARIMA-LSTM hybrid neural network model), the time series of the AR(p) and MA(q) parts are processed by LSTM modules to mine their temporal features. The output of the ARIMA-LSTM network is the actual traffic flow at time t.

In one embodiment, the prediction model may be obtained by training the ARIMA model and the ARIMA-LSTM hybrid neural network model on the training set. After the prediction model is obtained, it can be tested on the test set to simulate the actual prediction process: input data is fed into the trained model in the set format to obtain an output (i.e., the prediction result), and the output is compared with the actual result to evaluate and analyze the model.

In this embodiment, a linear ARIMA model is first used to extract the linear component, and the residual is regarded as the nonlinear component; next, LSTM units capture the temporal features of the historical nonlinear part and of the historical observations, and these features are fed together with the current linear component into the same neural network, with a merging layer joining the three parts; finally, multiple output layers implement multi-step prediction of the traffic flow, which can improve the accuracy of the prediction result.

In the short-term traffic flow prediction method based on an ARIMA and LSTM hybrid neural network described above, original traffic flow data is collected for each period from the initial period to period t; a unit root test is used to check the stationarity of the traffic flow sequence and determine the difference order d of the ARIMA model; the autoregressive order p and the moving-average order q of the ARIMA model are determined to obtain an ARIMA(p, d, q) model with calibrated parameters; the ARIMA(p, d, q) model is trained with the original traffic flow sequence from the initial period to period t, and the original traffic flow data $x_{t-1}$ of period t-1 is input into the trained ARIMA(p, d, q) model to make a preliminary prediction of the traffic flow and obtain the preliminary predicted flow $\hat{x}_t$ for period t; the AR(p) partial data and the MA(q) partial data are extracted from the trained ARIMA(p, d, q) model; the ARIMA-LSTM hybrid neural network model is trained with the preliminary predicted flow $\hat{x}_t$, the AR(p) partial data and the MA(q) partial data as input and the true traffic flow value of period t as output to obtain a prediction model; and prediction input data for the prediction period is acquired and input into the prediction model to obtain predicted traffic flow data for the prediction period.

In one embodiment, step S60 is preceded by:

S50, constructing an ARIMA-LSTM hybrid neural network model from the ARIMA(p, d, q) model and the LSTM neural network module.

Specifically, the ARIMA-LSTM hybrid neural network model comprises an input layer, an output layer, an LSTM (long short-term memory) layer, a fully connected (Dense) layer, a batch normalization (BN) layer, a merging layer and a multi-output layer.

In one example, the LSTM layer is derived from the LSTM neural network. The LSTM neural network is a recurrent neural network (RNN) widely used for time series modeling; it can capture the context dependence of input data and overcomes the vanishing gradient problem of conventional RNNs. LSTM can handle long-term dependencies in data and is widely used in time series analysis. In the ARIMA-LSTM hybrid neural network of the present invention, the LSTM layer is used to capture the temporal features of the linear and nonlinear parts. The LSTM layer is composed of a number of LSTM neural units. An LSTM neural unit controls the transfer of information through three gating structures: the input gate $i_t$, the forget gate $f_t$ and the output gate $o_t$. The input gate $i_t$ controls newly input data, the forget gate $f_t$ controls the information that should be discarded, and the output gate $o_t$ filters and controls the information to be output.

Let $x_t$ denote the input time series at time t. The states of the input gate $i_t$, the forget gate $f_t$ and the output gate $o_t$ are updated as follows:

$$i_t = \mathrm{sigmoid}(w_i \cdot [h_{t-1}, x_t] + b_i)$$

$$f_t = \mathrm{sigmoid}(w_f \cdot [h_{t-1}, x_t] + b_f)$$

$$o_t = \mathrm{sigmoid}(w_o \cdot [h_{t-1}, x_t] + b_o)$$

The memory state $c_t$ and the hidden state $h_t$ of the LSTM neural unit are updated as follows:

$$\tilde{c}_t = \tanh(w_c \cdot [h_{t-1}, x_t] + b_c)$$

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$

$$h_t = o_t \odot \tanh(c_t)$$

where the operator $\odot$ is the Hadamard product of vectors; $w_i$, $w_f$, $w_o$ and $w_c$ are the connection weights; $b_i$, $b_f$, $b_o$ and $b_c$ are the biases; and sigmoid(·) and tanh(·) are nonlinear activation functions that map the input data to the ranges [0, 1] and [-1, 1], respectively.
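To make the gate updates concrete, the sketch below applies one LSTM step to a scalar input and a scalar hidden state. It is an illustration under assumptions: the candidate state and the memory update follow the standard LSTM form, the dictionary layout of `w` and `b` is a hypothetical convention chosen for readability, and a practical implementation would use vectors and a deep learning library instead.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, w, b):
    """One LSTM update for scalar input and scalar hidden state.

    w[g] = (weight on h_{t-1}, weight on x_t) and b[g] is the bias,
    for g in {"i", "f", "o", "c"}.
    """
    def pre(g):
        # w_g . [h_{t-1}, x_t] + b_g
        return w[g][0] * h_prev + w[g][1] * x_t + b[g]

    i_t = sigmoid(pre("i"))              # input gate
    f_t = sigmoid(pre("f"))              # forget gate
    o_t = sigmoid(pre("o"))              # output gate
    c_tilde = math.tanh(pre("c"))        # candidate memory (standard form)
    c_t = f_t * c_prev + i_t * c_tilde   # memory state update
    h_t = o_t * math.tanh(c_t)           # hidden state
    return h_t, c_t


# With all weights and biases zero, every gate equals 0.5.
zero_w = {g: (0.0, 0.0) for g in "ifoc"}
zero_b = {g: 0.0 for g in "ifoc"}
h_t, c_t = lstm_step(x_t=3.0, h_prev=0.0, c_prev=1.0, w=zero_w, b=zero_b)
```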

The input layer of the ARIMA-LSTM hybrid neural network model comprises three parts: the preliminary predicted flow $\hat{x}_t$ of period t, the AR(p) partial data $x_{t-p}, x_{t-p+1}, \ldots, x_{t-1}$ and the MA(q) partial data $\varepsilon_{t-q}, \varepsilon_{t-q+1}, \ldots, \varepsilon_{t-1}$. The true traffic flow value $x_t$ of period t is used as the output. The LSTM layer is used to capture the temporal features of the AR(p) and MA(q) time series. The linear component $\hat{x}_t$ is activated by the Dense layer. The merging layer integrates the three parts of features into one and the same neural network. A BN layer is used after each layer to speed up convergence during neural network training. The multi-output layer controls the final output of the neural network and implements multi-step short-term traffic flow prediction.

The number of neurons in each layer of the neural network is calibrated by grid search and cross validation so that the model achieves the best possible prediction performance.

In one embodiment, the short-term traffic flow prediction method based on an ARIMA and LSTM hybrid neural network further includes:

S80, calculating a first evaluation parameter and a second evaluation parameter of the prediction model, and evaluating the prediction model according to the first evaluation parameter and the second evaluation parameter.

Specifically, the first evaluation parameter is calculated as:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

and the second evaluation parameter is calculated as:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100\%$$

where $y_i$ is the true traffic flow value of period i, $\hat{y}_i$ is the predicted traffic flow of period i, n is the total number of samples, MAE denotes the first evaluation parameter, and MAPE denotes the second evaluation parameter. The smaller the MAE and the MAPE, the smaller the prediction error of the model and the higher the prediction accuracy. In this case, evaluating the prediction model according to the first evaluation parameter and the second evaluation parameter comprises: if the first evaluation parameter and the second evaluation parameter are both smaller than the evaluation threshold, judging that the prediction performance of the prediction model reaches the prediction standard.
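The two evaluation parameters and the threshold check can be sketched in a few lines. The function names, the use of separate thresholds for MAE and MAPE, and the sample numbers are assumptions made for illustration; MAPE is expressed in percent and assumes that no true value is zero.

```python
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error in percent (true values must be nonzero)."""
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(y_true, y_pred)) / len(y_true)


def meets_standard(mae_value: float, mape_value: float,
                   mae_threshold: float, mape_threshold: float) -> bool:
    """Both evaluation parameters must be below their thresholds."""
    return mae_value < mae_threshold and mape_value < mape_threshold


# Made-up flows (vehicles per 5 minutes), only to exercise the formulas.
y_true = [100.0, 200.0]
y_pred = [110.0, 180.0]
mae_value = mae(y_true, y_pred)
mape_value = mape(y_true, y_pred)
```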

In one embodiment, obtaining prediction input data for a prediction period comprises:

inputting the original traffic flow data of period m into the trained ARIMA(p, d, q) model to obtain the preliminary predicted flow of period m, reading the AR(p) partial data and the MA(q) partial data corresponding to period m from the ARIMA(p, d, q) model, and determining the prediction input data from the preliminary predicted flow of period m and the corresponding AR(p) partial data and MA(q) partial data; wherein period m is the prediction period.

In one embodiment, the ARIMA(p, d, q) model may be built in advance. ARIMA modeling consists of three parts: autoregression (AR), integration (I) and moving average (MA). The model has three parameters p, d and q, which denote the autoregressive order, the difference order and the moving-average order, respectively. The ADF unit root test can be used to diagnose the stationarity of a time series. If the original time series does not pass the stationarity test, the non-stationary series can be converted into a stationary one by differencing. ARIMA(p, d, q) is formulated as:

$$\nabla^d \hat{x}_t = \varphi_1 \nabla^d x_{t-1} + \cdots + \varphi_p \nabla^d x_{t-p} + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$

where $\hat{x}_t$ is the preliminary predicted flow (the predicted value) of period t; $x_{t-1}$ is the original traffic flow data of period t-1; $x_{t-p}$ is the original traffic flow data of period t-p; $\varepsilon_{t-1}$ is the prediction error of period t-1; $\varepsilon_{t-q}$ is the prediction error of period t-q; $\varphi_1, \ldots, \varphi_p$ and $\theta_1, \ldots, \theta_q$ are regression coefficients, which are determined by optimization during model training; $\nabla^d = (1-B)^d$ is the d-th order difference operator, where B is the backward shift operator, i.e. $B(x_t) = x_{t-1}$; and $\varepsilon_t$ is the residual component, which follows a Gaussian distribution with mean 0 and variance $\sigma^2$.
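For the case d = 0, the ARIMA one-step prediction reduces to a weighted sum of past observations and past prediction errors, which can be sketched as follows. Several things here are assumptions for illustration only: the function name `arma_one_step`, the plus-sign convention on the moving-average terms, the absence of a constant term, and the coefficient values, which are made up rather than fitted.

```python
from typing import Sequence


def arma_one_step(
    x_hist: Sequence[float],
    eps_hist: Sequence[float],
    phi: Sequence[float],
    theta: Sequence[float],
) -> float:
    """Predict x_t from lags when d = 0.

    x_hist and eps_hist are ordered oldest to newest and end at period t-1.
    phi[i] multiplies x_{t-1-i}; theta[j] multiplies eps_{t-1-j}.
    """
    p, q = len(phi), len(theta)
    if len(x_hist) < p or len(eps_hist) < q:
        raise ValueError("not enough history for the requested orders")
    ar_term = sum(phi[i] * x_hist[-1 - i] for i in range(p))
    ma_term = sum(theta[j] * eps_hist[-1 - j] for j in range(q))
    return ar_term + ma_term


# Made-up coefficients: 0.5 * 20 + 0.2 * 10 + 0.1 * 2
x_hat_t = arma_one_step([10.0, 20.0], [2.0], phi=[0.5, 0.2], theta=[0.1])
```

The residual for the same period would then be the observed value minus `x_hat_t`, which is the quantity the method treats as the nonlinear component.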

In one example, where the difference order d has been determined, the autoregressive order p and the moving average order q are determined according to a minimum Bayesian Information Criterion (BIC) whose calculation formula is as follows:

$$\mathrm{BIC} = k\ln(n) - 2\ln(L),$$

where k represents the number of model parameters, n represents the total number of samples, and L is the maximum likelihood value.
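Order selection by minimum BIC can be sketched as follows. The helper names and the candidate table are hypothetical: in practice each entry would come from fitting an ARIMA model for that (p, q) pair and reading off its parameter count and maximized log-likelihood, and the numbers used here are invented purely to exercise the formula.

```python
import math
from typing import Dict, Tuple


def bic(k: int, n: int, log_likelihood: float) -> float:
    """BIC = k ln(n) - 2 ln(L), with ln(L) passed in directly."""
    return k * math.log(n) - 2.0 * log_likelihood


def select_orders(
    candidates: Dict[Tuple[int, int], Tuple[int, float]], n: int
) -> Tuple[int, int]:
    """candidates maps (p, q) -> (number of parameters k, maximized ln L)."""
    return min(
        candidates,
        key=lambda pq: bic(candidates[pq][0], n, candidates[pq][1]),
    )


# Invented fit results for three candidate orders.
fits = {(1, 1): (3, -520.0), (2, 1): (4, -510.0), (5, 9): (15, -505.0)}
best_p, best_q = select_orders(fits, n=1000)
```

The penalty term k ln(n) is what stops the larger model from winning on likelihood alone in this toy table.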

After ARIMA parameter calibration is completed, modeling and predicting are carried out by utilizing ARIMA (p, d, q) and original traffic flow data, so as to obtain predicted input data of each time interval.

In one embodiment, the traffic data set published by the Transportation Data Research Laboratory (TDRL) of the University of Minnesota Duluth is taken as an example, and detector station S778 on the I-94 freeway in Minneapolis-St. Paul, Minnesota, USA is selected. The raw data is aggregated into a traffic flow time series at 5-minute intervals, giving 288 samples per day. The data covers the working days from October 1 to October 26, 2018, a total of 20 days and 5,760 samples. Fig. 2 shows the traffic flow time series finally collected. The data of the first 3 weeks (October 1-19, 4,320 samples in total) is used as the training set for training the model, and the data of the last week (October 22-26, 1,440 samples in total) is used as the test set for evaluating the prediction results. To help ARIMA-LSTM select appropriate hyperparameters, grid search and cross validation are used, and the training set is further divided into a training subset (October 1-12, 2,880 samples in total) and a validation set (October 15-19, 1,440 samples in total).
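The sample counts follow directly from 288 five-minute samples per working day. The short check below assumes that the training subset covers October 1-12, the validation set October 15-19 and the test set October 22-26 (working days only); the dictionary keys are labels chosen for this illustration.

```python
SAMPLES_PER_DAY = 24 * 60 // 5  # one sample every 5 minutes

# Working days in each split (October 2018, Monday to Friday only).
WORKING_DAYS = {
    "training subset (Oct 1-12)": 10,
    "validation set (Oct 15-19)": 5,
    "test set (Oct 22-26)": 5,
}

counts = {name: days * SAMPLES_PER_DAY for name, days in WORKING_DAYS.items()}

# The training set is the training subset plus the validation set.
training_set = (
    counts["training subset (Oct 1-12)"] + counts["validation set (Oct 15-19)"]
)
total = sum(counts.values())
```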

The stationarity test of step S20 uses the ADF unit root test to check the stationarity of the traffic flow sequence. Fig. 3 shows the calibration process of the difference order d of the ARIMA model, which is determined by successive testing and successive differencing. In this example, the ADF unit root test results for the original traffic flow are shown in the following table:

It can be seen that the test statistic -5.821 is below the 1% critical value -3.433, and the p-value $4.183 \times 10^{-7}$ is less than 0.001. This indicates that the original traffic flow time series is already stationary and no differencing is required, so d = 0.

ARIMA modeling in this embodiment is as follows. ARIMA consists of three parts: autoregression (AR), integration (I) and moving average (MA). The model has three parameters p, d and q, which denote the autoregressive order, the difference order and the moving-average order, respectively. The ADF unit root test can be used to diagnose the stationarity of a time series. If the original time series does not pass the stationarity test, the non-stationary series can be converted into a stationary one by differencing. ARIMA(p, d, q) is formulated as:

$$\nabla^d \hat{x}_t = \varphi_1 \nabla^d x_{t-1} + \cdots + \varphi_p \nabla^d x_{t-p} + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$

where $\nabla = (1-B)$, with B the backward shift operator, i.e. $B(x_t) = x_{t-1}$; $\hat{x}_t$ is the predicted value; $\varepsilon_t$ is the residual component, which follows a Gaussian distribution with mean 0 and variance $\sigma^2$; and $\varphi_1, \ldots, \varphi_p$ and $\theta_1, \ldots, \theta_q$ are regression coefficients, which are determined by optimization during model training.

With the difference order d = 0 determined, the autoregressive order p and the moving-average order q are determined by minimizing the Bayesian information criterion (BIC), which is calculated as follows:

$$\mathrm{BIC} = k\ln(n) - 2\ln(L)$$

where k is the number of model parameters, n is the number of samples, and L is the maximized likelihood value.

In this embodiment, the search ranges of p and q are both set to [0,24], and p =5 and q =9 are finally determined based on BIC, so as to obtain an ARIMA (5,0,9) model. The traffic flow sequence is then modeled and predicted using ARIMA (5,0,9).

Modeling of the ARIMA-LSTM hybrid neural network model: the specific structure of the established ARIMA-LSTM hybrid neural network model is shown in FIG. 4 and comprises an input layer, an output layer, an LSTM layer, a fully connected (Dense) layer, a batch normalization (BN) layer, a merging layer and a multi-output layer.

The LSTM layer is derived from the LSTM neural network. The LSTM neural network is a recurrent neural network (RNN) widely used for time series modeling; it can capture the context dependence of input data and overcomes the vanishing gradient problem of conventional RNNs. LSTM can handle long-term dependencies in data and is widely used in time series analysis. In the ARIMA-LSTM hybrid neural network of the present invention, the LSTM layer is used to capture the temporal features of the linear and nonlinear parts. The LSTM layer is composed of a number of LSTM neural units. Fig. 5 shows the basic structure of an LSTM neural unit. An LSTM neural unit controls the transfer of information through three gating structures: the input gate $i_t$, the forget gate $f_t$ and the output gate $o_t$. The input gate $i_t$ controls newly input data, the forget gate $f_t$ controls the information that should be discarded, and the output gate $o_t$ filters and controls the information to be output.

Let $x_t$ denote the input time series at time t. The states of the input gate $i_t$, the forget gate $f_t$ and the output gate $o_t$ are updated as follows:

$$i_t = \mathrm{sigmoid}(w_i \cdot [h_{t-1}, x_t] + b_i)$$

$$f_t = \mathrm{sigmoid}(w_f \cdot [h_{t-1}, x_t] + b_f)$$

$$o_t = \mathrm{sigmoid}(w_o \cdot [h_{t-1}, x_t] + b_o)$$

The memory state $c_t$ and the hidden state $h_t$ of the LSTM neural unit are updated as follows:

$$\tilde{c}_t = \tanh(w_c \cdot [h_{t-1}, x_t] + b_c)$$

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$

$$h_t = o_t \odot \tanh(c_t)$$

where the operator $\odot$ is the Hadamard product of vectors; $w_i$, $w_f$, $w_o$ and $w_c$ are the connection weights; $b_i$, $b_f$, $b_o$ and $b_c$ are the biases; and sigmoid(·) and tanh(·) are nonlinear activation functions that map the input data to the ranges [0, 1] and [-1, 1], respectively.

FIG. 4 illustrates the architecture of the ARIMA-LSTM hybrid neural network. Its input layer comprises three parts: the historical actual time series $x_{t-p}, x_{t-p+1}, \ldots, x_{t-1}$; the historical residual sequence (nonlinear component) $\varepsilon_{t-q}, \varepsilon_{t-q+1}, \ldots, \varepsilon_{t-1}$ derived from ARIMA(p, d, q); and the preliminary prediction result (linear component) obtained from ARIMA(p, d, q).

The historical actual sequence and the historical residual sequence correspond to the AR(p) and MA(q) parts of ARIMA, respectively. They are time series data with p and q time steps, respectively, and their temporal features are captured by LSTM layers. The linear component is activated by the Dense layer. The merging layer integrates the features of the three parts into a single neural network. A BN layer is used after each layer to accelerate convergence during training. The multi-output layer controls the final output of the neural network and enables multi-step short-term traffic flow prediction. In this embodiment, the number of neural units in the output layer is set to 6, so that a single output yields traffic flow predictions for 6 steps.
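
To make the three-part input concrete, the following NumPy sketch assembles the historical window of p steps, the residual window of q steps, the preliminary ARIMA prediction, and a 6-step target for each period. It is only an illustration: the function name `build_inputs`, the array shapes, and the choice of the next 6 observed values as the training target are assumptions, since the text specifies only that the output layer has 6 units.

```python
import numpy as np

def build_inputs(x, eps, x_hat, p, q, horizon=6):
    """Assemble the three input parts and multi-step targets.

    x      : observed flow series, shape (T,)
    eps    : ARIMA residual series aligned with x, shape (T,)
    x_hat  : ARIMA preliminary prediction for each period, shape (T,)
    Returns hist (N, p, 1), resid (N, q, 1), linear (N, 1), target (N, horizon).
    """
    x, eps, x_hat = (np.asarray(a, dtype=float) for a in (x, eps, x_hat))
    start = max(p, q)                   # first t with full AR and MA windows
    stop = len(x) - horizon + 1         # last t with a full target window
    hist, resid, linear, target = [], [], [], []
    for t in range(start, stop):
        hist.append(x[t - p:t])         # x_{t-p}, ..., x_{t-1}
        resid.append(eps[t - q:t])      # eps_{t-q}, ..., eps_{t-1}
        linear.append(x_hat[t])         # preliminary prediction for period t
        target.append(x[t:t + horizon]) # observed flows for the next `horizon` steps
    return (np.array(hist)[..., None], np.array(resid)[..., None],
            np.array(linear)[:, None], np.array(target))

# Toy data: 20 periods, p = 3, q = 2.
x = np.arange(20.0)
eps = np.zeros(20)
x_hat = x + 0.5
hist, resid, linear, target = build_inputs(x, eps, x_hat, p=3, q=2)
```

The trailing axis of size 1 on `hist` and `resid` follows the common (samples, time steps, features) layout expected by recurrent layers.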

The number of neurons in each layer is calibrated by grid search with cross-validation so that the model achieves the best possible prediction performance. Specifically, the ARIMA-LSTM hybrid neural network is trained on the training subset, and the number of neurons in each layer is chosen to minimize the loss function on the validation data set. The search range for the number of units in the LSTM and Dense layers is 5 to 25 with a step of 5; the BN layer contains no neural units and needs no calibration. The hybrid neural network is optimized with the Adam algorithm using the mean square error (MSE) as the loss function. The learning rate, batch size and number of epochs are set to 0.001, 256 and 500, respectively.
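
The calibration procedure can be sketched as a plain grid search over the unit counts 5, 10, 15, 20 and 25. The sketch below is an assumption-laden outline: `grid_search` and the callback `evaluate(n_lstm, n_dense)` are hypothetical names, and the callback stands in for training the network and returning its validation MSE (possibly averaged over cross-validation folds).

```python
from itertools import product

def grid_search(evaluate, lstm_units=range(5, 30, 5), dense_units=range(5, 30, 5)):
    """Return the (lstm, dense) unit counts with the lowest validation loss.

    `evaluate(n_lstm, n_dense)` is a hypothetical callback that trains the
    network on the training subset and returns the validation loss.
    """
    best_cfg, best_loss = None, float("inf")
    for n_lstm, n_dense in product(lstm_units, dense_units):
        loss = evaluate(n_lstm, n_dense)
        if loss < best_loss:
            best_cfg, best_loss = (n_lstm, n_dense), loss
    return best_cfg, best_loss

# Stand-in for real training: a toy loss surface with its minimum at (15, 10).
def toy_evaluate(n_lstm, n_dense):
    return (n_lstm - 15) ** 2 + (n_dense - 10) ** 2

best_cfg, best_loss = grid_search(toy_evaluate)
```

With 5 candidate values per layer type the grid has 25 combinations, so each call to the real training routine would be repeated 25 times (times the number of folds).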

The prediction results are evaluated using MAE and MAPE, calculated as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left| y_i - \hat{y}_i \right|$$

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

where $y_i$ represents the real value of the traffic flow data for period $i$, $\hat{y}_i$ represents the predicted traffic flow data for period $i$, $n$ represents the total number of samples, MAE represents the first evaluation parameter, and MAPE represents the second evaluation parameter. The smaller the MAE and MAPE, the smaller the prediction error of the model and the higher the prediction accuracy.
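
The two evaluation parameters can be computed as in the short sketch below, which assumes the standard definitions of mean absolute error and mean absolute percentage error. The function names and the toy values are illustrative only, and the MAPE function assumes that no true value is zero.

```python
def mae(y_true, y_pred):
    """Mean absolute error between true and predicted values."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes no true value is zero."""
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(y_true, y_pred)) / len(y_true)

# Toy example with three periods (hypothetical flow counts).
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
first_parameter = mae(y_true, y_pred)
second_parameter = mape(y_true, y_pred)
```

Note that MAPE weights each error by the size of the true flow, so the same absolute error counts more in low-traffic periods than in peak periods.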

To illustrate the advantages of the ARIMA-LSTM hybrid neural network proposed by the present invention over models of the same type, four additional models were tested in this example: ARIMA-ANN, ARIMA, ANN and LSTM. The evaluation indices of the 5 models, averaged over the 6 prediction steps, are shown in the following table:

Metric	ARIMA-LSTM	ARIMA-ANN	ARIMA	LSTM	ANN
MAE	27.93	29.01	29.29	29.70	32.37
MAPE	13.27%	17.85%	19.12%	17.46%	26.63%

The table above reflects the overall performance of each model. The ARIMA-LSTM hybrid neural network model provided by the invention performs best among the 5 models. Its MAE is 27.93, which is 4.67% and 5.97% lower than that of the ARIMA and LSTM models, respectively, and 3.73% lower than that of the ARIMA-ANN model. Its MAPE is 13.27%, which in relative terms is 30.59% and 23.97% lower than that of the ARIMA and LSTM models, respectively, and 25.62% lower than that of the ARIMA-ANN model. This shows that the proposed ARIMA-LSTM hybrid model significantly improves the overall prediction accuracy.
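
The relative improvements quoted above can be checked against the table with a few lines of arithmetic. The sketch below recomputes them from the rounded table entries, using the tabulated MAPE of 13.27% for ARIMA-LSTM; the helper name `relative_reduction` is an assumption. The recomputed values agree with the quoted ones to within a few hundredths of a percentage point, a gap that presumably comes from rounding of the table entries.

```python
# Table values as reported (MAPE in percent).
mae = {"ARIMA-LSTM": 27.93, "ARIMA-ANN": 29.01, "ARIMA": 29.29, "LSTM": 29.70, "ANN": 32.37}
mape = {"ARIMA-LSTM": 13.27, "ARIMA-ANN": 17.85, "ARIMA": 19.12, "LSTM": 17.46, "ANN": 26.63}

def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline

reductions = {
    name: (relative_reduction(mae[name], mae["ARIMA-LSTM"]),
           relative_reduction(mape[name], mape["ARIMA-LSTM"]))
    for name in ("ARIMA", "LSTM", "ARIMA-ANN")
}
for name, (d_mae, d_mape) in reductions.items():
    print(f"{name}: MAE lower by {d_mae:.2f}%, MAPE lower by {d_mape:.2f}%")
```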

FIGS. 6 and 7 show the MAE and MAPE, respectively, of the 5 models over the 30-minute prediction horizon. Overall, multi-step prediction accumulates error, and the error grows as the number of prediction steps increases, but at different rates for different models. The error of the ARIMA-LSTM hybrid neural network model provided by the invention grows the slowest. In single-step prediction, all models except ANN perform about the same, with an MAE of at most about 20 and a MAPE of at most about 10%. At the last prediction step, the models differ markedly, particularly in MAPE (FIG. 7): the MAPE of ARIMA-LSTM is around 16%, whereas that of the other models is above 20%. These results show that the ARIMA-LSTM hybrid model is robust and reliable for multi-step traffic flow prediction.

The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this specification.

It should be noted that the terms "first", "second" and "third" used in the embodiments of the present application merely distinguish similar objects and do not imply a specific ordering of those objects. Where permitted, "first", "second" and "third" may be interchanged in a specific order or sequence, so that the embodiments of the application described herein can be implemented in an order other than that illustrated or described herein.

The terms "comprising" and "having" and any variations thereof in the embodiments of the present application are intended to cover non-exclusive inclusions. For example, a process, method, apparatus, product, or device that comprises a list of steps or modules is not limited to the listed steps or modules, but may alternatively include other steps or modules not listed or inherent to such process, method, product, or device.

The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of the present patent application shall be subject to the appended claims.

Claims

1. A short traffic flow prediction method based on an ARIMA and LSTM hybrid neural network is characterized by comprising the following steps:

S20, performing stationarity detection on the traffic flow sequence using a unit root test, and determining the difference order d in the ARIMA model;

S30, determining the autoregressive order p and the moving average order q of the ARIMA model according to the minimum Bayesian information criterion to obtain an ARIMA(p, d, q) model with calibrated parameters; training the ARIMA(p, d, q) model using the original traffic flow sequence from the initial period to period t, and inputting the original traffic flow data $x_{t-1}$ of period t-1 into the trained ARIMA(p, d, q) model to perform a preliminary prediction of the traffic flow, obtaining the preliminary prediction flow data for period t;

s60, preliminarily predicting stream data in t period

2. The short traffic flow prediction method based on ARIMA and LSTM hybrid neural network as claimed in claim 1, wherein step S60 is preceded by:

and S50, constructing an ARIMA-LSTM hybrid neural network model according to the ARIMA (p, d, q) model and the LSTM neural network module.

3. The short traffic flow prediction method based on ARIMA and LSTM hybrid neural network of claim 2, wherein the ARIMA-LSTM hybrid neural network model comprises an input layer, an output layer, an LSTM layer, a fully connected layer, a batch normalization layer, a merging layer, and a multi-output layer.

4. The ARIMA and LSTM hybrid neural network-based short traffic flow prediction method of claim 1, further comprising:

5. The ARIMA and LSTM hybrid neural network-based short traffic flow prediction method of claim 4, wherein the calculation formula of the first evaluation parameter comprises:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left| y_i - \hat{y}_i \right|$$

the calculation formula of the second evaluation parameter comprises:

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

wherein $y_i$ represents the real value of the traffic flow data for period $i$, $\hat{y}_i$ represents the predicted traffic flow data for period $i$, $n$ represents the total number of samples, MAE represents the first evaluation parameter, and MAPE represents the second evaluation parameter.

6. The ARIMA and LSTM hybrid neural network-based short traffic flow prediction method of claim 1, wherein obtaining prediction input data for a prediction period comprises:

inputting original traffic flow data of m time periods into a trained ARIMA (p, d, q) model to obtain preliminary prediction flow data of the m time periods, reading AR (p) partial data and MA (q) partial data respectively corresponding to the m time periods from the ARIMA (p, d, q) model, and determining prediction input data according to the preliminary prediction flow data of the m time periods and the AR (p) partial data and the MA (q) partial data respectively corresponding to the m time periods; wherein the m period is a prediction period.

7. The ARIMA and LSTM hybrid neural network-based short traffic flow prediction method of claim 6, wherein the ARIMA (p, d, q) model comprises:

wherein the formula defines the preliminary prediction flow data for period $t$ using a backward shift operator; $x_{t-1}$ represents the original traffic flow data for period $t-1$; $x_{t-p}$ represents the original traffic flow data for period $t-p$; $\varepsilon_{t-1}$ represents the prediction error for period $t-1$; $\varepsilon_{t-q}$ represents the prediction error for period $t-q$; and $\theta_0, \theta_1, \ldots, \theta_q$ all represent regression coefficients, determined by optimization during model training.