CN111915073A - Short-term prediction method for intercity passenger flow of railway by considering date attribute and weather factor

Short-term prediction method for intercity passenger flow of railway by considering date attribute and weather factor

Info

Publication number
CN111915073A
Authority
CN
China
Prior art keywords
data
term
short
passenger flow
long
Prior art date
Legal status
Pending
Application number
CN202010718851.5A
Other languages
Chinese (zh)
Inventor
滕靖
李金洋
Current Assignee
Tongji University
Original Assignee
Tongji University
Priority date
Filing date
Publication date
Application filed by Tongji University
Publication of CN111915073A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Development Economics (AREA)
  • Evolutionary Biology (AREA)
  • Game Theory and Decision Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A short-term prediction method for railway intercity passenger flow that considers date attributes such as holidays and weather information, characterized in that an improved particle swarm optimization algorithm and a long short-term memory (LSTM) neural network model are applied in combination. The method comprises the following technical steps: first, preprocess the railway intercity passenger flow data and the historical data of the influencing factors, and convert them into a supervised-learning data set; second, train the LSTM neural network model with the processed data and optimize its hyper-parameters with the improved particle swarm algorithm; finally, input the historical data and the prediction-period influence factor data into the trained model to predict the railway intercity passenger flow.

Description

Short-term prediction method for intercity passenger flow of railway by considering date attribute and weather factor
Technical Field
The present invention relates to the field of computer and communication technology, and in particular to the short-term prediction of railway intercity passenger flow.
Background
In recent years, with the rapid construction of high-speed railways and the growing economic exchange within urban agglomerations, intercity passenger flow has kept expanding and has taken on new characteristics of high density and commuting. Grasping the state of intercity passenger demand scientifically and reasonably is therefore of great significance for railway operating enterprises in optimizing resource allocation, formulating pricing strategies, dynamically optimizing product structures and improving transport service quality.
Railway passenger flow forecasts can be divided into long-term, medium-term and short-term forecasts according to their time scale. Short-term forecasts provide daily passenger volume estimates that account for recent (e.g., weekly or monthly) changes in daily demand, and are among the most critical tasks in railway operation decision-making and dynamic operation adjustment; accurate short-term railway passenger flow prediction provides the basis for effective railway revenue management. However, short-term intercity railway passenger flow is affected by holidays, large-scale events, weather and other factors, so it exhibits high volatility and strong randomness, which makes accurate prediction difficult.
From the perspective of model class, short-term railway passenger flow prediction methods fall into three categories. The first is parametric models, including exponential smoothing, grey prediction and ARIMA models. These were applied early in passenger flow prediction but have obvious shortcomings: for example, a grey prediction model may produce large deviations when samples are sparse, and an ARIMA model cannot capture the nonlinear relations of a time series well. The second is non-parametric models, including support vector machines and neural network models. Neural network models are adaptive, nonlinear and able to approximate arbitrary mappings, and have been widely applied to short-term railway passenger flow prediction in recent years. However, neural networks also have inherent drawbacks, such as local minima, the choice of the number of hidden units and the risk of over-fitting. The third is combined models, which have attracted increasing attention from researchers because they perform better than single models and have achieved good results in various prediction scenarios.
In terms of the prediction scope, early passenger flow prediction models ignored date attributes, so their results could not explain abnormal fluctuations of passenger flow well. Later, researchers distinguished holiday from non-holiday dates and predicted short-term passenger flow separately for each, improving accuracy at the cost of model generality. In more recent research, a unified prediction model covering both holidays and non-holidays has been established, with the date attribute explicitly marked, so that the model remains general while still capturing holiday-specific behaviour. However, existing research does not characterize date attributes in sufficient detail; in fact, daily passenger flow fluctuations are related to the month, the day of the week, holidays and the dates adjacent to holidays, so the influence of date attributes deserves more detailed study.
Disclosure of Invention
The invention provides a short-term prediction method for railway intercity passenger flow based on an improved neural network model that takes date attributes and weather factors into account, so as to improve prediction accuracy, support the design of railway passenger transport products, and reduce costs while improving efficiency.
Technical scheme
A short-term prediction method for railway intercity passenger flow considering date attributes such as holidays and weather information, characterized in that an improved particle swarm optimization algorithm and a long short-term memory (LSTM) neural network model are applied in combination, comprising the following technical steps:
firstly, preprocessing the railway intercity passenger flow data and the historical data of the influencing factors, and converting them into a supervised-learning data set;
secondly, training the LSTM neural network model with the processed data, and optimizing its hyper-parameters with the improved particle swarm algorithm;
finally, inputting the historical data and the prediction-period influence factor data into the trained model to predict the railway intercity passenger flow.
The method specifically comprises the following steps:
Step 1, preprocessing the railway intercity passenger flow data and the historical data of the influencing factors, and converting the preprocessed data into a supervised-learning data set, comprising:
extracting date features of the historical data according to the month, day-of-week and holiday attributes, and collecting weather information of the cities, where the date attributes (such as holidays) and the weather information together form the influence factor data for passenger flow prediction;
on the basis of the collected passenger flow data and influence factor data, further encoding the raw data with One-hot encoding: an N-bit state register is used to encode N states, each state has its own independent register bit and only one bit is active at any time; the categorical values are first mapped to integer values, and each integer value is then represented as a binary vector in which the position indexed by the integer is 1 and all other positions are 0;
after the data have been encoded, converting them into a supervised-learning data set according to the input step size and output step size of the prediction model.
Step 2, establishing a long short-term memory (LSTM) neural network model and training it with the processed data, comprising:
constructing a long short-term memory neural network containing an input gate, a forget gate and an output gate; the three gates at time step t are denoted i_t, f_t and o_t, and the corresponding candidate long-term memory, updated long-term memory and working memory are denoted c̃_t, c_t and h_t:
Input gate: i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
Forget gate: f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
Output gate: o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
Candidate long-term memory: c̃_t = tanh(W_c · [h_{t-1}, x_t] + b_c)
Updated long-term memory: c_t = f_t · c_{t-1} + i_t · c̃_t
Working memory: h_t = o_t · tanh(c_t)
where W_i, W_f, W_o, W_c are weight matrices, b_i, b_f, b_o, b_c are the bias (threshold) terms of the respective functions, h_{t-1} is the working memory at time step t-1, x_t is the input at time step t, σ is the sigmoid activation function and tanh is the hyperbolic tangent activation function; "·" in W·[h_{t-1}, x_t] denotes a matrix-vector product, while "·" between two vectors (e.g., f_t · c_{t-1}) denotes an element-wise product.
Step 3, improving the traditional particle swarm optimization algorithm and applying it to optimize the learning rate, the number of hidden layers and the number of iterations of the LSTM neural network model, comprising:
firstly, initializing the parameters: determining the population size, the number of iterations, the learning factors and the bounded intervals of the positions;
secondly, initializing the positions and velocities of the particles, randomly generating the three hyper-parameters of the LSTM network;
thirdly, determining the evaluation function of the particles: the model fitness is the average of the prediction accuracy on the training samples and the test samples of the LSTM model;
fourthly, calculating the fitness of the position of each particle, determining the individual extremum and the population extremum from the initial fitness values, and taking the best position of each particle as its historical best position;
fifthly, in each iteration, updating the velocity and position of the particles as well as the inertia weight and learning factors according to the following formulas, and updating the individual extremum and the population extremum according to the fitness of the new population particles:
Velocity update: v_{t+1} = w · v_t + c_1 · r_1 · (pbest - x_t) + c_2 · r_2 · (gbest - x_t)
Position update: x_{t+1} = x_t + v_{t+1}
Inertia weight update: (the formula is given only as an image in the source; the inertia weight w decreases nonlinearly from w_max to w_min as the iteration count approaches its maximum, with the rate of decrease growing over the iterations)
Learning factor update: (the formula is given only as an image in the source; c_1 increases and c_2 decreases as the iteration count grows)
where t and t_max are the current and maximum iteration numbers, v_t and x_t are the velocity and position of the particle at iteration t, w is the inertia weight, c_1 and c_2 are the learning factors, r_1 and r_2 are random coefficients, pbest and gbest are the individual extremum and the population extremum, and w_max, w_min are the upper and lower bounds of the inertia weight;
sixthly, outputting the optimal result of the model once the maximum number of iterations of the particle swarm algorithm has been reached.
Step 4, inputting the historical data and the prediction-period influence factor data into the trained model to predict the railway intercity passenger flow, comprising:
on the basis of the optimal model, preprocessing the historical data of the input step size and the prediction-period influence factor data of the output step size and inputting them into the trained model to obtain the short-term intercity passenger flow prediction for the prediction period.
With the above technical scheme, the method, based on the improved particle swarm algorithm and the long short-term memory neural network model and taking date attributes such as holidays and weather factors into account, achieves effective prediction of short-term railway intercity passenger flow for any period of the year.
The invention is characterized by the following three aspects:
First, the influence of finely subdivided date attributes on passenger flow fluctuation is analyzed, and a general railway intercity passenger flow prediction method covering both holidays and non-holidays is provided.
Second, weather factors are rarely considered in existing prediction models; this method incorporates daily meteorological features into the influencing factors for short-term passenger flow prediction and analyzes them empirically.
Third, drawing on the continuous development of machine learning theory in recent years, a neural network model suited to time series analysis is adopted to improve prediction accuracy.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are examples of the present invention; for those skilled in the art, other drawings can be obtained from them without inventive effort.
FIG. 1 is a schematic illustration of the process flow of the present invention;
FIG. 2 is a flow chart of the overall implementation of the model based on the long short-term memory neural network and the improved particle swarm optimization in an embodiment of the present invention;
FIG. 3 shows the variation of the number of training iterations of the long short-term memory neural network model during particle swarm optimization in an embodiment of the present invention;
FIG. 4 shows the variation of the number of hidden layers of the long short-term memory neural network model during particle swarm optimization in an embodiment of the present invention;
FIG. 5 shows the variation of the learning rate of the long short-term memory neural network model during particle swarm optimization in an embodiment of the present invention;
FIG. 6 shows the variation of the overall best fitness of the population in the particle swarm algorithm in an embodiment of the present invention;
FIG. 7 is a schematic diagram of the passenger flow prediction result for the next 7 days in an embodiment of the present invention.
Detailed Description
The present invention is described below in connection with an exemplary communication system.
In one embodiment, as shown in FIG. 1, a method for short-term prediction of railway intercity passenger flow considering date attributes and weather factors comprises the following steps:
S101, preprocessing the railway intercity passenger flow data and the historical data of the influencing factors, and converting the preprocessed data into a supervised-learning data set;
S102, training the long short-term memory neural network model with the processed data, and optimizing its hyper-parameters with the improved particle swarm algorithm;
S103, inputting the historical data and the prediction-period influence factor data into the trained model to predict the railway intercity passenger flow.
1. Data pre-processing
Railway intercity passenger flow exhibits very pronounced temporal characteristics. First, short-term changes in passenger flow rest on a long-term evolutionary trend. Second, railway intercity passenger flow shows different regular patterns in different seasons and months of the year; for example, passenger volumes in August are slightly higher than in other months of the same year because of the summer holiday and weather that favours vacation travel. Third, railway intercity passenger flow varies markedly with a weekly cycle: in general, passenger volume peaks on Friday, reaches its trough on Monday, and is relatively stable on the other working days. Furthermore, railway intercity passenger flow is affected by holidays, typically peaking on the day before a holiday begins and on its last day. When making short-term predictions of railway intercity passenger flow, if the prediction model does not distinguish application scenarios in time, the date attribute must be included in the model as an influencing factor of passenger flow change. Based on the above analysis, the present invention extracts date attribute features along three dimensions: month, day of week and holiday.
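As an illustration of this three-dimensional date feature extraction, the following is a minimal sketch using pandas. It is not part of the original disclosure: the holiday lookup table HOLIDAY_CLASS, the column names and the example class codes are hypothetical placeholders, and the real class mapping follows Table 1 of the description, which is provided only as an image.

# Minimal sketch (assumption): extracting month, day-of-week and holiday-class
# features from a daily passenger-flow table; names and codes are illustrative.
import pandas as pd

# Hypothetical lookup of special dates to holiday-class codes.
HOLIDAY_CLASS = {"2018-10-01": 6, "2018-09-30": 7}
COMMON_DATE_CODE = 11  # assumed code for an ordinary date

def extract_date_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add integer-coded month (1-12), day-of-week (1-7) and holiday-class columns."""
    out = df.copy()
    dates = pd.to_datetime(out["date"])
    out["month"] = dates.dt.month                 # January=1 .. December=12
    out["weekday"] = dates.dt.dayofweek + 1       # Monday=1 .. Sunday=7
    out["holiday_class"] = [
        HOLIDAY_CLASS.get(d.strftime("%Y-%m-%d"), COMMON_DATE_CODE) for d in dates
    ]
    return out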
In addition, weather, as a key external factor, also influences railway travel to some extent, especially travel that is not strictly necessary. In general, fine weather induces a certain amount of discretionary travel; conversely, rain or snow reduces the volume of trips. Therefore, the invention also considers the influence of weather characteristics such as "sunny", "cloudy" and "rain" on railway intercity passenger flow.
Starting from the two aspects of date attributes and weather, the invention therefore comprehensively considers the influence of four factors (month, day of week, holiday and meteorological characteristics) on short-term railway intercity passenger flow, and effectively predicts future short-term passenger flow on the basis of historical passenger flow data.
First, the collected influencing factor data are given a preliminary integer coding. January to December are coded 1 to 12, Monday to Sunday are coded 1 to 7, the holiday attribute is divided into 11 classes coded 1 to 11, and the weather is divided into 13 classes coded 1 to 13. The specific classification and coding of the holiday attribute and the weather attribute are shown in Table 1 and Table 2, respectively.
TABLE 1 Preliminary classification coding of the holiday attribute
(Table 1 is provided as an image in the original and is not reproduced here.)
Here, the short and long holidays refer to national statutory holidays, including the Spring Festival, Qingming Festival, Labor Day, the Dragon Boat Festival, the Mid-Autumn Festival and National Day. "Common date" refers to any ordinary date not covered by the special holiday attributes described above.
TABLE 2 Preliminary classification coding of the weather factor
(Table 2 is provided as an image in the original and is not reproduced here.)
The invention simplifies the classification of weather characteristics. For a record of the form "A turning to B", the former category is used; for example, "sunny turning cloudy" is recorded as "sunny". For a record of the form "A to B", the latter category is used; for example, "light to moderate rain" is recorded as "moderate rain". The specific weather categories can be adapted to the departure city of the passenger flow.
After the four influencing factors (month, day of week, holiday attribute and weather characteristic) have been given this preliminary classification coding, the data are further encoded with One-hot encoding. This is because the coded values of these four factors only represent attribute classes and are not continuous variables, so they need to be expressed in a discretized way. One-hot encoding uses an N-bit state register to encode N states; each state has its own independent register bit, and only one bit is active at any time. The categorical values are first mapped to integer values, and each integer value is then represented as a binary vector in which the position indexed by the integer is 1 and all other positions are 0. For example, the day-of-week attribute has 7 states and therefore requires a 7-bit code, as shown in Table 3.
TABLE 3 Day-of-week attribute expressed in One-hot code
Monday    1 0 0 0 0 0 0
Tuesday   0 1 0 0 0 0 0
Wednesday 0 0 1 0 0 0 0
Thursday  0 0 0 1 0 0 0
Friday    0 0 0 0 1 0 0
Saturday  0 0 0 0 0 1 0
Sunday    0 0 0 0 0 0 1
Similarly, the month, holiday and weather attributes are One-hot encoded. After the four influencing factors are encoded in this way, the new influence factor sequence expands to 43 columns. On the basis of the encoded influence factor data and the historical passenger flow data, the data are converted into a supervised-learning data set according to the input step size and the output step size of the prediction model. The input step size is the number of days of historical data that must be fed into the prediction model in practical use, and the output step size is the length of the period to be predicted. In the invention, the input step size is 14 and the output step size is 7, i.e., 14 days of historical data are input to predict the passenger flow of the next 7 days.
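To make the preprocessing concrete, the following sketch one-hot encodes the four integer-coded factors (12 + 7 + 11 + 13 = 43 columns) and builds a supervised-learning data set with an input step of 14 days and an output step of 7 days. It is only an illustrative reading of the description: the exact arrangement of the input columns and the helper names used here are assumptions, not part of the original disclosure.

# Minimal sketch (assumption): One-hot encoding of the four integer-coded factors
# and conversion into a supervised-learning data set (input step 14, output step 7).
import numpy as np

FACTOR_SIZES = {"month": 12, "weekday": 7, "holiday": 11, "weather": 13}  # 43 columns total

def one_hot(value: int, n_states: int) -> np.ndarray:
    """Encode an integer code 1..n_states as an n_states-bit vector with a single 1."""
    vec = np.zeros(n_states)
    vec[value - 1] = 1.0
    return vec

def encode_day(month, weekday, holiday, weather):
    """Concatenate the One-hot codes of the four factors into a 43-column vector."""
    return np.concatenate([
        one_hot(month, FACTOR_SIZES["month"]),
        one_hot(weekday, FACTOR_SIZES["weekday"]),
        one_hot(holiday, FACTOR_SIZES["holiday"]),
        one_hot(weather, FACTOR_SIZES["weather"]),
    ])

def to_supervised(flow, factors, n_in=14, n_out=7):
    """Build (X, y) samples: 14 days of history (flow + factors) plus the 7-day
    prediction-period factors as input, and the next 7 days of flow as output.
    `flow` is a 1-D array; `factors` is a (n_days, 43) array of encoded factors."""
    X, y = [], []
    for t in range(len(flow) - n_in - n_out + 1):
        history = np.concatenate(
            [np.concatenate(([flow[t + k]], factors[t + k])) for k in range(n_in)]
        )
        future_factors = np.concatenate([factors[t + n_in + k] for k in range(n_out)])
        X.append(np.concatenate([history, future_factors]))
        y.append(flow[t + n_in : t + n_in + n_out])
    return np.array(X), np.array(y)

Built this way, each sample has 14 × 44 + 7 × 43 = 917 input columns and 7 output columns, which is consistent with the dimensions reported in the embodiment below; the integer codes are assumed to start at 1, as in the preliminary coding.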
2. Constructing the long short-term memory neural network model
The long short-term memory (LSTM) neural network model is a variant of the recurrent neural network (RNN). It can learn long-term dependencies in the input data and alleviates the gradient vanishing and gradient explosion problems in model training, so it has significant advantages when processing nonlinear time series data. Although railway intercity passenger flow fluctuates significantly in the short term, it is still grounded in the long-term trend of passenger flow change and in recent passenger flow levels, and shows very pronounced temporal correlation. The invention therefore selects the LSTM neural network model, exploits its ability to capture the dynamic evolution of time series, extracts and learns the short-term evolution characteristics of railway intercity passenger flow in detail, and predicts the passenger flow accurately.
The long short-term memory neural network model adds a structure called a memory cell to the hidden-layer neurons of the traditional recurrent neural network to store past information, and adds a three-gate structure (input gate, forget gate and output gate) to control the use of historical information.
The three gates at time step t are denoted i_t, f_t and o_t, and the corresponding candidate long-term memory, updated long-term memory and working memory are denoted c̃_t, c_t and h_t. At time step t:
Input gate: i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
Forget gate: f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
Output gate: o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
Candidate long-term memory: c̃_t = tanh(W_c · [h_{t-1}, x_t] + b_c)
Updated long-term memory: c_t = f_t · c_{t-1} + i_t · c̃_t
Working memory: h_t = o_t · tanh(c_t)
where W_i, W_f, W_o, W_c are weight matrices, b_i, b_f, b_o, b_c are the bias (threshold) terms of the respective functions, h_{t-1} is the working memory at time step t-1, x_t is the input at time step t, σ is the sigmoid activation function and tanh is the hyperbolic tangent activation function; "·" in W·[h_{t-1}, x_t] denotes a matrix-vector product, while "·" between two vectors denotes an element-wise product. The forget gate controls how much information in the memory cell is discarded, the input gate controls how much new information is added to the memory cell, and the output gate controls how much information the memory cell outputs.
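The gate equations above can be instantiated directly; the following NumPy sketch performs a single forward step of such an LSTM cell. It is illustrative only: the weight shapes, random initialization and class layout are assumptions, and a practical implementation would rely on a deep learning framework.

# Minimal sketch (assumption): one forward step of an LSTM cell implementing the
# gate equations above; weights are randomly initialized for illustration only.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class LSTMCell:
    def __init__(self, n_input, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        shape = (n_hidden, n_hidden + n_input)          # acts on [h_{t-1}, x_t]
        self.W_i, self.W_f, self.W_o, self.W_c = (rng.normal(0.0, 0.1, shape) for _ in range(4))
        self.b_i, self.b_f, self.b_o, self.b_c = (np.zeros(n_hidden) for _ in range(4))

    def step(self, x_t, h_prev, c_prev):
        z = np.concatenate([h_prev, x_t])               # [h_{t-1}, x_t]
        i_t = sigmoid(self.W_i @ z + self.b_i)          # input gate
        f_t = sigmoid(self.W_f @ z + self.b_f)          # forget gate
        o_t = sigmoid(self.W_o @ z + self.b_o)          # output gate
        c_tilde = np.tanh(self.W_c @ z + self.b_c)      # candidate long-term memory
        c_t = f_t * c_prev + i_t * c_tilde              # updated long-term memory
        h_t = o_t * np.tanh(c_t)                        # working memory (cell output)
        return h_t, c_t

# Usage (hypothetical sizes): with 44 features per day (passenger flow plus 43 encoded factors),
# cell = LSTMCell(n_input=44, n_hidden=32); h, c = cell.step(x_day, np.zeros(32), np.zeros(32))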
Although the memory cell makes the long short-term memory network well suited to processing and predicting time series data, its three hyper-parameters (the learning rate, the number of hidden layers and the number of iterations) remain difficult to determine, and their settings strongly affect the fitting ability, the training process and the performance of the model. In practice, hyper-parameters are usually set empirically, with considerable randomness and subjectivity. The invention therefore proposes an improved particle swarm optimization algorithm to optimize these three hyper-parameters of the long short-term memory network model and thereby achieve a better prediction effect.
3. Construction of an improved particle swarm algorithm
Particle swarm optimization (PSO) is widely used to solve optimization problems, including the training of neural network parameters, because of its simple operation and fast convergence. When solving an optimization problem, the particle swarm algorithm updates each particle's velocity and position by tracking the individual best particle and the population best particle. The process can be described as follows: in a D-dimensional search space (i.e., with D parameters to be optimized), m particles form a population; in the t-th iteration, the velocity and position of a particle are v_t and x_t, respectively, and the particle updates its velocity v_{t+1} and position x_{t+1} for iteration t+1 by tracking its own best fitness position pbest and the population's best fitness position gbest.
Because the basic particle swarm algorithm has limited global search ability and convergence speed, the invention improves the classical algorithm in two respects. First, the fixed inertia weight w in the particle velocity update expression is made to vary dynamically with the number of iterations: the inertia weight decreases nonlinearly as the iterations increase, and the rate of decrease grows with the iteration count, which preserves the local search ability of the algorithm. Second, the fixed learning factors in the particle velocity update expression are made to vary dynamically with the number of iterations: the learning factor c_1 associated with the individual (local) best solution increases from small to large as the iterations increase, to accelerate the particles' search in the early stage, while the learning factor c_2 associated with the global best solution decreases from large to small, to improve the particles' search accuracy in the later stage. The technical process of the improved particle swarm algorithm is as follows (an illustrative sketch is given after the step list below):
firstly, initializing the parameters: determining the population size, the number of iterations, the learning factors and the bounded intervals of the positions;
secondly, initializing the positions and velocities of the particles, randomly generating the three hyper-parameters of the LSTM network;
thirdly, determining the evaluation function of the particles: the model fitness is the average of the prediction accuracy on the training samples and the test samples of the LSTM model;
fourthly, calculating the fitness of the position of each particle, determining the individual extremum and the population extremum from the initial fitness values, and taking the best position of each particle as its historical best position;
fifthly, in each iteration, updating the velocity and position of the particles as well as the inertia weight and learning factors according to the following formulas, and updating the individual extremum and the population extremum according to the fitness of the new population particles:
Velocity update: v_{t+1} = w · v_t + c_1 · r_1 · (pbest - x_t) + c_2 · r_2 · (gbest - x_t)
Position update: x_{t+1} = x_t + v_{t+1}
Inertia weight update: (the formula is given only as an image in the source; consistent with the description above, the inertia weight w decreases nonlinearly from w_max to w_min as the iteration count approaches its maximum, with the rate of decrease growing over the iterations)
Learning factor update: (the formula is given only as an image in the source; consistent with the description above, c_1 increases from small to large and c_2 decreases from large to small as the iteration count grows)
where t and t_max are the current and maximum iteration numbers, v_t and x_t are the velocity and position of the particle at iteration t, w is the inertia weight, c_1 and c_2 are the learning factors, r_1 and r_2 are random coefficients, pbest and gbest are the individual extremum and the population extremum, and w_max, w_min are the upper and lower bounds of the inertia weight;
sixthly, outputting the optimal result of the model once the maximum number of iterations of the particle swarm algorithm has been reached.
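A compact sketch of the six-step improved particle swarm procedure described above is given below. The source gives the nonlinear inertia weight and learning factor schedules only as images, so the quadratic and linear forms used here, the parameter values and the fitness interface are assumptions that merely follow the stated qualitative behaviour (w decreasing with a growing rate, c_1 increasing, c_2 decreasing).

# Minimal sketch (assumption): improved PSO over three LSTM hyper-parameters
# (learning rate, number of hidden layers, number of iterations); maximizes fitness.
import numpy as np

def improved_pso(fitness, bounds, n_particles=10, t_max=30,
                 w_max=0.9, w_min=0.4, c_min=0.5, c_max=2.5, seed=0):
    """`bounds` is a NumPy array of shape (3, 2) giving [low, high] per hyper-parameter."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]
    x = rng.uniform(lo, hi, size=(n_particles, len(lo)))     # particle positions
    v = np.zeros_like(x)                                     # particle velocities
    pbest = x.copy()
    pbest_fit = np.array([fitness(p) for p in x])
    gbest = pbest[np.argmax(pbest_fit)].copy()
    gbest_fit = pbest_fit.max()

    for t in range(1, t_max + 1):
        # Assumed schedules consistent with the qualitative description above:
        w = w_max - (w_max - w_min) * (t / t_max) ** 2       # nonlinear decrease, faster later
        c1 = c_min + (c_max - c_min) * (t / t_max)           # individual term: small -> large
        c2 = c_max - (c_max - c_min) * (t / t_max)           # global term: large -> small
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)                           # keep positions within bounds
        fit = np.array([fitness(p) for p in x])
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = x[better], fit[better]
        if fit.max() > gbest_fit:
            gbest, gbest_fit = x[np.argmax(fit)].copy(), fit.max()
    return gbest, gbest_fit

Here `fitness` would train the LSTM with a candidate (learning rate, number of hidden layers, number of iterations) and return the average of training-sample and test-sample accuracy, as in the third step; integer-valued hyper-parameters such as the number of hidden layers would be rounded inside that function.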
On the basis of the optimal model, the historical data of the input step size and the prediction-period influence factor data of the output step size are preprocessed and input into the trained model, yielding the short-term railway intercity passenger flow prediction for the prediction period.
In summary, the overall implementation flow of the model based on the long short-term memory neural network and the improved particle swarm optimization is shown in FIG. 2.
4. Example analysis
In one embodiment of the invention, railway passenger flow data from Shanghai to Nanjing covering the 1820 days from 1 January 2014 to 24 December 2018, together with the weather information for those five years, were collected for case verification. The first 70% of the data were used as training samples and the last 30% as test samples. The raw data were preprocessed, the date attributes and weather factors were One-hot encoded, and the encoded data were converted into a supervised data set. The prediction input step size is 14 and the output step size is 7, i.e., the Shanghai-Nanjing railway passenger flow of the next 7 days is predicted from 14 days of historical data. The final preprocessed supervised data set therefore has 917 input columns and 7 output columns.
These data are input into the PSO-LSTM model. The population size and number of iterations of the PSO algorithm, and the upper and lower bounds of the number of training iterations, the number of hidden layers and the learning rate of the LSTM, are set on the basis of trials and experience. The resulting optimal model fitness is 91.58%. The variation of the optimal number of training iterations, number of hidden layers, learning rate and model fitness of the LSTM model is shown in FIGS. 3-6, and the passenger flow prediction result is shown in FIG. 7.
To verify the effectiveness of the model, five comparison models were established: PSO-LSTM-1, PSO-LSTM-2, PSO-LSTM-3, PSO-LSTM-4 and PSO-BP. The PSO-LSTM-1, PSO-LSTM-2, PSO-LSTM-3 and PSO-LSTM-4 models remove, respectively, the meteorological features, the holiday attribute, the month attribute and the week attribute from the original PSO-LSTM model. The BP neural network is a basic neural network model widely used for short-term railway intercity passenger flow prediction; replacing the LSTM in the PSO-LSTM model with a BP neural network yields the PSO-BP model, which is compared with the original model. The average prediction errors of the different models are compared in Table 4; the PSO-LSTM model established by the invention achieves the best prediction performance.
TABLE 4 Comparison of average prediction errors
(Table 4 is provided as an image in the original and its values are not reproduced here.)
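The source does not state which error measure Table 4 reports. As a hedged illustration, the sketch below computes a mean absolute percentage error (an assumed choice of metric) so that such a model comparison could be reproduced once each model's predictions are available.

# Minimal sketch (assumption): comparing models by mean absolute percentage error.
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

# errors = {name: mape(actual_flow, preds) for name, preds in predictions.items()}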

Claims (2)

1. A short-term prediction method for railway intercity passenger flow considering date attributes such as holidays and weather information, characterized in that
the method applies an improved particle swarm optimization algorithm and a long short-term memory (LSTM) neural network model in combination, and comprises the following technical steps:
firstly, preprocessing the railway intercity passenger flow data and the historical data of the influencing factors, and converting them into a supervised-learning data set;
secondly, training the LSTM neural network model with the processed data, and optimizing its hyper-parameters with the improved particle swarm algorithm;
finally, inputting the historical data and the prediction-period influence factor data into the trained model to predict the railway intercity passenger flow.
2. The prediction method according to claim 1, comprising the following steps:
step 1, preprocessing the railway intercity passenger flow data and the historical data of the influencing factors, and converting the preprocessed data into a supervised-learning data set, comprising:
extracting date features of the historical data according to the month, day-of-week and holiday attributes, and collecting weather information of the cities; the holiday date attributes and the weather information together form the influence factor data for passenger flow prediction;
on the basis of the collected passenger flow data and influence factor data, further encoding the raw data with One-hot encoding: an N-bit state register is used to encode N states, each state has its own independent register bit and only one bit is active at any time; the categorical values are first mapped to integer values, and each integer value is then represented as a binary vector in which the position indexed by the integer is 1 and all other positions are 0;
after the data have been encoded, converting them into a supervised-learning data set according to the input step size and output step size of the prediction model;
step 2, establishing a long short-term memory (LSTM) neural network model and training it with the processed data, comprising:
constructing a long short-term memory neural network containing an input gate, a forget gate and an output gate; the three gates at time step t are denoted i_t, f_t and o_t, and the corresponding candidate long-term memory, updated long-term memory and working memory are denoted c̃_t, c_t and h_t:
Input gate: i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
Forget gate: f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
Output gate: o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
Candidate long-term memory: c̃_t = tanh(W_c · [h_{t-1}, x_t] + b_c)
Updated long-term memory: c_t = f_t · c_{t-1} + i_t · c̃_t
Working memory: h_t = o_t · tanh(c_t)
where W_i, W_f, W_o, W_c are weight matrices, b_i, b_f, b_o, b_c are the bias (threshold) terms of the respective functions, h_{t-1} is the working memory at time step t-1, x_t is the input at time step t, σ is the sigmoid activation function, tanh is the hyperbolic tangent activation function, and "·" denotes a matrix-vector product in the gate expressions and an element-wise product between two vectors;
step 3, improving the traditional particle swarm optimization algorithm and applying it to optimize the learning rate, the number of hidden layers and the number of iterations of the LSTM neural network model, comprising:
firstly, initializing the parameters: determining the population size, the number of iterations, the learning factors and the bounded intervals of the positions;
secondly, initializing the positions and velocities of the particles, randomly generating the three hyper-parameters of the LSTM network;
thirdly, determining the evaluation function of the particles: the model fitness is the average of the prediction accuracy on the training samples and the test samples of the LSTM model;
fourthly, calculating the fitness of the position of each particle, determining the individual extremum and the population extremum from the initial fitness values, and taking the best position of each particle as its historical best position;
fifthly, in each iteration, updating the velocity and position of the particles as well as the inertia weight and learning factors according to the following formulas, and updating the individual extremum and the population extremum according to the fitness of the new population particles:
Velocity update: v_{t+1} = w · v_t + c_1 · r_1 · (pbest - x_t) + c_2 · r_2 · (gbest - x_t)
Position update: x_{t+1} = x_t + v_{t+1}
Inertia weight update: (the formula is given only as an image in the source; the inertia weight w decreases nonlinearly from w_max to w_min as the iteration count approaches its maximum, with the rate of decrease growing over the iterations)
Learning factor update: (the formula is given only as an image in the source; c_1 increases and c_2 decreases as the iteration count grows)
where t and t_max are the current and maximum iteration numbers, v_t and x_t are the velocity and position of the particle at iteration t, w is the inertia weight, c_1 and c_2 are the learning factors, r_1 and r_2 are random coefficients, pbest and gbest are the individual extremum and the population extremum, and w_max, w_min are the upper and lower bounds of the inertia weight;
sixthly, outputting the optimal result of the model once the maximum number of iterations of the particle swarm algorithm has been reached;
step 4, inputting the historical data and the prediction-period influence factor data into the trained model to predict the railway intercity passenger flow, comprising:
on the basis of the optimal model, preprocessing the historical data of the input step size and the prediction-period influence factor data of the output step size and inputting them into the trained model to obtain the short-term intercity passenger flow prediction for the prediction period.
CN202010718851.5A 2020-04-28 2020-07-23 Short-term prediction method for intercity passenger flow of railway by considering date attribute and weather factor Pending CN111915073A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2020103524752 2020-04-28
CN202010352475 2020-04-28

Publications (1)

Publication Number Publication Date
CN111915073A true CN111915073A (en) 2020-11-10

Family

ID=73281341

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010718851.5A Pending CN111915073A (en) 2020-04-28 2020-07-23 Short-term prediction method for intercity passenger flow of railway by considering date attribute and weather factor

Country Status (1)

Country Link
CN (1) CN111915073A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180137410A1 (en) * 2016-11-16 2018-05-17 Kabushiki Kaisha Toshiba Pattern recognition apparatus, pattern recognition method, and computer program product
US20200097815A1 (en) * 2018-09-21 2020-03-26 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for predicting passenger flow
CN109886444A (en) * 2018-12-03 2019-06-14 深圳市北斗智能科技有限公司 A kind of traffic passenger flow forecasting, device, equipment and storage medium in short-term
CN110414715A (en) * 2019-06-28 2019-11-05 武汉大学 A kind of volume of the flow of passengers method for early warning based on corporations' detection
CN110751318A (en) * 2019-09-26 2020-02-04 上海电力大学 IPSO-LSTM-based ultra-short-term power load prediction method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI Wan et al., "Railway passenger volume forecasting by optimizing an LSTM neural network with an improved particle swarm algorithm", Journal of Railway Science and Engineering (铁道科学与工程学报) *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113762578A (en) * 2020-12-28 2021-12-07 京东城市(北京)数字科技有限公司 Training method and device of flow prediction model and electronic equipment
CN112668801A (en) * 2021-01-04 2021-04-16 北京嘀嘀无限科技发展有限公司 Data processing method and device, electronic equipment and readable storage medium
CN113077090B (en) * 2021-04-09 2023-05-23 上海大学 Passenger flow prediction method, system and computer readable storage medium
CN113077090A (en) * 2021-04-09 2021-07-06 上海大学 Passenger flow prediction method, system and computer readable storage medium
CN113469407A (en) * 2021-05-21 2021-10-01 北京市燃气集团有限责任公司 Gas accident prediction method and device
CN113256000A (en) * 2021-05-26 2021-08-13 四川大学 Scenic spot short-term passenger flow prediction method with attention mechanism sequence-to-sequence
CN113642759A (en) * 2021-06-16 2021-11-12 五邑大学 Method for establishing inter-city railway passenger time-varying demand prediction model
CN114066032A (en) * 2021-11-08 2022-02-18 郑州天迈科技股份有限公司 Automatic compiling method for bus timetable
CN114707726A (en) * 2022-04-02 2022-07-05 武汉氢客科技有限公司 Passenger flow volume prediction method, system and storage medium based on Elman neural network
CN116386215A (en) * 2023-03-16 2023-07-04 淮阴工学院 Intelligent charging method for mobile electric box based on people flow
CN116386215B (en) * 2023-03-16 2024-04-19 淮阴工学院 Intelligent charging method for mobile electric box based on people flow
CN116227748A (en) * 2023-05-08 2023-06-06 石家庄铁道大学 Training method and prediction method of ecological environment PM2.5 concentration prediction model
CN117198071A (en) * 2023-11-03 2023-12-08 合肥工业大学 Traffic signal phase self-adaptive adjustment method and system based on PSO-LSTM neural network
CN117198071B (en) * 2023-11-03 2024-01-09 合肥工业大学 Traffic signal phase self-adaptive adjustment method, system and equipment

Similar Documents

Publication Publication Date Title
CN111915073A (en) Short-term prediction method for intercity passenger flow of railway by considering date attribute and weather factor
Lv et al. Stacked autoencoder with echo-state regression for tourism demand forecasting using search query data
Wu et al. A combined deep learning method with attention‐based LSTM model for short‐term traffic speed forecasting
Wang Predicting tourism demand using fuzzy time series and hybrid grey theory
CN111260136A (en) Building short-term load prediction method based on ARIMA-LSTM combined model
CN111242292B (en) OD data prediction method and system based on deep space-time network
CN111860989B (en) LSTM neural network short-time traffic flow prediction method based on ant colony optimization
CN110222888A (en) Daily average power load prediction method based on BP neural network
Zhang et al. A Traffic Prediction Method of Bicycle-sharing based on Long and Short term Memory Network.
CN111242395B (en) Method and device for constructing prediction model for OD (origin-destination) data
Ye et al. An adaptive Grey-Markov model based on parameters Self-optimization with application to passenger flow volume prediction
Lu et al. Short-term demand forecasting for online car-hailing using ConvLSTM networks
Huy et al. Short-term electricity load forecasting based on temporal fusion transformer model
CN112765894B (en) K-LSTM-based aluminum electrolysis cell state prediction method
Huang et al. Predicting urban rail traffic passenger flow based on LSTM
Brahimi et al. Modelling on Car‐Sharing Serial Prediction Based on Machine Learning and Deep Learning
CN117829375B (en) Method, device, equipment and medium for predicting multi-region demand of inter-city shuttle passenger transport
Thakur et al. Time series forecasting for uni-variant data using hybrid GA-OLSTM model and performance evaluations
CN115018553A (en) Regional logistics single quantity prediction system and method based on deep learning
CN115080795A (en) Multi-charging-station cooperative load prediction method and device
CN111915076A (en) Method for realizing scenic spot sightseeing personnel prediction by utilizing artificial intelligent neural network
CN112927507B (en) Traffic flow prediction method based on LSTM-Attention
Rasaizadi et al. Short‐Term Prediction of Traffic State for a Rural Road Applying Ensemble Learning Process
Li et al. Hydropower generation forecasting via deep neural network
Xian et al. Passenger flow prediction and management method of urban public transport based on SDAE model and improved Bi-LSTM neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20201110)