CN114912666A - Short-time passenger flow volume prediction method based on CEEMDAN algorithm and attention mechanism - Google Patents


Publication number
CN114912666A
Authority
CN
China
Prior art keywords
data
layer
passenger flow
time
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210434929.XA
Other languages
Chinese (zh)
Inventor
王嘉旋
王睿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University
Priority to CN202210434929.XA
Publication of CN114912666A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06Q50/40
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T90/00 Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Economics (AREA)
  • Mathematical Optimization (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Mathematical Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the field of short-time passenger flow prediction for rail transit, and provides a CNN-LSTM fusion neural network method for short-time passenger flow prediction based on complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and an attention mechanism. The method comprises the following steps: S1, data preprocessing; S2, input layer processing; S3, hidden layer processing; S4, output layer processing; and S5, model training. The method has strong peak prediction capability, accounts for the noise component in its prediction results, and is robust to irregularly shaped training data.

Description

Short-time passenger flow volume prediction method based on CEEMDAN algorithm and attention mechanism
Technical Field
The invention belongs to the field of short-time passenger flow prediction for rail transit, and particularly relates to a CNN-LSTM fusion neural network method for short-time passenger flow prediction based on complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and an attention mechanism.
Background
Short-time passenger flow prediction for rail transit works on a span of several minutes: the number of passengers in a future time period is predicted from the historical passenger flow data of a subway station. Accurate short-time prediction provides early warning information for the subway station, so that staff can take specific measures to relieve the load on the station. In image recognition, convolutional neural networks (CNNs) are widely used because their inputs have a fixed scale and because they eliminate a large number of similar, non-essential pixel-to-neuron connections. In machine translation and speech recognition, the recurrent neural network (RNN) and its variant, the long short-term memory network (LSTM), use an internal state mechanism that lets the model learn from the order of earlier and later elements in a sequence, so that words and sentences can be predicted well. Deep learning is broadly applicable: a model can be trained as long as enough learning data is provided. For short-time passenger flow prediction, a suitable model can be built and trained on the historical passenger flow information of a subway line in order to predict future changes in passenger flow. However, a single LSTM network makes poor use of the information in the original data and can extract only temporal features. Passenger flow data is two-dimensional in time and space, and existing methods extract spatial features by fusing in a CNN, which brings a certain performance improvement.
However, the existing methods have certain disadvantages: (1) the peak prediction capability is insufficient; (2) the noise part cannot be predicted properly, so the prediction result is overly smooth and deviates from the real data; and (3) robustness to a small amount of irregularly shaped original data is poor.
Complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is a signal decomposition algorithm proposed by Torres et al. on the basis of the empirical mode decomposition (EMD) and ensemble empirical mode decomposition (EEMD) algorithms. The EMD algorithm decomposes a time series signal step by step into several intrinsic mode functions (imf) and a residual term res. However, because of mode aliasing (mode mixing), the time-frequency distribution can be wrong, so that the imf components are only formally correct and lose their real meaning. Wu and Huang proposed the EEMD algorithm, which addresses the mode aliasing caused by factors such as intermittent high-frequency components by introducing white noise. However, the amplitude of the introduced white noise cannot be specifically quantified, and mode aliasing is not overcome when the initial value is set incorrectly. Moreover, when the original signal is recovered, the introduced white noise cannot be effectively eliminated, which causes a large error. Unlike the EEMD algorithm, which adds the whole white noise directly, the CEEMDAN algorithm adds white noise components that have themselves been decomposed by EMD, which effectively avoids an excess of white noise that cannot be removed when the decomposed signals are finally summed. In addition, the CEEMDAN algorithm takes the weighted average immediately after each imf component is decomposed, which prevents inaccuracies in one imf component from introducing large errors into the subsequent decomposition. In the prior art, the CEEMDAN algorithm has not been used for short-time passenger flow prediction of rail transit.
Disclosure of Invention
To address the above problems, the invention adds the CEEMDAN algorithm to the input layer of the existing CNN-LSTM model to separate the main part data from the noise part data. The noise part and the main part are trained and predicted separately, and the two prediction results are finally combined to give a more accurate passenger flow prediction. To address the insufficient peak prediction capability, the invention adds an attention mechanism over the time dimension to the CEEMDAN-CNN-LSTM model structure and reduces information loss by removing the pooling layer of the CNN, finally yielding the CEEMDAN-ConvLSTM-Attention model and improving passenger flow prediction performance.
Technical scheme
A short-time passenger flow prediction method based on a CEEMDAN algorithm and an attention mechanism is characterized by comprising the following steps:
S1, data preprocessing
First, the raw data are preprocessed to obtain rail transit passenger flow data in two dimensions, time and space. The raw data are the transaction records that the rail transit system can collect. The preprocessing is as follows: the raw data are cleaned and counted according to fields such as transaction time, the station where the transaction occurs, and transaction type. That is, for each station, transaction records of other modes such as buses and ferries are screened out so that only the transaction records of subway stations are kept; the total number of people entering each subway station within each fixed time interval is then counted and taken as the passenger flow data of that station at the corresponding time point.
The time-space two-dimensional matrix obtained after the raw data preprocessing is as follows:
Figure BDA0003612577690000021
where S denotes the station index, ranging from 1 to m, and t denotes the time interval index, ranging from 1 to n.
The passenger flow data of a given subway station k from time t-n to the most recent time interval t-1 can be expressed as:
Figure BDA0003612577690000031
The passenger flow data of all subway stations in a given time interval i can be expressed as:
Figure BDA0003612577690000032
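As a non-limiting illustration, the aggregation in S1 can be sketched in Python as follows. The record layout (timestamp, station, mode, transaction type), the sample values and the function name `build_flow_matrix` are assumptions made for this sketch and are not taken from the original disclosure. Rows index time intervals and columns index stations.

```python
from collections import Counter
from datetime import datetime

import numpy as np

# Hypothetical record layout: (timestamp, station, mode, transaction_type).
records = [
    ("2015-04-01 07:03:10", "A", "subway", "entry"),
    ("2015-04-01 07:08:45", "A", "subway", "entry"),
    ("2015-04-01 07:12:00", "B", "subway", "entry"),
    ("2015-04-01 07:13:30", "A", "bus", "entry"),     # non-subway record: dropped
    ("2015-04-01 07:15:20", "B", "subway", "exit"),   # not an entry: dropped
    ("2015-04-01 07:21:05", "A", "subway", "entry"),
]

def build_flow_matrix(records, stations, start, n_intervals, minutes=10):
    """Count subway entries per (time interval, station)."""
    fmt = "%Y-%m-%d %H:%M:%S"
    t0 = datetime.strptime(start, fmt)
    col = {s: j for j, s in enumerate(stations)}
    counts = Counter()
    for ts, station, mode, kind in records:
        if mode != "subway" or kind != "entry" or station not in col:
            continue
        offset = (datetime.strptime(ts, fmt) - t0).total_seconds()
        row = int(offset // (minutes * 60))
        if 0 <= row < n_intervals:
            counts[(row, col[station])] += 1
    flow = np.zeros((n_intervals, len(stations)), dtype=int)
    for (row, j), c in counts.items():
        flow[row, j] = c
    return flow

flow = build_flow_matrix(records, ["A", "B"], "2015-04-01 07:00:00", 3)
```

Each column of `flow` is then the time series of one station, which is the form the decomposition in S2 operates on.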
S2, input layer processing
At the input layer, the CEEMDAN algorithm is used to decompose the time-space two-dimensional passenger flow data into a main part data matrix
Figure BDA0003612577690000033
and a noise part data matrix
Figure BDA0003612577690000034
and the data are divided into a training set and a test set.
S2.1 CEEMDAN Algorithm processing procedure
Each column of the original input matrix,
Figure BDA0003612577690000035
is regarded as a continuous-time signal X(t) and used as the signal to be decomposed. The procedure is as follows:
(1) Normally distributed Gaussian white noise is introduced into the signal to be decomposed, where X(t) is the original signal, n_i(t) is the i-th realization of Gaussian white noise, N is the number of times noise is added, and ξ_0 is the standard deviation of the noise:
$$X_i(t) = X(t) + \xi_0\, n_i(t), \quad i = 1, 2, 3, \ldots, N$$
(2) Each noisy signal is decomposed using the EMD algorithm, giving N first-order imf components:
$$imf_i^{1}(t) = \mathrm{EMD}(X_i(t))$$
(3) The average of all N imf components is taken as the first-order imf component of the CEEMDAN decomposition:
$$\overline{imf}_1(t) = \frac{1}{N}\sum_{i=1}^{N} imf_i^{1}(t)$$
(4) The first-order residual term res_1(t) is calculated by removing the first-order imf component from the original signal:
$$res_1(t) = X(t) - \overline{imf}_1(t)$$
(5) The first-order residual term res_1(t) is treated as a new signal and the above process is repeated, giving the second-order imf component and the second-order residual. After white noise is introduced, the new input signal is:
$$res_1(t) + \xi_0\, n_i(t), \quad i = 1, 2, 3, \ldots, N$$
The second-order imf component after EMD decomposition is:
$$\overline{imf}_2(t) = \frac{1}{N}\sum_{i=1}^{N} \mathrm{EMD}\big(res_1(t) + \xi_0\, n_i(t)\big)$$
The second-order residual obtained by removing the second-order imf component is:
$$res_2(t) = res_1(t) - \overline{imf}_2(t)$$
(6) For k = 1, 2, ..., K, the k-th order residual is calculated as follows, where \overline{imf}_k(t) is the k-th order imf component and res_0(t) = X(t):
$$res_k(t) = res_{k-1}(t) - \overline{imf}_k(t)$$
Each residual obtained is taken as a new signal and step (5) is repeated up to order K, until the residual can no longer be decomposed (it is a monotonic function or has no more than two extreme points). Let res(t) denote this final residual.
(7) The final decomposition result is:
$$X(t) = \sum_{k=1}^{K} \overline{imf}_k(t) + res(t)$$
where res(t) is the residual that remains after the K imf components of the CEEMDAN decomposition have been extracted.
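The decomposition loop described above (add noise, extract a first mode from every noisy copy, average, subtract, repeat on the residual) can be sketched in Python as follows. This is only a structural illustration under stated assumptions: `crude_first_mode` is a hypothetical stand-in that subtracts a moving average and is not a real EMD sifting procedure, and the names `ceemdan_like`, `n_trials`, `xi0` and `max_modes` are introduced here for illustration. A practical implementation would replace the stand-in with a proper EMD routine.

```python
import numpy as np

def count_extrema(x):
    """Number of interior local maxima and minima of a 1-D signal."""
    d = np.diff(x)
    d = d[d != 0]
    return int(np.sum(d[:-1] * d[1:] < 0))

def crude_first_mode(x, width=5):
    """Stand-in for EMD's first imf (NOT real EMD): signal minus moving average."""
    pad = width // 2
    padded = np.pad(x, pad, mode="edge")
    trend = np.convolve(padded, np.ones(width) / width, mode="valid")
    return x - trend

def ceemdan_like(x, first_mode=crude_first_mode, n_trials=20, xi0=0.1,
                 max_modes=6, seed=0):
    """Ensemble-average the first mode, subtract it, repeat on the residual."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_trials, x.size))  # n_i(t), reused at every order
    res, imfs = x.astype(float), []
    for _ in range(max_modes):
        if count_extrema(res) <= 2:      # residual can no longer be decomposed
            break
        modes = [first_mode(res + xi0 * n) for n in noise]  # noisy copies -> modes
        imf_k = np.mean(modes, axis=0)                       # ensemble average
        imfs.append(imf_k)
        res = res - imf_k                                    # next residual
    return np.array(imfs), res

t = np.linspace(0, 1, 400)
signal = np.sin(2 * np.pi * 3 * t) + 0.3 * np.sin(2 * np.pi * 40 * t) + 2 * t
imfs, res = ceemdan_like(signal)
reconstruction = imfs.sum(axis=0) + res  # sum of components plus final residual
```

Because each residual is defined as the previous residual minus the averaged component, the components and the final residual always sum back to the input signal, whichever mode extractor is plugged in.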
S2.2 The high-frequency, small-amplitude signals obtained by the decomposition are summed to give the noise part data, and the smoother low-frequency signals are summed to give the main part data. The time series decomposition results of the individual stations are then assembled into the main part data matrix
Figure BDA0003612577690000043
and the noise part data matrix
Figure BDA0003612577690000044
as follows:
Figure BDA0003612577690000045
Figure BDA0003612577690000046
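A minimal Python sketch of the grouping in S2.2 is given below. It assumes that the imf components are ordered from high to low frequency (as EMD-type decompositions produce them), that the final residual is counted with the main part, and that the number of components treated as noise is chosen by the practitioner; the function name `split_noise_and_main` and the parameter `n_high` are hypothetical.

```python
import numpy as np

def split_noise_and_main(imfs, residual, n_high):
    """Sum the first n_high (highest-frequency) imfs as noise; the rest is the main part."""
    noise = imfs[:n_high].sum(axis=0)
    main = imfs[n_high:].sum(axis=0) + residual  # assumption: residual joins the main part
    return main, noise

# Toy decomposition of one station: three "imfs" ordered from high to low frequency.
t = np.linspace(0, 1, 200)
imfs = np.stack([
    0.2 * np.sin(2 * np.pi * 50 * t),
    0.5 * np.sin(2 * np.pi * 10 * t),
    1.0 * np.sin(2 * np.pi * 2 * t),
])
residual = 3.0 * t
main, noise = split_noise_and_main(imfs, residual, n_high=1)

# Stacking the per-station results column by column gives the two matrices.
main_matrix = np.column_stack([main, main])     # two identical toy stations
noise_matrix = np.column_stack([noise, noise])
```

By construction the main part and the noise part of each station add up to the original signal of that station, which is what allows the two predictions to be summed again at test time.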
S2.3 The decomposed data are normalized so that they are confined to a fixed range, which avoids problems such as non-convergence caused by singular sample data during training.
Because passenger flow is distributed very unevenly over time (the value is very large at peak hours and is 0 at some night hours), normalization is indispensable. This scheme uses min-max normalization, with the following formula:
$$Y = \frac{X - X_{min}}{X_{max} - X_{min}}$$
where X is the data point currently being normalized, Y is the processed output, X_max is the maximum of all data points, and X_min is the minimum of all data points. After normalization, all data points lie in the interval [0, 1].
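The min-max normalization can be illustrated with the following short Python sketch. The function name `min_max_normalize` and the sample values are illustrative assumptions; the minimum and maximum are returned so that they can be reused later for inverse normalization.

```python
import numpy as np

def min_max_normalize(x):
    """Scale data to [0, 1]; also return the range needed to undo the scaling."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min), x_min, x_max

# Toy passenger counts: zero at night, very large at the peak.
flow = np.array([0, 12, 250, 1800, 640, 0])
y, x_min, x_max = min_max_normalize(flow)
```

Note that the sketch does not guard against a constant input (X_max equal to X_min), which would need separate handling.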
S2.4 After normalization, the passenger flow data, which are continuous in the time dimension, are converted into the supervised learning sequence shape accepted by the LSTM network.
The input data of the LSTM network must conform to the supervised learning sequence shape it requires, so the original data are reshaped according to the prediction step size chosen for training. The prediction step size is the maximum length of historical passenger flow information that each target value can carry during training. Taking the input sequence X_t = [x_1, x_2, …, x_n] as an example, with prediction step size k, the converted data format is:
Figure BDA0003612577690000051
The main part data matrix and the noise part data matrix are each converted into supervised learning sequences and provided to S3 and S5.
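One common way to build such supervised learning samples is a sliding window, sketched below in Python. The exact layout used in the original figure is not reproduced here; the sketch assumes that each target value is paired with the k values immediately preceding it, and the function name `to_supervised` is hypothetical.

```python
import numpy as np

def to_supervised(series, k):
    """Pair every target value with the k values that precede it."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[i:i + k] for i in range(len(series) - k)])
    y = series[k:]
    return X, y

X, y = to_supervised([1, 2, 3, 4, 5, 6], k=3)
# X rows: [1, 2, 3], [2, 3, 4], [3, 4, 5]; targets y: [4, 5, 6]
```

A sequence of length n therefore yields n - k samples, each carrying at most k steps of history.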
S3, hidden layer processing
In the hidden layer, the invention adds an attention mechanism to the CNN-LSTM model and removes the pooling layer to build a ConvLSTM-Attention model, which is trained separately on the main part data and the noise part data obtained from the input layer. The specific process is as follows:
First, the convolutional layer extracts the spatial features of the two-dimensional matrix. The resulting sequence of spatial features over the time dimension is then fed into the LSTM network for temporal feature extraction, and a Dropout layer hides part of the neurons to prevent overfitting (that is, the model converges on the training set but performs poorly on the test set). The output sequence of the LSTM network is fed into the attention layer, which calculates a weight for each element of the sequence and multiplies the element by its weight. Finally, the Flatten layer flattens the output matrix into a one-dimensional sequence that the output layer can receive.
The method uses a one-dimensional CNN for spatial feature extraction and takes the main part data matrix processed in S2
Figure BDA0003612577690000052
and the noise part data matrix
Figure BDA0003612577690000053
as the separate inputs of the two models. The convolutional layer is implemented as follows:
Figure BDA0003612577690000054
where
Figure BDA0003612577690000055
is the input of the convolutional layer, namely
Figure BDA0003612577690000056
or
Figure BDA0003612577690000057
,
Figure BDA0003612577690000058
is the convolutional layer output, W is the weight obtained from model training, b is the bias obtained from model training, and σ is the ReLU activation function.
The matrix after CNN feature extraction is $X_t = [x_{t-n}\; x_{t-(n-1)}\; \ldots\; x_{t-1}]^{T}$, which gives the distribution of the spatial feature values over time. This feature matrix is used as the input of the LSTM network for model training.
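A one-dimensional convolution with ReLU activation and no pooling can be sketched in plain NumPy as follows. The layout is an assumption of this sketch rather than a statement of the original figure: the input window has T time steps and m stations, the kernel slides along the time axis and mixes all stations (as a typical one-dimensional convolution layer treats channels), and the sizes, random weights and function name `conv1d_relu` are illustrative.

```python
import numpy as np

def conv1d_relu(x, W, b):
    """x: (T, m) window of T time steps for m stations.
    W: (ks, m, F) kernel, b: (F,) bias. Returns (T - ks + 1, F); no pooling follows."""
    ks = W.shape[0]
    out = np.stack([
        np.tensordot(x[t:t + ks], W, axes=([0, 1], [0, 1])) + b
        for t in range(x.shape[0] - ks + 1)
    ])
    return np.maximum(out, 0.0)  # ReLU

rng = np.random.default_rng(0)
x = rng.random((6, 4))                     # 6 time steps, 4 stations
W = rng.standard_normal((2, 4, 3)) * 0.5   # kernel size 2, 3 filters
b = np.zeros(3)
features = conv1d_relu(x, W, b)            # shape (5, 3)
```

Because no pooling layer follows, every time step of the feature sequence is passed on to the LSTM network, which is the information-preserving choice described above.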
The long short-term memory network (LSTM) is a recurrent neural network model constructed according to the LSTM concept proposed by Juergen Schmidhuber et al. The LSTM network includes an input layer, an LSTM layer, a fully connected layer, and an output layer. The LSTM layer includes a forget gate, an input gate, and an output gate, and the specific algorithm is as follows.
$$i_t = \sigma(w_i \cdot [h_{t-1}, x_t] + b_i)$$
$$f_t = \sigma(w_f \cdot [h_{t-1}, x_t] + b_f)$$
$$\tilde{C}_t = \tanh(w_C \cdot [h_{t-1}, x_t] + b_C)$$
$$o_t = \sigma(w_o \cdot [h_{t-1}, x_t] + b_o)$$
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$
$$h_t = o_t * \tanh(C_t)$$
where i_t is the input gate, f_t is the forget gate, \tilde{C}_t is the candidate cell state, o_t is the output gate, C_t and h_t are the long-term and short-term memory parameters respectively, σ denotes the sigmoid activation function, * denotes the element-wise product, and w and b are weights and biases respectively. h_t is the final output value after each supervised learning sequence is fed into the model; the weights and biases are parameters learned by the model.
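The gate computations of a standard LSTM cell can be traced with the NumPy sketch below. The dimensions, the random initial weights and the dictionary layout of the parameters are assumptions for illustration only; in practice the weights are learned during training.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; p holds the weights w_* and biases b_* of each gate."""
    z = np.concatenate([h_prev, x_t])             # [h_{t-1}, x_t]
    i_t = sigmoid(p["w_i"] @ z + p["b_i"])        # input gate
    f_t = sigmoid(p["w_f"] @ z + p["b_f"])        # forget gate
    c_hat = np.tanh(p["w_c"] @ z + p["b_c"])      # candidate cell state
    o_t = sigmoid(p["w_o"] @ z + p["b_o"])        # output gate
    c_t = f_t * c_prev + i_t * c_hat              # long-term memory
    h_t = o_t * np.tanh(c_t)                      # short-term memory / output
    return h_t, c_t

hidden, n_in = 4, 3
rng = np.random.default_rng(0)
params = {}
for g in "ifco":
    params[f"w_{g}"] = rng.standard_normal((hidden, hidden + n_in)) * 0.1
    params[f"b_{g}"] = np.zeros(hidden)

h, c = np.zeros(hidden), np.zeros(hidden)
outputs = []
for x_t in rng.random((5, n_in)):                 # a sequence of 5 time steps
    h, c = lstm_step(x_t, h, c, params)
    outputs.append(h)
outputs = np.array(outputs)                       # shape (5, 4)
```

Keeping the output of every time step, rather than only the last one, gives the two-dimensional matrix that the attention layer subsequently reweights.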
In neural networks, attention is quantified as specific weight values. The invention implements the attention mechanism by adding a dense (fully connected) layer with a softmax activation function after the LSTM network. This layer takes the LSTM network output as input and calculates the corresponding weight matrix through the softmax activation function, so that the weight parameters are learned automatically during training. The softmax function is an activation function that maps neuron outputs to the interval (0, 1), and the mapping result can be regarded as a probability. The probability formula is as follows.
$$\beta_k = \frac{e^{x_k}}{\sum_{i} e^{x_i}}$$
where x_k is the element whose weight is currently being calculated, and x_i ranges over the historical data used in calculating the current prediction. In the LSTM network, β_t is the weight corresponding to the data h_t output at time t; multiplying the weight by the data gives the enhanced output h_t', which is used as the input of the next calculation. Because the result after LSTM network training is a two-dimensional matrix formed by several input sequences, a Flatten layer is added after the LSTM network to flatten the two-dimensional matrix into a one-dimensional sequence that the output layer can receive.
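The attention layer can be sketched in NumPy as a dense layer followed by a softmax and an element-wise multiplication. Applying the softmax along the time axis is an assumption of this sketch (the text states only that the attention is directed at the time dimension), and the shapes, random weights and function names are illustrative.

```python
import numpy as np

def softmax(z, axis=0):
    z = z - z.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention(H, W, b):
    """H: (T, d) LSTM outputs. Dense layer, softmax over time, then reweighting."""
    scores = H @ W + b
    beta = softmax(scores, axis=0)            # weights sum to 1 over the T time steps
    return beta * H, beta

rng = np.random.default_rng(0)
H = rng.standard_normal((5, 4))               # 5 time steps, 4 hidden units
W = rng.standard_normal((4, 4)) * 0.1
b = np.zeros(4)
H_weighted, beta = temporal_attention(H, W, b)
flat = H_weighted.reshape(-1)                 # Flatten layer: shape (20,)
```

Time steps that receive a larger weight contribute more to the flattened vector passed to the output layer, which is how the mechanism can emphasize peak periods.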
S4, output layer processing
At the output layer, a fully connected layer receives the output of the hidden layer and outputs the prediction result, which is the prediction for each time point under the current model parameters.
S5, model training
In each iteration process, error calculation is carried out on the prediction result output by the output layer and the real sequence, and model parameters are updated through an optimization algorithm.
After each round of training, the loss function value is calculated first and the parameters are then updated by the optimization algorithm, so that the loss function value decreases round by round and the prediction error is reduced. The optimization algorithm selected in this scheme is the Adam optimizer, which converges effectively and overcomes problems such as gradient dip. The selected loss function is the mean square error (MSE), with the following formula:
$$MSE = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$
where N is the total number of input samples, y_i is the target value, and \hat{y}_i is the predicted value. The model parameters are updated by the Adam optimization algorithm to reduce the error function value until the model converges.
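The training loop can be illustrated with a small NumPy sketch that minimizes the MSE with Adam updates. For brevity the model here is a hypothetical linear predictor rather than the ConvLSTM-Attention network, and the hyperparameters are the commonly used Adam defaults apart from the learning rate; all names are introduced for illustration.

```python
import numpy as np

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def adam_step(theta, grad, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update with bias-corrected first and second moment estimates."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

rng = np.random.default_rng(0)
X = rng.random((64, 3))
y = X @ np.array([0.5, -1.0, 2.0])        # targets of a toy linear model
theta = np.zeros(3)
m, v = np.zeros(3), np.zeros(3)
losses = []
for t in range(1, 501):
    pred = X @ theta
    losses.append(mse(y, pred))           # loss of the current round
    grad = 2 * X.T @ (pred - y) / len(y)  # gradient of the MSE w.r.t. theta
    theta, m, v = adam_step(theta, grad, m, v, t)
```

The recorded `losses` decrease over the rounds, mirroring the round-by-round reduction of the loss function value described above.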
Model testing
The test set data are input into the trained models for prediction and the prediction results of the two models are combined: the noise prediction and the main part prediction at the same time point are added to obtain the final passenger flow prediction. The prediction results are inverse-normalized, the MAE and RMSE errors are calculated, and the prediction performance of the models is compared. The inverse normalization formula is as follows:
$$Y = \hat{Y}\,(X_{max} - X_{min}) + X_{min}$$
where Y is the result after inverse normalization, \hat{Y} is the current predicted value, and X_max and X_min are the maximum and minimum values of the input data during normalization, respectively.
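The test-time combination can be sketched in Python as follows. The sketch assumes that the main part and the noise part were each normalized with their own minimum and maximum, so each prediction is inverse-normalized with its own range before the two are added; all numerical values are made up for illustration.

```python
import numpy as np

def denormalize(y_hat, x_min, x_max):
    """Invert min-max normalization."""
    return y_hat * (x_max - x_min) + x_min

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Hypothetical normalized predictions of the two models at the same time points.
main_pred_norm = np.array([0.10, 0.50, 0.90])
noise_pred_norm = np.array([0.40, 0.60, 0.50])
main_range = (100.0, 1100.0)   # (min, max) recorded when normalizing the main part
noise_range = (-50.0, 50.0)    # (min, max) recorded when normalizing the noise part

main_pred = denormalize(main_pred_norm, *main_range)      # 200, 600, 1000
noise_pred = denormalize(noise_pred_norm, *noise_range)   # -10, 10, 0
flow_pred = main_pred + noise_pred                        # 190, 610, 1000

flow_true = np.array([200.0, 600.0, 1010.0])
errors = {"MAE": mae(flow_true, flow_pred), "RMSE": rmse(flow_true, flow_pred)}
```

MAE weights all errors equally, whereas RMSE penalizes large deviations more strongly, so reporting both gives a fuller picture of behaviour at peaks.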
The invention has the beneficial effects that:
(1) The CEEMDAN-ConvLSTM-Attention model designed by the invention greatly improves the accuracy of short-time passenger flow prediction. The ConvLSTM fusion model makes full use of both the temporal and the spatial features in the original subway passenger flow data, so its prediction accuracy is far higher than that of a single LSTM network.
(2) The method integrates an attention mechanism over the time dimension, which addresses the insufficient peak prediction capability of traditional models in short-time passenger flow prediction and makes the prediction results more useful in practice.
(3) The model provided by the invention integrates the CEEMDAN algorithm so that the noise data and the main data are predicted separately, which effectively addresses the earlier problems that prediction results were overly smooth and fitted the real data poorly where the data fluctuate.
Drawings
FIG. 1 is a diagram of the CEEMDAN-CNN-LSTM-Attention model structure
FIG. 2 shows the decomposition results of CEEMDAN algorithm
FIG. 3 shows the prediction results of CEEMDAN-ConvLSTM-Attention model
FIG. 4 shows the predicted results of the CNN-LSTM model
Detailed Description
The technical solutions provided in the present application will be further described with reference to the following specific embodiments and accompanying drawings. The advantages and features of the present application will become more apparent in conjunction with the following description.
As shown in fig. 1, a short-time passenger flow prediction method based on a CEEMDAN algorithm and an attention mechanism is characterized by comprising the following steps:
S1, data preprocessing
S2, input layer processing
At the input layer, the CEEMDAN algorithm is used to decompose the time-space two-dimensional passenger flow data into a main part data matrix
Figure BDA0003612577690000081
and a noise part data matrix
Figure BDA0003612577690000082
and the data are divided into a training set and a test set.
S3, hidden layer processing
The hidden layer adds an Attention mechanism on the basis of the CNN-LSTM model, removes a pooling layer, establishes a ConvLSTM-Attention model, and respectively performs model training on main data and noise part data obtained by the input layer.
S4, output layer processing
At the output layer, a fully connected layer receives the output of the hidden layer and outputs the prediction result, which is the prediction for each time point under the current model parameters.
S5, model training
In each iteration process, error calculation is carried out on the prediction result output by the output layer and the real sequence, and model parameters are updated through an optimization algorithm.
Example 1
In this embodiment, a data set of a backrush rail transit transaction record in 2015 is used as the original data. Through processing such as data cleaning, two-dimensional (space and time) data for several subway stations at 10-minute intervals are obtained, and the data are divided into a training set and a test set.
In the actual experiments, considering that the LSTM model predicts smoothly shaped signals better, the invention sums the high-frequency signals as the noise component and sums the remaining components as the trend term, so that the main shape of the passenger flow signal is not lost through excessive decomposition. The specific decomposition formula is as follows:
$$X = \sum imf_{High} + \sum imf_i$$
In step S2, the decomposition result of the CEEMDAN algorithm is shown in FIG. 2, where signal is the original passenger flow data of a single day, IMF1 is the high-frequency noise part data obtained by the decomposition, and IMF2 is the main part data obtained by the decomposition. In practice, the data are not decomposed day by day; instead the whole data set is decomposed at once, so that the continuity of the data is not damaged during decomposition. The decomposed data are then split along continuous time, which avoids unnecessary information loss and data alignment errors. The same splitting method is used for the working-day data set and the all-dates data set, which keeps the comparison fair.
In the CNN-LSTM module, a one-dimensional CNN is selected and its pooling layer is removed to extract the spatial features of the passenger flow data. Since the passenger flow data are the distribution over time of the passenger flows of several subway stations, a one-dimensional CNN can extract the spatial features among the stations without damaging the temporal features. The output of the CNN is used as the input of the LSTM network to extract the temporal features, so that both temporal and spatial features are extracted. A softmax layer is added after the LSTM network as the attention layer, and the passenger flow prediction is further optimized by automatically learning the weight of each time point in every training round.
Comparing FIG. 3 with FIG. 4 shows that the single CNN-LSTM network predicts poorly at the peaks, whereas the CEEMDAN-ConvLSTM-Attention model predicts the peaks better and its predictions fit the real data more closely. The following table compares the prediction errors of the CEEMDAN-ConvLSTM-Attention model with those of other models. The error metrics are the mean absolute error (MAE) and the root mean square error (RMSE), and the predictions are compared on both the working-day data set and the all-dates data set.
Figure BDA0003612577690000091
Figure BDA0003612577690000101
The above description is only illustrative of the preferred embodiments of the present application and is not intended to limit the scope of the present application in any way. Any changes or modifications made by those skilled in the art based on the above disclosure should be considered as equivalent effective embodiments, and all the changes or modifications should fall within the protection scope of the technical solution of the present application.

Claims (5)

1. A short-time passenger flow prediction method based on a CEEMDAN algorithm and an attention mechanism is characterized by comprising the following steps:
S1, data preprocessing
S2, input layer processing
At the input layer, the CEEMDAN algorithm is used to decompose the time-space two-dimensional passenger flow data into a main part data matrix
Figure FDA0003612577680000011
and a noise part data matrix
Figure FDA0003612577680000012
and a training set and a test set are divided;
S3, hidden layer processing
The hidden layer adds an attention mechanism on the basis of the CNN-LSTM model, removes the pooling layer, establishes a ConvLSTM-Attention model, and performs model training separately on the main part data and the noise part data obtained by the input layer;
S4, output layer processing
At the output layer, a fully connected layer receives the output of the hidden layer and outputs a prediction result, wherein the prediction result is the prediction for each time point under the current model parameters;
S5, model training
In each iteration process, error calculation is carried out on the prediction result output by the output layer and the real sequence, and model parameters are updated through an optimization algorithm.
2. The method for predicting short-term passenger flow based on the CEEMDAN algorithm and the attention mechanism as claimed in claim 1, wherein the step S1 is as follows:
the original data, namely the transaction records that can be collected by the rail transit system, are preprocessed to obtain rail transit passenger flow data in the two dimensions of time and space;
the preprocessing is as follows: the original data are cleaned and counted according to fields such as the transaction time, the station where the transaction occurred and the transaction type; that is, for each station, transaction records for buses, ferries and the like are screened out and removed so that only subway station transaction records are kept, the total number of people entering each subway station within a fixed time interval is counted, and that total for each time interval is taken as the passenger flow data of the subway station at the current time point;
the time-space two-dimensional matrix obtained after the raw data preprocessing is as follows:
Figure FDA0003612577680000013
where S denotes the station index, ranging from 1 to m, and t denotes the time interval index, ranging from 1 to n; the passenger flow data of a given subway station k from time t-n to the last time interval t-1 can be expressed as:
Figure FDA0003612577680000021
the passenger flow data of all subway stations at a certain time interval i can be represented as:
Figure FDA0003612577680000022
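The preprocessing in step S1 can be sketched as follows. This is a minimal illustration under stated assumptions: the record fields (`time`, `station`, `mode`, `type`), the function name and the sample records are hypothetical, since the claim does not specify the transaction record schema. Rows of the result are time intervals and columns are stations, so each column is one station's time series.

```python
from collections import Counter
from datetime import datetime

def build_flow_matrix(records, stations, start, interval_minutes, n_intervals):
    """Count subway entries per (time interval, station).

    Returns n_intervals rows, each holding one count per station.
    """
    counts = Counter()
    for rec in records:
        # Keep only subway entry transactions; drop bus, ferry, exits, etc.
        if rec["mode"] != "subway" or rec["type"] != "entry":
            continue
        offset = (rec["time"] - start).total_seconds()
        slot = int(offset // (interval_minutes * 60))
        if 0 <= slot < n_intervals and rec["station"] in stations:
            counts[(slot, rec["station"])] += 1
    return [[counts[(slot, s)] for s in stations] for slot in range(n_intervals)]

# Hypothetical transaction records (field names are assumptions).
records = [
    {"time": datetime(2022, 1, 3, 8, 3), "station": "A", "mode": "subway", "type": "entry"},
    {"time": datetime(2022, 1, 3, 8, 7), "station": "A", "mode": "subway", "type": "entry"},
    {"time": datetime(2022, 1, 3, 8, 12), "station": "B", "mode": "subway", "type": "entry"},
    {"time": datetime(2022, 1, 3, 8, 5), "station": "A", "mode": "bus", "type": "entry"},
    {"time": datetime(2022, 1, 3, 8, 14), "station": "B", "mode": "subway", "type": "exit"},
]
flow = build_flow_matrix(records, ["A", "B"], datetime(2022, 1, 3, 8, 0), 10, 2)
print(flow)
```

In this example the bus record and the exit record are discarded, so only the three subway entries contribute to the counts.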
3. The method for predicting the short-term passenger flow based on the CEEMDAN algorithm and the attention mechanism as claimed in claim 1, wherein the step S2 is as follows:
S2.1, CEEMDAN algorithm processing procedure
Each column of the original input matrix
Figure FDA0003612577680000023
is taken as a continuous-time signal X(t), which is the signal to be decomposed; the processing procedure is as follows:
(1) normally distributed Gaussian white noise is added to the signal to be decomposed, where $X(t)$ is the original signal, $n_i(t)$ is Gaussian white noise following a normal distribution, $N$ is the number of times noise is added, and $\xi_0$ is the standard deviation of the noise:
$X_i(t) = X(t) + \xi_0 n_i(t), \quad i = 1, 2, 3, \dots, N$
(2) each noise-added signal is decomposed with the EMD algorithm, yielding a set of first-order imf components:
$\mathrm{imf}_i^1(t) = \mathrm{EMD}(X_i(t))$
(3) the average of all these imf components is taken as the first-order imf component of the CEEMDAN decomposition:
Figure FDA0003612577680000024
(4) the first-order residual $\mathrm{res}_1(t)$ is calculated from the first-order imf component:
Figure FDA0003612577680000025
(5) the first-order residual $\mathrm{res}_1(t)$ is taken as a new signal and the above process is repeated to obtain the second-order imf component and the second-order residual; after white noise is introduced, the new input signal is:
$\mathrm{res}_1(t) + \xi_0 n_i(t), \quad i = 1, 2, 3, \dots, N$
the second-order imf component after EMD decomposition is:
Figure FDA0003612577680000026
the second-order residual obtained by removing the second-order imf component is:
Figure FDA0003612577680000031
(6) for $k = 1, 2, \dots, K$, the k-th residual is calculated as:
Figure FDA0003612577680000032
the resulting residual is taken as a new signal and step (5) is repeated, up to order K, until the residual can no longer be decomposed (it is a monotonic function or has no more than two extreme points); the residual that finally cannot be decomposed is denoted res(t);
(7) the final decomposition result is:
Figure FDA0003612577680000033
where res(t) is the residual that remains after the K imf components of the CEEMDAN decomposition have been extracted;
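The control flow of steps (1) to (7) can be sketched as below. This is only a skeleton under stated assumptions: the real EMD sifting step is replaced by a toy stand-in (`moving_average_detail`, the signal minus its moving average) so that the example stays self-contained, and all names and default parameters are hypothetical. What the sketch does show faithfully is the loop structure: add noise N times, extract one component from each noisy copy, average them, subtract the average from the residual, and stop when the residual has no more than two extreme points.

```python
import math
import random

def count_extrema(signal):
    """Number of interior local maxima and minima."""
    return sum(1 for a, b, c in zip(signal, signal[1:], signal[2:])
               if (b - a) * (c - b) < 0)

def moving_average_detail(signal, window=3):
    """Toy stand-in for 'first imf from EMD' (NOT real EMD sifting):
    the signal minus its centered moving average."""
    half, n = window // 2, len(signal)
    detail = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        detail.append(signal[i] - sum(signal[lo:hi]) / (hi - lo))
    return detail

def ceemdan_like(signal, first_imf=moving_average_detail, n_trials=20,
                 noise_std=0.1, max_order=8, seed=0):
    """Ensemble-averaged decomposition following steps (1) to (7)."""
    rng = random.Random(seed)
    residual = list(signal)
    imfs = []
    for _order in range(max_order):
        if count_extrema(residual) <= 2:   # residual can no longer be decomposed
            break
        acc = [0.0] * len(residual)
        for _trial in range(n_trials):     # N noise realizations
            noisy = [r + noise_std * rng.gauss(0.0, 1.0) for r in residual]
            for j, value in enumerate(first_imf(noisy)):
                acc[j] += value
        imf = [a / n_trials for a in acc]  # average over the ensemble
        residual = [r - m for r, m in zip(residual, imf)]
        imfs.append(imf)
    return imfs, residual

# Hypothetical test signal: a slow wave plus a fast wave.
signal = [math.sin(0.3 * t) + 0.5 * math.sin(2.0 * t) for t in range(60)]
imfs, residual = ceemdan_like(signal)
print(len(imfs))
```

By construction, the extracted components and the final residual sum back to the original signal, which matches the final decomposition result in step (7). A production implementation would substitute a genuine EMD sifting routine for the stand-in.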
S2.2, the high-frequency, small-amplitude signals obtained by the decomposition are added together to obtain the noise part data, and the smoother low-frequency signals are added together to obtain the main part data; the time-series decomposition results of the stations are then combined to obtain the main part data matrix
Figure FDA0003612577680000034
and the noise part data matrix
Figure FDA0003612577680000035
as follows:
Figure FDA0003612577680000036
Figure FDA0003612577680000037
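A minimal sketch of step S2.2 for a single station follows. Two points are assumptions rather than statements of the claim: the cut between "noise" and "main" components is given by a hypothetical parameter `n_noise` (the claim does not say how the split is chosen), and the final residual is added to the main part. The sketch relies on the fact that EMD-style decompositions emit components from highest to lowest frequency.

```python
def split_main_and_noise(imfs, residual, n_noise):
    """Sum the first n_noise (highest-frequency) components into the noise
    part and the remaining components plus the residual into the main part."""
    length = len(residual)
    noise = [sum(imf[j] for imf in imfs[:n_noise]) for j in range(length)]
    main = [sum(imf[j] for imf in imfs[n_noise:]) + residual[j]
            for j in range(length)]
    return main, noise

# Hypothetical decomposition of one station's series (illustrative values).
imfs = [
    [1.0, -1.0, 1.0, -1.0],   # fast, small-amplitude component
    [0.5, 0.5, -0.5, -0.5],   # slower component
]
residual = [10.0, 11.0, 12.0, 13.0]

main, noise = split_main_and_noise(imfs, residual, n_noise=1)
print(main, noise)
```

Repeating this for every station and placing the resulting series side by side as columns gives the main part and noise part matrices; the two parts always add back up to the original series.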
S2.3, the decomposed data are normalized so that the preprocessed data are confined to a fixed range, which avoids problems such as non-convergence caused by singular sample data during training;
this scheme adopts the min-max normalization method, with the following formula:
$Y = \dfrac{X - X_{\min}}{X_{\max} - X_{\min}}$
where $X$ is the data currently being normalized, $Y$ is the processed output data, $X_{\max}$ is the maximum of all data points, and $X_{\min}$ is the minimum of all data points;
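Min-max normalization as described in S2.3 can be sketched as follows. The function names are illustrative, and the handling of a constant series (returning zeros) and the inverse transform are assumptions added for practicality; the claim itself only defines the forward mapping.

```python
def min_max_normalize(values):
    """Scale values to [0, 1]; also return the extremes needed to invert."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # Assumption: a constant series maps to all zeros.
        return [0.0 for _ in values], x_min, x_max
    return [(x - x_min) / span for x in values], x_min, x_max

def min_max_denormalize(scaled, x_min, x_max):
    """Map normalized values back to the original scale."""
    return [y * (x_max - x_min) + x_min for y in scaled]

# Hypothetical passenger counts (illustrative only).
scaled, lo, hi = min_max_normalize([50.0, 100.0, 150.0, 250.0])
restored = min_max_denormalize(scaled, lo, hi)
print(scaled, restored)
```

Keeping the minimum and maximum makes it possible to convert model predictions back to passenger counts before computing errors in the original units.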
S2.4, after normalization, the passenger flow data, which are continuous in the time dimension, are converted into the supervised learning sequence shape accepted by the LSTM network;
the main part data matrix and the noise part data matrix are each converted into supervised learning sequence shapes and provided to S3 and S5.
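The conversion in S2.4 is commonly done with a sliding window, as in this minimal sketch. The function name and the window length are hypothetical; the sketch assumes that the n most recent values (time t-n to t-1) are used to predict the value at time t.

```python
def to_supervised(series, n_steps):
    """Turn a time series into (input window, next value) pairs."""
    samples, targets = [], []
    for end in range(n_steps, len(series)):
        samples.append(series[end - n_steps:end])  # values at t-n ... t-1
        targets.append(series[end])                # value at t
    return samples, targets

# Hypothetical normalized flow values for one station.
samples, targets = to_supervised([10, 20, 30, 40, 50], n_steps=3)
print(samples, targets)
```

A series of length L yields L - n_steps training pairs, so the window length trades off context against the number of available samples.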
4. The method for predicting the short-term passenger flow based on the CEEMDAN algorithm and the attention mechanism as claimed in claim 1, wherein the step S3 is as follows:
the hidden layer adds an attention mechanism to the CNN-LSTM model and removes the pooling layer to build a ConvLSTM-Attention model, which is trained separately on the main part data and the noise part data obtained from the input layer; the specific process is as follows:
first, spatial features of the two-dimensional matrix are extracted by a convolutional layer; the resulting sequence of spatial features along the time dimension is input into an LSTM network for temporal feature extraction; a Dropout layer hides part of the neurons to prevent overfitting; the output sequence of the LSTM network is input into the attention mechanism layer, which calculates a weight for each item in the sequence and multiplies the weight by that item; finally, a Flatten layer flattens the output matrix into a one-dimensional sequence that the output layer can receive;
a one-dimensional CNN is selected for spatial feature extraction, and the main part data matrix obtained in S2
Figure FDA0003612577680000041
and the noise part data matrix
Figure FDA0003612577680000042
are used as the separate inputs of the two models; the convolutional layer is implemented as follows:
Figure FDA0003612577680000043
wherein
Figure FDA0003612577680000044
is the input of the convolutional layer, namely
Figure FDA0003612577680000045
or
Figure FDA0003612577680000046
and
Figure FDA0003612577680000047
is the output of the convolutional layer, W is the weight obtained from model training, b is the bias obtained from model training, and σ is the ReLU activation function;
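A one-dimensional convolution followed by ReLU, with no pooling, can be sketched as below. The kernel values, the bias and the choice to slide the kernel across stations within one time interval are assumptions for illustration; in the described model W and b are learned during training. As in common deep learning libraries, the "convolution" here is a sliding dot product without kernel flipping.

```python
def relu(x):
    """ReLU activation: max(0, x)."""
    return x if x > 0 else 0.0

def conv1d_relu(row, kernel, bias):
    """Valid 1-D sliding dot product plus bias, followed by ReLU."""
    k = len(kernel)
    return [
        relu(sum(w * x for w, x in zip(kernel, row[i:i + k])) + bias)
        for i in range(len(row) - k + 1)
    ]

# Hypothetical flows of five stations at one time interval.
row = [3.0, 1.0, 4.0, 1.0, 5.0]
features = conv1d_relu(row, kernel=[1.0, -1.0], bias=0.5)
print(features)
```

Without a pooling layer, every position of the feature map is passed on, so the spatial detail extracted by the kernel is preserved for the LSTM stage.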
the matrix obtained after CNN feature extraction is $X_t = [x_{t-n}\ x_{t-(n-1)}\ \dots\ x_{t-1}]^T$; this matrix is the distribution of the spatial feature values over time, and it is used as the input of the LSTM network for model training;
the LSTM network comprises an input layer, an LSTM layer, a fully connected layer and an output layer; the LSTM layer comprises a forget gate, an input gate and an output gate, and the specific algorithm is as follows:
$i_t = \sigma(w_i \cdot [h_{t-1}, x_t] + b_i)$
$f_t = \sigma(w_f \cdot [h_{t-1}, x_t] + b_f)$
Figure FDA0003612577680000048
$o_t = \sigma(w_o \cdot [h_{t-1}, x_t] + b_o)$
Figure FDA0003612577680000049
$h_t = o_t * \tanh(C_t)$
where $i_t$ denotes the input gate calculation, $f_t$ and
Figure FDA00036125776800000410
denote the forget gate calculation, $o_t$ denotes the output gate calculation, $C_t$ and $h_t$ are the long-term and short-term memory parameters respectively, σ denotes the sigmoid activation function, and w and b denote the weights and biases respectively; $h_t$ is the final output value for each supervised learning sequence input into the model, and the weights and biases are the parameters learned by the model;
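One LSTM time step can be sketched as follows. The gate equations for the input, forget and output gates follow the text; the candidate state and the cell update use the standard LSTM formulation, which is assumed (not confirmed) to match the two equations shown only as figures. The parameter dictionary layout and all names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(weights, vector, bias):
    """Compute W . vector + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def lstm_step(x_t, h_prev, c_prev, params):
    """Advance the LSTM by one time step and return (h_t, C_t)."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    i_t = [sigmoid(v) for v in affine(params["w_i"], z, params["b_i"])]
    f_t = [sigmoid(v) for v in affine(params["w_f"], z, params["b_f"])]
    o_t = [sigmoid(v) for v in affine(params["w_o"], z, params["b_o"])]
    # Standard candidate state and cell update (assumed to match the figures).
    c_hat = [math.tanh(v) for v in affine(params["w_c"], z, params["b_c"])]
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Hypothetical parameters: hidden size 1, input size 1, all weights zero,
# so every gate evaluates to sigmoid(0) = 0.5.
zero = {"w": [[0.0, 0.0]], "b": [0.0]}
params = {f"{k}_{g}": zero[k] for k in ("w", "b") for g in ("i", "f", "o", "c")}
h_t, c_t = lstm_step(x_t=[1.0], h_prev=[0.0], c_prev=[2.0], params=params)
print(h_t, c_t)
```

With all gates at 0.5 and a zero candidate state, the cell simply halves its previous memory, which makes the roles of the forget and output gates easy to check by hand.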
the attention mechanism is implemented by adding a dense layer with a softmax activation function after the LSTM network; this fully connected layer takes the LSTM network output as input and calculates the corresponding weight matrix through the softmax activation function, so that the weight parameters are learned automatically during training; the softmax function is an activation function that maps neuron outputs to the interval (0, 1), and the mapping result can be regarded as a probability; the probability formula is as follows:
Figure FDA0003612577680000051
where $x_k$ is the element whose weight is currently being calculated, and $x_i$ are the historical data used in calculating the current prediction; in the LSTM network, $\beta_t$ is the weight corresponding to the data $h_t$ output at a given time step, and multiplying the weight by the data gives the enhanced output data $h_t'$, which is used as the input data for the next calculation; because the result of the LSTM network is a two-dimensional matrix formed from multiple input sequences, a Flatten layer is added after the LSTM network to flatten the two-dimensional matrix into a one-dimensional sequence that the output layer can receive.
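The softmax weighting and the subsequent flattening can be sketched as below. This is one plausible reading, stated as an assumption: each time step's hidden vector receives a single score from a linear scoring layer (`score_weights`, hypothetical and fixed here, learned in the real model), softmax is taken across time steps, and each hidden vector is scaled by its weight.

```python
import math

def softmax(scores):
    """Map scores to positive weights that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_reweight(hidden_states, score_weights, score_bias=0.0):
    """Score each h_t, convert scores to weights, and scale each h_t."""
    scores = [sum(w * h for w, h in zip(score_weights, h_t)) + score_bias
              for h_t in hidden_states]
    betas = softmax(scores)
    weighted = [[beta * h for h in h_t]
                for beta, h_t in zip(betas, hidden_states)]
    return betas, weighted

# Hypothetical LSTM outputs for three time steps, hidden size 2.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
betas, weighted = attention_reweight(hidden_states, score_weights=[1.0, 1.0])

# Flatten the two-dimensional result into one sequence for the output layer.
flat = [value for row in weighted for value in row]
print(betas, flat)
```

The third time step has the highest score, so it receives the largest weight and contributes most to the flattened sequence passed on to the output layer.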
5. The method for predicting the short-term passenger flow based on the CEEMDAN algorithm and the attention mechanism as claimed in claim 1, wherein the step S5 is as follows:
after each round of training, the loss function value is first calculated and the parameters are updated by the model optimization algorithm, so that the loss function value, and with it the prediction error, decreases round by round; the optimization algorithm selected in this scheme is the Adam optimizer, which converges more efficiently and mitigates problems such as gradient dips; the selected loss function is the mean square error (MSE), with the following formula:
Figure FDA0003612577680000052
where $N$ is the total number of input samples, $y_i$ is the target value, and
Figure FDA0003612577680000053
is the predicted value; the model parameters are updated by the Adam optimization algorithm to reduce the error function value until the model converges.
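The training loop in S5 can be illustrated on a deliberately tiny model. The sketch below fits a single slope parameter by minimizing MSE with the standard Adam update rule; the one-parameter model, the hyperparameter values and the data are assumptions chosen for readability and do not come from the patent, where the same loop would update all ConvLSTM-Attention parameters.

```python
import math

def mse(y_true, y_pred):
    """Mean square error: (1/N) * sum((y_i - y_hat_i)^2)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def fit_slope_with_adam(xs, ys, lr=0.05, beta1=0.9, beta2=0.999,
                        eps=1e-8, epochs=500):
    """Fit y = w * x by minimizing MSE with Adam; return w and the loss history."""
    w, m, v = 0.0, 0.0, 0.0
    history = []
    for step in range(1, epochs + 1):
        preds = [w * x for x in xs]
        history.append(mse(ys, preds))
        # Gradient of MSE with respect to w: (2/N) * sum((w*x - y) * x)
        grad = 2.0 * sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
        m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
        v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
        m_hat = m / (1 - beta1 ** step)             # bias correction
        v_hat = v / (1 - beta2 ** step)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, history

# Hypothetical data generated by y = 2x.
w, history = fit_slope_with_adam([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(w, history[0], history[-1])
```

The loss history starts at the error of the all-zero prediction and shrinks as the slope approaches the value that generated the data, mirroring the round-by-round reduction described above.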
CN202210434929.XA 2022-04-24 2022-04-24 Short-time passenger flow volume prediction method based on CEEMDAN algorithm and attention mechanism Pending CN114912666A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210434929.XA CN114912666A (en) 2022-04-24 2022-04-24 Short-time passenger flow volume prediction method based on CEEMDAN algorithm and attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210434929.XA CN114912666A (en) 2022-04-24 2022-04-24 Short-time passenger flow volume prediction method based on CEEMDAN algorithm and attention mechanism

Publications (1)

Publication Number Publication Date
CN114912666A true CN114912666A (en) 2022-08-16

Family

ID=82764719

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210434929.XA Pending CN114912666A (en) 2022-04-24 2022-04-24 Short-time passenger flow volume prediction method based on CEEMDAN algorithm and attention mechanism

Country Status (1)

Country Link
CN (1) CN114912666A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116050673A (en) * 2023-03-31 2023-05-02 深圳市城市交通规划设计研究中心股份有限公司 Urban public transport passenger flow short-time prediction method based on CNN-BiLSTM
CN116050673B (en) * 2023-03-31 2023-08-01 深圳市城市交通规划设计研究中心股份有限公司 Urban public transport passenger flow short-time prediction method based on CNN-BiLSTM
CN116842444A (en) * 2023-07-03 2023-10-03 海南大学 EEMD-CEEMDAN combined LSTM-based mixed time series data prediction method
CN117313043A (en) * 2023-10-25 2023-12-29 四川大学 Wind power generation power prediction method
CN117313043B (en) * 2023-10-25 2024-04-30 四川大学 Wind power generation power prediction method

Similar Documents

Publication Publication Date Title
CN109214575B (en) Ultrashort-term wind power prediction method based on small-wavelength short-term memory network
CN114912666A (en) Short-time passenger flow volume prediction method based on CEEMDAN algorithm and attention mechanism
CN109490814B (en) Metering automation terminal fault diagnosis method based on deep learning and support vector data description
CN109583565B (en) Flood prediction method based on attention model long-time and short-time memory network
CN111292525B (en) Traffic flow prediction method based on neural network
CN111563706A (en) Multivariable logistics freight volume prediction method based on LSTM network
CN113094357B (en) Traffic missing data completion method based on space-time attention mechanism
CN113905391B (en) Integrated learning network traffic prediction method, system, equipment, terminal and medium
CN110580543A (en) Power load prediction method and system based on deep belief network
CN111193256A (en) Power load prediction method based on variational modal decomposition and gated cyclic unit
CN112330951B (en) Method for realizing road network traffic data restoration based on generation of countermeasure network
Han et al. Network traffic prediction using variational mode decomposition and multi-reservoirs echo state network
CN109583588B (en) Short-term wind speed prediction method and system
CN116316591A (en) Short-term photovoltaic power prediction method and system based on hybrid bidirectional gating cycle
CN114548592A (en) Non-stationary time series data prediction method based on CEMD and LSTM
CN111553510A (en) Short-term wind speed prediction method
CN114548532A (en) VMD-based TGCN-GRU ultra-short-term load prediction method and device and electronic equipment
CN111141879B (en) Deep learning air quality monitoring method, device and equipment
CN115204035A (en) Generator set operation parameter prediction method and device based on multi-scale time sequence data fusion model and storage medium
CN115423145A (en) Photovoltaic power prediction method based on Kmeans-VMD-WT-LSTM method
CN115034430A (en) Carbon emission prediction method, device, terminal and storage medium
CN116052254A (en) Visual continuous emotion recognition method based on extended Kalman filtering neural network
CN113255366A (en) Aspect-level text emotion analysis method based on heterogeneous graph neural network
CN112508286A (en) Short-term load prediction method based on Kmeans-BilSTM-DMD model
CN116227716A (en) Multi-factor energy demand prediction method and system based on Stacking

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination