CN114912666A - Short-time passenger flow volume prediction method based on CEEMDAN algorithm and attention mechanism - Google Patents
- Publication number
- CN114912666A (application CN202210434929.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G06Q50/40—
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T90/00—Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation
Abstract
The invention belongs to the field of short-time passenger flow prediction for rail transit, and provides a short-time passenger flow prediction method using a CNN-LSTM fusion neural network based on complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and an attention mechanism (Attention). The method comprises the following steps: S1, data preprocessing; S2, input layer processing; S3, hidden layer processing; S4, output layer processing; and S5, model training. The method has strong peak prediction capability, accounts for the noise component in its prediction results, and is robust to irregularly shaped training data.
Description
Technical Field
The invention belongs to the field of short-time passenger flow prediction for rail transit, and particularly relates to a short-time passenger flow prediction method using a CNN-LSTM fusion neural network based on complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and an attention mechanism (Attention).
Background
Short-time passenger flow prediction for rail transit works on spans of several minutes: the number of passengers in a future time period is predicted from the historical passenger flow data of subway stations. Accurate short-time prediction gives subway stations early warning, so that staff can take specific measures to relieve the load on the station. In image recognition, convolutional neural networks (CNNs) are widely used because their inputs have a fixed scale and a large number of similar, non-essential pixel-to-neuron connections can be eliminated. In machine translation and speech recognition, the recurrent neural network (RNN) and its variant, the long short-term memory network (LSTM), use an internal state mechanism to learn from sequences and exploit temporal or ordering information from earlier and later positions, so that words and sentences can be predicted well. Deep learning is general with respect to the problem to be solved: a model can be trained as long as enough learning data is provided. For short-time passenger flow prediction, a suitable model can be built and trained on the historical passenger flow of a subway line in order to predict future changes in passenger flow. However, a single LSTM network makes poor use of the information in the original data and can extract only temporal features. Passenger flow data is two-dimensional in time and space, and existing methods extract spatial features by fusing in a CNN network, which yields some improvement in performance.
These methods still have certain disadvantages: (1) the peak prediction capability is insufficient; (2) the noise part cannot be predicted properly, so the prediction result is overly smooth and differs from the real data; and (3) robustness to a small amount of irregularly shaped original data is poor.
Complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is a signal decomposition algorithm proposed by Torres et al. on the basis of the empirical mode decomposition (EMD) algorithm and the ensemble empirical mode decomposition (EEMD) algorithm. The EMD algorithm decomposes a time-series signal step by step into several intrinsic mode functions (imf) and a residual term res. However, because of the mode aliasing problem, the time-frequency distribution can be wrong, so that the imf components are only formally correct and lose their real meaning. Wu and Huang proposed the EEMD algorithm, which addresses the mode aliasing caused by factors such as intermittent high-frequency components by introducing white noise. However, the amplitude of the introduced white noise cannot be quantified specifically, and mode aliasing is not overcome when its initial value is set incorrectly. Moreover, when the original signal is recovered, the introduced white noise cannot be eliminated effectively, which brings a large error. Unlike the EEMD algorithm, which directly adds whole white noise, the CEEMDAN algorithm adds white-noise components that have themselves been decomposed by EMD, which effectively solves the problem that too much white noise remains and cannot be removed when the decomposed signals are finally summed. During decomposition, the CEEMDAN algorithm takes the weighted average immediately after each imf component is obtained, which effectively prevents inaccurate imf components from introducing large errors into subsequent decomposition. In the prior art, the CEEMDAN algorithm has not been used for short-time passenger flow prediction of rail transit.
Disclosure of Invention
Aiming at the problems, on the basis of the existing CNN-LSTM model, the CEEMDAN algorithm is added to an input layer of the CNN-LSTM model to realize the separation of main data and noise data, the noise and the main data are respectively trained and predicted, and finally the two prediction results are integrated to achieve a more accurate passenger flow prediction result. In order to solve the problem of insufficient peak prediction capability, the invention adds an Attention mechanism aiming at time dimension on the basis of a CEEMDAN-CNN-LSTM model structure, reduces information loss by removing a pooling layer in a CNN network, finally realizes a CEEMDAN-ConvLSTM-Attention model, and improves passenger flow prediction performance.
Technical scheme
A short-time passenger flow prediction method based on a CEEMDAN algorithm and an attention mechanism is characterized by comprising the following steps:
S1, data preprocessing
First, the original data is preprocessed to obtain rail transit passenger flow data in the two dimensions of time and space. The original data consists of the transaction records that a rail transit system can collect. The preprocessing is as follows: the original data is cleaned and counted according to fields such as transaction time, the station where the transaction occurred, and transaction type. That is, for each station, transaction records for buses, ferries and the like are screened out and only subway station transaction records are kept; the total number of people entering the subway station within each fixed time interval is counted and taken as the passenger flow data of that station at the current time point.
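The time-space two-dimensional matrix obtained after the raw data preprocessing is as follows:
$$X = \begin{bmatrix} x_1^1 & x_2^1 & \cdots & x_m^1 \\ x_1^2 & x_2^2 & \cdots & x_m^2 \\ \vdots & \vdots & \ddots & \vdots \\ x_1^n & x_2^n & \cdots & x_m^n \end{bmatrix}$$
where $x_S^t$ is the passenger flow of station $S$ in time interval $t$, $S$ denotes the station index ranging from 1 to $m$, and $t$ denotes the time interval index ranging from 1 to $n$.
The passenger flow data of a certain subway station $k$ from time $t-n$ to the last time interval $t-1$ can be expressed as:
$$X_k = [x_k^{t-n}, x_k^{t-(n-1)}, \dots, x_k^{t-1}]^T$$
The passenger flow data of all subway stations in a certain time interval $i$ can be expressed as:
$$X^i = [x_1^i, x_2^i, \dots, x_m^i]$$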
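The counting described in S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record fields `time`, `station`, `mode` and `type`, the function name `build_flow_matrix`, and the choice of rows as time intervals and columns as stations are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime

import numpy as np


def build_flow_matrix(records, stations, start, n_intervals, minutes=10):
    """Count subway entries per station and per fixed time interval.

    Returns an (n_intervals, len(stations)) integer matrix.
    The record field names are hypothetical.
    """
    col = {s: j for j, s in enumerate(stations)}
    counts = Counter()
    for rec in records:
        # Screen out non-subway records (bus, ferry, ...) and non-entry transactions.
        if rec["mode"] != "subway" or rec["type"] != "entry":
            continue
        if rec["station"] not in col:
            continue
        offset = (rec["time"] - start).total_seconds()
        idx = int(offset // (minutes * 60))
        if 0 <= idx < n_intervals:
            counts[(idx, col[rec["station"]])] += 1
    flow = np.zeros((n_intervals, len(stations)), dtype=int)
    for (i, j), c in counts.items():
        flow[i, j] = c
    return flow
```

Each column of the returned matrix is then the time series of one station, which is the form the decomposition in S2 operates on.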
S2, input layer processing
At the input layer, the CEEMDAN algorithm decomposes the time-space two-dimensional passenger flow data into a main-part data matrix and a noise-part data matrix, and the training set and the test set are partitioned.
S2.1 CEEMDAN algorithm processing procedure
Each column of the original input matrix is regarded as a continuous-time signal X(t) and taken as the signal to be decomposed. The procedure is as follows:
(1) Normally distributed Gaussian white noise is added to the signal to be decomposed, where $X(t)$ is the original signal, $n_i(t)$ is normally distributed Gaussian white noise, $N$ is the number of times noise is added, and $\xi_0$ is the standard deviation of the noise:
$$X_i(t) = X(t) + \xi_0 n_i(t), \quad i = 1, 2, 3, \dots, N$$
(2) Each noise-added signal is decomposed using the EMD algorithm, yielding $N$ first-order imf components:
$$imf_i^1(t) = EMD(X_i(t))$$
(3) The average of all these imf components is taken as the first-order imf component of the CEEMDAN decomposition:
$$imf^1(t) = \frac{1}{N} \sum_{i=1}^{N} imf_i^1(t)$$
(4) The first-order residual term is calculated from the first-order imf component:
$$res_1(t) = X(t) - imf^1(t)$$
(5) The first-order residual term $res_1(t)$ is treated as a new signal and the above process is repeated, yielding a second-order imf component and a second-order residual. After white noise is introduced, the new input signal is:
$$res_1(t) + \xi_0 n_i(t), \quad i = 1, 2, 3, \dots, N$$
The second-order imf component after EMD decomposition is:
$$imf^2(t) = \frac{1}{N} \sum_{i=1}^{N} EMD(res_1(t) + \xi_0 n_i(t))$$
The second-order residual obtained by removing the second-order imf component is:
$$res_2(t) = res_1(t) - imf^2(t)$$
(6) For $k = 1, 2, \dots, K$, with $res_0(t) = X(t)$, the $k$-th residual is calculated as:
$$res_k(t) = res_{k-1}(t) - imf^k(t)$$
The obtained residual is taken as a new signal and step (5) is repeated, up to order $K$, until the residual can no longer be decomposed (it is a monotonic function or has no more than two extreme points). Let $res(t)$ be the final residual that cannot be decomposed.
(7) The final decomposition result is:
$$X(t) = \sum_{k=1}^{K} imf^k(t) + res(t)$$
where $res(t)$ is the residual remaining after the $K$ imf components of the CEEMDAN decomposition have been removed.
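The loop structure of the steps above (add noise realizations, extract one mode from each, average, subtract, repeat on the residual) can be sketched as follows. This is only a structural sketch: `moving_average_detail` is a deliberately simple stand-in for the EMD step and is not real EMD sifting, and the function names, `n_trials`, `xi0` and `max_order` are assumptions chosen for illustration.

```python
import numpy as np


def moving_average_detail(x, width=5):
    """Stand-in for 'first imf of EMD': the signal minus its moving average.

    NOT real EMD sifting; it only lets the sketch run without extra packages.
    """
    kernel = np.ones(width) / width
    padded = np.pad(x, width // 2, mode="edge")
    return x - np.convolve(padded, kernel, mode="valid")


def ceemdan_sketch(x, first_mode=moving_average_detail, n_trials=50,
                   xi0=0.05, max_order=4, seed=0):
    """Ensemble-average one mode per order and peel it off the residual."""
    rng = np.random.default_rng(seed)
    residual = np.asarray(x, dtype=float)
    imfs = []
    for _ in range(max_order):
        # Stop when the residual is monotonic and cannot be decomposed further.
        d = np.diff(residual)
        if np.all(d >= 0) or np.all(d <= 0):
            break
        # Steps (1)-(3): add N noise realizations, extract a mode from each, average.
        modes = [first_mode(residual + xi0 * rng.standard_normal(residual.size))
                 for _ in range(n_trials)]
        imf = np.mean(modes, axis=0)
        imfs.append(imf)
        # Steps (4)-(6): the residual becomes the new signal.
        residual = residual - imf
    return np.array(imfs), residual
```

By construction the extracted components and the final residual sum back to the input, which mirrors the final decomposition result in step (7). A real EMD routine would be passed in as `first_mode` in place of the stand-in.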
S2.2 The high-frequency, small-amplitude signals obtained by the decomposition are added together to form the noise-part data, and the smoother low-frequency signals are added together to form the main-part data. The time-series decomposition results of the individual stations are then combined into the main-part data matrix and the noise-part data matrix.
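The grouping in S2.2 can be sketched as below, assuming the imf components are ordered from highest to lowest frequency (the usual EMD ordering) and that the number of components treated as noise, here the hypothetical parameter `n_high`, is chosen by the practitioner.

```python
import numpy as np


def split_main_noise(imfs, residual, n_high):
    """Sum the first n_high imf components as noise; the rest plus the residual as main part."""
    imfs = np.asarray(imfs, dtype=float)
    noise = imfs[:n_high].sum(axis=0)
    main = imfs[n_high:].sum(axis=0) + np.asarray(residual, dtype=float)
    return main, noise
```

Applying this to the decomposition of each station's series and stacking the results column by column would give the two matrices; the main part and the noise part always add back to the original signal.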
S2.3 The decomposed data is normalized so that the preprocessed data is limited to a certain range, which avoids problems such as non-convergence caused by singular sample data during training.
Because passenger flow is distributed very unevenly over time (values are very large at peaks, and the flow is 0 at some night-time intervals), normalization is indispensable. This scheme adopts min-max normalization, with the following formula:
$$Y = \frac{X - X_{min}}{X_{max} - X_{min}}$$
where $X$ is the data currently being normalized, $Y$ is the processed output data, $X_{max}$ is the maximum of all data points, and $X_{min}$ is the minimum of all data points. After normalization, all data points take values in the interval $[0, 1]$.
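A minimal sketch of min-max normalization is shown below. The function name and the decision to return the minimum and maximum alongside the scaled data (so they can be reused later for inverse normalization) are choices made for this example.

```python
import numpy as np


def min_max_normalize(x):
    """Scale data into [0, 1]; also return the min and max for later inversion."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        raise ValueError("constant data cannot be min-max normalized")
    return (x - x_min) / (x_max - x_min), x_min, x_max
```

For example, the values 0, 50 and 200 map to 0, 0.25 and 1.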
S2.4 After normalization, the resulting passenger flow data, which is continuous in the time dimension, is converted into a supervised learning sequence shape that the LSTM network can accept.
The input data of the LSTM network must conform to the supervised learning sequence shape it requires, so the shape of the original data is changed according to the prediction step size chosen for training. The prediction step size is the maximum amount of historical passenger flow information that each target value can carry during training. Taking the input sequence $X_t = [x_1, x_2, \dots, x_n]$ as an example, with prediction step size $k$, the converted data format is:
$$\begin{bmatrix} x_1 & x_2 & \cdots & x_k & x_{k+1} \\ x_2 & x_3 & \cdots & x_{k+1} & x_{k+2} \\ \vdots & \vdots & & \vdots & \vdots \\ x_{n-k} & x_{n-k+1} & \cdots & x_{n-1} & x_n \end{bmatrix}$$
where each row holds $k$ consecutive historical values followed by the target value to be predicted.
The main-part data matrix and the noise-part data matrix are each converted into supervised learning sequence shapes and provided to S3 and S5.
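The conversion in S2.4 can be sketched as a sliding window. This assumes that the prediction step size `k` is the number of past values used as input for each target, which is one reading of the description; the function name `to_supervised` is hypothetical.

```python
import numpy as np


def to_supervised(series, k):
    """Turn a 1-D series into (inputs, targets): k past values predict the next one."""
    series = np.asarray(series, dtype=float)
    n = series.size
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < len(series)")
    inputs = np.array([series[i:i + k] for i in range(n - k)])
    targets = series[k:]
    return inputs, targets
```

With the series 1, 2, 3, 4, 5 and `k = 2`, the inputs are (1, 2), (2, 3), (3, 4) and the targets are 3, 4, 5.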
S3, hidden layer processing
According to the invention, the hidden layer adds an Attention mechanism on the basis of a CNN-LSTM model, removes a pooling layer, builds a ConvLSTM-Attention model, and respectively performs model training on main data and noise part data obtained by an input layer. The specific process is as follows:
First, the spatial features of the two-dimensional matrix are extracted by a convolutional layer, and the resulting sequence of spatial features over the time dimension is input into an LSTM network for temporal feature extraction. A Dropout layer hides part of the neurons to prevent overfitting (that is, the model converges on the training set but performs poorly on the test set). The output sequence of the LSTM network is then input into the attention mechanism layer, which calculates a weight for each item in the sequence and multiplies the weight by that item. Finally, a Flatten layer flattens the output matrix into a one-dimensional sequence that the output layer can receive.
The method uses a 1-dimensional CNN network for spatial feature extraction, and takes the main-part data matrix and the noise-part data matrix processed in S2 as the separate inputs of the two models. The convolutional layer is implemented as follows:
$$X_{out} = \sigma(W * X_{in} + b)$$
where $X_{in}$ is the input of the convolutional layer (the main-part or the noise-part data matrix), $X_{out}$ is the convolutional layer output, $*$ denotes convolution, $W$ is the weight obtained from model training, $b$ is the bias obtained from model training, and $\sigma$ is the ReLU activation function.
The matrix after CNN feature extraction is $X_t = [x_{t-n}, x_{t-(n-1)}, \dots, x_{t-1}]^T$. This matrix is the distribution of the spatial feature values over time, and it is used as the input of the LSTM network for model training.
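A one-filter version of the convolutional layer can be sketched as below. It assumes the input has one row per time interval and one column per station, and that the kernel slides across stations so that the time axis is left intact; the function name and the single-filter, no-padding setup are simplifications for illustration, not the patent's configuration.

```python
import numpy as np


def conv1d_relu(x, w, b):
    """Slide a kernel across the station axis of a (time, stations) matrix, then apply ReLU."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    width = w.size
    n_out = x.shape[1] - width + 1
    # One output column per kernel position; every time step (row) is treated alike.
    out = np.stack([x[:, j:j + width] @ w for j in range(n_out)], axis=1) + b
    return np.maximum(out, 0.0)
```

Because there is no pooling layer, the output keeps one feature per kernel position for every time step, so the temporal ordering passed on to the LSTM is unchanged.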
The long short-term memory network (LSTM) is a recurrent neural network model built on the LSTM concept proposed by Juergen Schmidhuber et al. The LSTM network includes an input layer, an LSTM layer, a fully connected layer, and an output layer. The LSTM layer includes a forget gate, an input gate, and an output gate, and the specific algorithm is as follows.
$$i_t = \sigma(w_i \cdot [h_{t-1}, x_t] + b_i)$$
$$f_t = \sigma(w_f \cdot [h_{t-1}, x_t] + b_f)$$
$$\tilde{C}_t = \tanh(w_c \cdot [h_{t-1}, x_t] + b_c)$$
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$
$$o_t = \sigma(w_o \cdot [h_{t-1}, x_t] + b_o)$$
$$h_t = o_t * \tanh(C_t)$$
where $i_t$ is the input gate calculation, $f_t$ is the forget gate calculation, $\tilde{C}_t$ is the candidate memory used to update $C_t$, $o_t$ is the output gate calculation, $C_t$ and $h_t$ are the long-term and short-term memory parameters respectively, $\sigma$ denotes the sigmoid activation function, $*$ denotes the element-wise matrix product, and $w$ and $b$ are weights and biases respectively. $h_t$ is the final output value for each supervised learning sequence input to the model, and the weights and biases are the parameters learned by the model.
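One time step of a standard LSTM cell can be sketched as follows. The candidate-memory and cell-state updates follow the common textbook formulation, and the parameter dictionary keys (`w_i`, `b_i`, and so on) are names invented for this example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """Compute (h_t, C_t) from the current input and the previous states."""
    z = np.concatenate([h_prev, x_t])                      # [h_{t-1}, x_t]
    i_t = sigmoid(params["w_i"] @ z + params["b_i"])       # input gate
    f_t = sigmoid(params["w_f"] @ z + params["b_f"])       # forget gate
    o_t = sigmoid(params["w_o"] @ z + params["b_o"])       # output gate
    c_hat = np.tanh(params["w_c"] @ z + params["b_c"])     # candidate memory
    c_t = f_t * c_prev + i_t * c_hat                       # long-term memory
    h_t = o_t * np.tanh(c_t)                               # short-term memory / output
    return h_t, c_t
```

Running this step over each row of a supervised learning sequence, carrying `h_t` and `c_t` forward, produces the sequence of outputs that the attention layer then weights.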
In neural networks, attention is quantified as specific weight values. The attention mechanism of the invention is implemented by adding a dense (fully connected) layer with a softmax activation function after the LSTM network. This fully connected layer takes the LSTM network output as input and calculates the corresponding weight matrix through the softmax activation function, so that the weight parameters are learned automatically during training. The softmax function is an activation function that maps neuron outputs to the interval (0, 1), and the mapping result can be regarded as a probability. The formula is as follows.
$$\beta_k = \frac{e^{x_k}}{\sum_{i} e^{x_i}}$$
where $x_k$ is the element whose weight is currently being calculated, and $x_i$ ranges over the historical data used in calculating the current prediction. In the LSTM network, $\beta_t$ is the weight corresponding to the data $h_t$ output at a given time; multiplying the weight by the data gives the enhanced output $h_t' = \beta_t h_t$, which is taken as the input data for the next calculation. Because the result of the LSTM network is a two-dimensional matrix formed by multiple input sequences, a Flatten layer is added after the LSTM network to flatten the two-dimensional matrix into a one-dimensional sequence that the output layer can receive.
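The weighting described above can be sketched as a small dense layer followed by softmax over the time steps. The projection vector `w` and bias `b` stand for the learned parameters of that dense layer; their shapes and the function names are assumptions for this example.

```python
import numpy as np


def softmax(x):
    e = np.exp(x - np.max(x))   # subtract the max for numerical stability
    return e / e.sum()


def temporal_attention(h, w, b=0.0):
    """Weight each LSTM output h_t by a softmax score computed over time steps.

    h: (time_steps, features); w: (features,) hypothetical learned projection.
    """
    h = np.asarray(h, dtype=float)
    scores = h @ np.asarray(w, dtype=float) + b
    beta = softmax(scores)                 # one weight per time step, summing to 1
    return beta[:, None] * h, beta
```

The weighted outputs would then be flattened (for example with `ravel()`) before the final fully connected layer.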
S4, output layer processing
At the output layer, a fully connected layer receives the output of the hidden layer and outputs the prediction result, which is the prediction for each time point under the current model parameters.
S5, model training
In each iteration process, error calculation is carried out on the prediction result output by the output layer and the real sequence, and model parameters are updated through an optimization algorithm.
After each round of training, the loss function value is calculated first, and the parameters are then updated by the model optimization algorithm, so that the loss function value decreases round by round and the prediction error is reduced. The optimization algorithm chosen in this scheme is the Adam optimizer, which converges efficiently and helps overcome gradient-related problems. The selected loss function is the mean square error (MSE), with the following formula:
$$MSE = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$$
where $N$ is the total number of input samples, $y_i$ is the target value, and $\hat{y}_i$ is the predicted value. The model parameters are updated by the Adam optimization algorithm to reduce the error function value until the model converges.
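The training loop idea (compute the MSE, take a gradient, apply an Adam update) can be sketched on a toy problem. The linear model in `fit_linear` is only a stand-in for the network, and the learning rate and step count are arbitrary illustrative values; the Adam update itself follows the standard formulation with bias-corrected moment estimates.

```python
import numpy as np


def mse(y_true, y_pred):
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(diff ** 2))


def adam_step(theta, grad, state, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; state holds the step count and both moment estimates."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps)


def fit_linear(x, y, steps=500):
    """Toy stand-in for the network: fit y = a*x + c by minimizing the MSE with Adam."""
    theta = np.zeros(2)
    state = {"t": 0, "m": np.zeros(2), "v": np.zeros(2)}
    for _ in range(steps):
        err = theta[0] * x + theta[1] - y
        grad = np.array([2 * np.mean(err * x), 2 * np.mean(err)])  # d(MSE)/d(theta)
        theta = adam_step(theta, grad, state, lr=0.05)
    return theta
```

In the method described here, the same loss would be computed separately for the main-part model and the noise-part model, since the two are trained independently.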
Model testing
The test set data is input into the trained models for prediction, and the prediction results of the two models are combined: the noise prediction and the main-part prediction at the same time point are added to obtain the final passenger flow prediction. The prediction results are then inverse-normalized, the MAE and RMSE errors are calculated, and the prediction performance of the models is compared. The inverse normalization formula is as follows:
$$Y = \hat{Y} (X_{max} - X_{min}) + X_{min}$$
where $Y$ is the result after inverse normalization, $\hat{Y}$ is the current predicted value, and $X_{max}$ and $X_{min}$ are the maximum and minimum values of the input data used during normalization, respectively.
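Inverse normalization and the two error measures can be sketched as below. The function names are chosen for this example; `x_min` and `x_max` are assumed to be the values recorded when the data was normalized.

```python
import numpy as np


def denormalize(y_norm, x_min, x_max):
    """Undo min-max normalization using the min and max recorded at normalization time."""
    return np.asarray(y_norm, dtype=float) * (x_max - x_min) + x_min


def mae(y_true, y_pred):
    return float(np.mean(np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))))


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)) ** 2)))
```

RMSE penalizes large deviations (such as missed peaks) more heavily than MAE, which is why reporting both gives a fuller picture of peak prediction quality.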
The invention has the beneficial effects that:
(1) the CEEMDAN-ConvLSTM-Attention model designed by the invention greatly improves the accuracy of the short-time passenger flow prediction result. The ConvLSTM fusion model makes full use of time and space characteristics in the original data of subway passenger flow, so that the accuracy of a prediction result is far higher than that of a single LSTM network prediction result.
(2) The method integrates an attention mechanism aiming at the time dimension, solves the problem that the traditional model is insufficient in peak value prediction capability when a short-time passenger flow prediction task is realized, and improves the practical significance of a prediction result.
(3) The model provided by the invention is integrated with a CEEMDAN algorithm, so that noise data and main data are respectively predicted, and the problems that the prediction result is smooth and the real situation cannot be well fitted at the data fluctuation position in the past are effectively solved.
Drawings
FIG. 1 is a diagram of the CEEMDAN-CNN-LSTM-Attention model structure
FIG. 2 shows the decomposition results of CEEMDAN algorithm
FIG. 3 shows the prediction results of CEEMDAN-ConvLSTM-Attention model
FIG. 4 shows the predicted results of the CNN-LSTM model
Detailed Description
The technical solutions provided in the present application will be further described with reference to the following specific embodiments and accompanying drawings. The advantages and features of the present application will become more apparent in conjunction with the following description.
As shown in fig. 1, a short-time passenger flow prediction method based on a CEEMDAN algorithm and an attention mechanism is characterized by comprising the following steps:
S1, data preprocessing
S2, input layer processing
At the input layer, the CEEMDAN algorithm decomposes the time-space two-dimensional passenger flow data into a main-part data matrix and a noise-part data matrix, and the training set and the test set are partitioned.
S3, hidden layer processing
The hidden layer adds an Attention mechanism on the basis of the CNN-LSTM model, removes a pooling layer, establishes a ConvLSTM-Attention model, and respectively performs model training on main data and noise part data obtained by the input layer.
S4, output layer processing
At the output layer, a fully connected layer receives the output of the hidden layer and outputs the prediction result, which is the prediction for each time point under the current model parameters.
S5, model training
In each iteration process, error calculation is carried out on the prediction result output by the output layer and the real sequence, and model parameters are updated through an optimization algorithm.
Example 1
In this embodiment, a data set of backrush rail transit transaction records from 2015 is used as the original data. Through processing such as data cleaning, two-dimensional (space and time) data for multiple subway stations at 10-minute intervals is obtained, and the data is divided into a training set and a test set.
During the actual experiments, considering that an LSTM model predicts smooth signals better, the invention adds the high-frequency signals together as the noise component and adds the other components together as the trend term, so that the main shape of the passenger flow signal is not lost through excessive decomposition. The specific decomposition formula is:
$$X = \sum imf_{High} + \sum imf_i$$
In step S2, the decomposition result of the CEEMDAN algorithm is shown in Fig. 2, where signal is the original passenger flow data of a single day, IMF1 is the high-frequency noise-part data obtained by the decomposition, and IMF2 is the main-part data obtained by the decomposition. In the actual decomposition, the data is not decomposed day by day; instead, the whole data set is decomposed at once, so that the continuity of the data is not broken. The decomposed data is then split along continuous time, which avoids unnecessary information loss and data alignment errors. The same splitting method is used for the working-day data set and the all-dates data set, which keeps the comparison fair.
In the CNN-LSTM module, a one-dimensional CNN network is selected and the pooling layer of the CNN network is removed to extract the spatial characteristics of the passenger flow data. Since the passenger flow volume data is the distribution of the passenger flow volumes of a plurality of subway stations in time, the effect of extracting the spatial features among the passenger flow volumes of the stations without damaging the time features can be achieved by using the one-dimensional CNN network. And taking the output result of the CNN network as the input of the LSTM network to extract the time characteristics, thereby achieving the purpose of extracting the time and space characteristics. And a softmax layer is added behind the LSTM network to serve as an attention mechanism layer, and the passenger flow prediction result is further optimized by automatically learning the weight of each time point through each training turn.
As can be seen by comparing Fig. 3 with Fig. 4, the prediction of the single CNN-LSTM network is poor at the peaks, whereas the CEEMDAN-ConvLSTM-Attention model predicts the peaks better and its results fit the real data more closely. The following table compares the prediction error of the CEEMDAN-ConvLSTM-Attention model with those of other models. The error measures are the mean absolute error (MAE) and the root mean square error (RMSE), and the prediction results are compared on both the working-day data set and the all-dates data set.
The above description is only illustrative of the preferred embodiments of the present application and is not intended to limit the scope of the present application in any way. Any changes or modifications made by those skilled in the art based on the above disclosure should be considered as equivalent effective embodiments, and all the changes or modifications should fall within the protection scope of the technical solution of the present application.
Claims (5)
1. A short-time passenger flow prediction method based on a CEEMDAN algorithm and an attention mechanism is characterized by comprising the following steps:
S1, data preprocessing
S2, input layer processing
Performing data decomposition on the time-space two-dimensional passenger flow volume data by using the CEEMDAN algorithm on the input layer to obtain a main part data matrix and a noise part data matrix, and dividing the data into a training set and a test set;
S3, hidden layer processing
The hidden layer adds an Attention mechanism on the basis of the CNN-LSTM model, removes the pooling layer, establishes a ConvLSTM-Attention model, and performs model training separately on the main part data and the noise part data obtained by the input layer;
S4, output layer processing
Receiving the output of the hidden layer by using a fully connected layer on the output layer and outputting a prediction result, wherein the prediction result is the prediction result of each time point under the current model parameters;
S5, model training
In each iteration process, error calculation is carried out on the prediction result output by the output layer and the real sequence, and model parameters are updated through an optimization algorithm.
2. The method for predicting short-term passenger flow based on the CEEMDAN algorithm and the attention mechanism as claimed in claim 1, wherein the step S1 is as follows:
preprocessing original data to obtain rail transit passenger flow volume data based on two dimensions of time and space, wherein the original data are transaction records which can be collected by a rail transit system;
the preprocessing process is as follows: the original data are cleaned and counted according to fields such as transaction time, the station where the transaction occurred, and transaction type; that is, transaction records for buses, ferries and the like are screened out station by station so that only the transaction records of subway stations are kept, the total number of people entering each subway station within a fixed time interval is counted, and the total number of people entering each subway station within each time interval is taken as the passenger flow data of that subway station at the current time point;
the time-space two-dimensional matrix obtained after the raw data preprocessing is as follows:
wherein S represents a site index ranging from 1 to m, and t represents a time interval index ranging from 1 to n; the passenger flow data of a certain subway station k from the time t-n to the last time interval t-1 can be expressed as:
the passenger flow data of all subway stations at a certain time interval i can be represented as:
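A minimal Python sketch of the preprocessing in step S1 is given here for illustration. The record field names (`time`, `station`, `mode`, `type`), their values, and the orientation of the result (rows are time intervals, columns are stations) are assumptions made for the example and are not specified in the claims.

```python
from collections import Counter
from datetime import datetime, timedelta

def build_flow_matrix(records, stations, start, interval, n_intervals):
    """Count subway entries per station and per fixed time interval.

    Returns n_intervals rows with one count per station
    (rows = time intervals, columns = stations; assumed orientation).
    """
    counts = Counter()
    for rec in records:
        # Keep only subway entry transactions; bus, ferry, etc. are dropped.
        if rec["mode"] != "subway" or rec["type"] != "entry":
            continue
        if rec["station"] not in stations:
            continue
        idx = (rec["time"] - start) // interval  # index of the time interval
        if 0 <= idx < n_intervals:
            counts[(idx, rec["station"])] += 1
    return [[counts[(i, s)] for s in stations] for i in range(n_intervals)]

# Made-up transaction records (illustrative only).
day = datetime(2022, 1, 3, 8, 0)
records = [
    {"time": day + timedelta(minutes=1), "station": "A", "mode": "subway", "type": "entry"},
    {"time": day + timedelta(minutes=3), "station": "A", "mode": "subway", "type": "entry"},
    {"time": day + timedelta(minutes=4), "station": "B", "mode": "bus", "type": "entry"},
    {"time": day + timedelta(minutes=12), "station": "B", "mode": "subway", "type": "entry"},
    {"time": day + timedelta(minutes=13), "station": "A", "mode": "subway", "type": "exit"},
]
matrix = build_flow_matrix(records, ["A", "B"], day, timedelta(minutes=10), 2)
```

In this toy run the bus record and the exit record are discarded, so only the three subway entries contribute to the counts.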
3. The method for predicting the short-term passenger flow based on the CEEMDAN algorithm and the attention mechanism as claimed in claim 1, wherein the step S2 is as follows:
S2.1, CEEMDAN algorithm processing procedure
Each column of the original input matrix is treated as a continuous-time signal $X(t)$ and used as the signal to be decomposed; the processing procedure is as follows:
(1) Gaussian white noise is added to the signal to be decomposed, where $X(t)$ is the original signal, $n_i(t)$ is Gaussian white noise following a normal distribution, $N$ is the number of times noise is added, and $\xi_0$ is the standard deviation of the noise:
$$X_i(t) = X(t) + \xi_0 n_i(t), \quad i = 1, 2, 3, \ldots, N$$
(2) each noise-added signal is decomposed using the EMD algorithm, yielding $N$ first-order imf components:
$$imf_i^1(t) = \mathrm{EMD}(X_i(t))$$
(3) the average of all these imf components is taken as the first-order imf component of the CEEMDAN decomposition:
$$\overline{imf}_1(t) = \frac{1}{N} \sum_{i=1}^{N} imf_i^1(t)$$
(4) the first-order residual term is calculated by removing the first-order imf component:
$$res_1(t) = X(t) - \overline{imf}_1(t)$$
(5) the first-order residual term $res_1(t)$ is taken as a new signal and the above process is repeated to obtain the second-order imf component and the second-order residual; after white noise is introduced, the new input signal is:
$$res_1(t) + \xi_0 n_i(t), \quad i = 1, 2, 3, \ldots, N$$
the second-order imf component after EMD decomposition is:
$$\overline{imf}_2(t) = \frac{1}{N} \sum_{i=1}^{N} \mathrm{EMD}\big(res_1(t) + \xi_0 n_i(t)\big)$$
the second-order residual obtained by removing the second-order imf component is:
$$res_2(t) = res_1(t) - \overline{imf}_2(t)$$
(6) for $k = 1, 2, \ldots, K$, the $k$-th residual is calculated as:
$$res_k(t) = res_{k-1}(t) - \overline{imf}_k(t), \quad res_0(t) = X(t)$$
step (5) is repeated with each resulting residual as the new signal, up to order $K$, until the resulting residual can no longer be decomposed (it is a monotonic function or has no more than two extreme points); the final residual that cannot be decomposed is denoted $res(t)$;
(7) the final decomposition result is:
$$X(t) = \sum_{k=1}^{K} \overline{imf}_k(t) + res(t)$$
where $res(t)$ is the residual remaining after the $K$ imf components have been extracted by the CEEMDAN decomposition;
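The iteration in steps (1) to (7) can be illustrated with the following Python sketch. It is a sketch of the noise-assisted averaging loop only: the function `first_mode_stand_in` is a hypothetical placeholder (signal minus a 3-point moving average) and is not the EMD sifting procedure, and the parameter values are arbitrary assumptions.

```python
import math
import random

def first_mode_stand_in(signal):
    """Placeholder for 'first imf from EMD': signal minus its 3-point
    moving average. This is NOT real EMD; it only isolates fast fluctuations."""
    n = len(signal)
    out = []
    for j in range(n):
        window = signal[max(0, j - 1):min(n, j + 2)]
        out.append(signal[j] - sum(window) / len(window))
    return out

def count_extrema(signal):
    """Number of interior local maxima and minima."""
    triples = zip(signal, signal[1:], signal[2:])
    return sum(1 for a, b, c in triples if (b - a) * (c - b) < 0)

def ceemdan_sketch(x, n_trials=20, noise_std=0.05, max_order=6, seed=0):
    """Add noise n_trials times, average the first modes, subtract, repeat."""
    rng = random.Random(seed)
    residual = list(x)
    imfs = []
    for _ in range(max_order):
        # Stop when the residual is monotonic or has at most two extreme points.
        if count_extrema(residual) <= 2:
            break
        total = [0.0] * len(x)
        for _ in range(n_trials):
            noisy = [r + noise_std * rng.gauss(0.0, 1.0) for r in residual]
            mode = first_mode_stand_in(noisy)
            total = [t + m for t, m in zip(total, mode)]
        imf_k = [t / n_trials for t in total]                # averaged k-th component
        residual = [r - m for r, m in zip(residual, imf_k)]  # k-th residual
        imfs.append(imf_k)
    return imfs, residual

# Synthetic test signal: an oscillation on top of a slow trend.
signal = [math.sin(0.8 * j) + 0.05 * j for j in range(60)]
imfs, res = ceemdan_sketch(signal)
reconstructed = [sum(imf[j] for imf in imfs) + res[j] for j in range(len(signal))]
```

Because each residual is obtained by subtracting the averaged component from the previous residual, the components and the final residual always add back up to the original signal, whatever mode extractor is plugged in.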
S2.2, the high-frequency, small-amplitude signals obtained by the decomposition are added together to obtain the noise part data, and the smoother low-frequency signals are added together to obtain the main part data; the time-series decomposition results of the plurality of stations are then combined to obtain the main part data matrix and the noise part data matrix, as follows:
S2.3, the decomposed data are normalized so that they are limited to a certain range, which avoids problems such as non-convergence caused by singular sample data during training;
the scheme adopts the min-max normalization method, and the specific formula is as follows:
$$Y = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$
wherein $X$ is the data currently being normalized, $Y$ is the processed output data, $X_{\max}$ is the maximum of all data points, and $X_{\min}$ is the minimum of all data points;
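Min-max normalization as described in S2.3 can be sketched in Python as follows. The handling of a constant series and the inverse function are assumptions added for the example; the text does not specify them.

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input would divide by zero; returning zeros is a choice
        # made for this sketch and is not specified in the text.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def min_max_restore(scaled, lo, hi):
    """Invert the scaling to recover passenger counts from model outputs."""
    return [s * (hi - lo) + lo for s in scaled]

flows = [20, 60, 100]  # made-up passenger counts
scaled = min_max_normalize(flows)
restored = min_max_restore(scaled, min(flows), max(flows))
```

In practice the minimum and maximum would be kept so that predictions made on the normalized scale can be mapped back to passenger counts.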
S2.4, after normalization, the passenger flow volume data, which are continuous in the time dimension, are converted into a supervised learning sequence shape acceptable to the LSTM network;
the main part data matrix and the noise part data matrix are each converted into the supervised learning sequence shape and provided to S3 and S5.
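The conversion in S2.4 is commonly done with a sliding window, as in the following Python sketch. The window length, the chronological split and the split ratio are assumptions chosen for illustration, not values taken from the claims.

```python
def to_supervised(series, n_steps):
    """Slide a window of n_steps over the series.

    Each sample uses n_steps consecutive values as input and the value
    that immediately follows them as the target.
    """
    inputs, targets = [], []
    for start in range(len(series) - n_steps):
        inputs.append(series[start:start + n_steps])
        targets.append(series[start + n_steps])
    return inputs, targets

def split_train_test(inputs, targets, train_ratio=0.8):
    """Chronological split: earlier samples train, later samples test."""
    cut = int(len(inputs) * train_ratio)
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])

series = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]  # made-up normalized flows
X, y = to_supervised(series, n_steps=3)
(train_X, train_y), (test_X, test_y) = split_train_test(X, y, train_ratio=0.75)
```

The same function works when each element of `series` is a whole row of per-station values, in which case every input sample is a small time-by-station matrix.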
4. The method for predicting the short-term passenger flow based on the CEEMDAN algorithm and the attention mechanism as claimed in claim 1, wherein the step S3 is as follows:
the hidden layer adds an Attention mechanism on the basis of the CNN-LSTM model, removes a pooling layer, builds a ConvLSTM-Attention model, and respectively performs model training on main data and noise part data obtained by an input layer; the specific process is as follows:
firstly, extracting spatial features of a two-dimensional matrix through a convolutional layer, then inputting an obtained spatial feature sequence of a time dimension into an LSTM network for time feature extraction, hiding part of neurons through a Dropout layer to prevent an overfitting phenomenon, inputting an output sequence of the LSTM network into an attention mechanism layer to calculate a weight value of each data in the sequence, multiplying the weight value with the data, and finally flattening the output matrix into a one-dimensional sequence which can be received by an output layer through a Flatten layer;
a 1-dimensional CNN is selected for spatial feature extraction, and the main part data matrix and the noise part data matrix obtained in S2 are used as the separate inputs of the two models; the convolutional layer is implemented as follows:
$$\text{output} = \sigma(W * \text{input} + b)$$
wherein the input of the convolutional layer is the main part data matrix or the noise part data matrix, the output is the output of the convolutional layer, $W$ is the weight obtained from model training, $b$ is the bias obtained from model training, and $\sigma$ is the ReLU activation function;
the matrix after CNN feature extraction is $X_t = [x_{t-n} \; x_{t-(n-1)} \; \ldots \; x_{t-1}]^T$, which is the distribution of the spatial feature values over time, and this feature matrix is used as the input of the LSTM network for model training;
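As an illustration of a convolutional layer without pooling, the following Python sketch applies a single one-dimensional filter with a ReLU activation. Sliding the filter along the station axis of each time step is an assumption made for this example, and the kernel values are made up; a trained model would learn them.

```python
def relu(z):
    """ReLU activation: negative values are clipped to zero."""
    return max(0.0, z)

def conv1d_no_pooling(rows, kernel, bias):
    """Apply one 1-D filter along the station axis of every time step.

    rows: list of time steps, each a list of per-station values.
    Returns one feature row per time step, so the time axis keeps its
    length because no pooling layer follows the convolution.
    (As in deep learning frameworks, the filter is not flipped.)
    """
    k = len(kernel)
    features = []
    for row in rows:
        out = [relu(sum(w * x for w, x in zip(kernel, row[j:j + k])) + bias)
               for j in range(len(row) - k + 1)]
        features.append(out)
    return features

rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # 2 time steps, 3 stations
features = conv1d_no_pooling(rows, kernel=[0.5, 0.5], bias=0.0)
clipped = conv1d_no_pooling(rows, kernel=[1.0, -1.0], bias=0.0)
```

The output has as many rows as the input, which is the property that lets the LSTM network consume the features as a time sequence.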
the LSTM network comprises an input layer, an LSTM layer, a fully connected layer and an output layer; the LSTM layer comprises a forget gate, an input gate and an output gate, and the specific algorithm is as follows:
$$i_t = \sigma(w_i \cdot [h_{t-1}, x_t] + b_i)$$
$$f_t = \sigma(w_f \cdot [h_{t-1}, x_t] + b_f)$$
$$o_t = \sigma(w_o \cdot [h_{t-1}, x_t] + b_o)$$
$$\tilde{C}_t = \tanh(w_C \cdot [h_{t-1}, x_t] + b_C)$$
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$
$$h_t = o_t * \tanh(C_t)$$
wherein $i_t$ represents the input gate calculation, $f_t$ represents the forget gate calculation, $\tilde{C}_t$ is the candidate memory that enters the update of $C_t$, $o_t$ represents the output gate calculation, $C_t$ and $h_t$ are the long-term and short-term memory parameters respectively, $\sigma$ denotes the sigmoid activation function, $w$ and $b$ denote the weights and biases respectively, and $h_t$ is the final output value of each supervised learning sequence after it is input into the model; the weights and biases are parameters learned by the model;
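A single LSTM time step built from these gate equations can be sketched in Python as follows. The candidate-memory and cell-state update lines use the standard LSTM form, which is an assumption here, and the parameter names (`w_i`, `b_i`, `w_c` and so on) are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(w, b, v):
    """Compute w . v + b for a weight matrix given as a list of rows."""
    return [sum(wj * vj for wj, vj in zip(row, v)) + bi for row, bi in zip(w, b)]

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; p holds weights w_* and biases b_* (hypothetical names)."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    i_t = [sigmoid(a) for a in affine(p["w_i"], p["b_i"], z)]  # input gate
    f_t = [sigmoid(a) for a in affine(p["w_f"], p["b_f"], z)]  # forget gate
    o_t = [sigmoid(a) for a in affine(p["w_o"], p["b_o"], z)]  # output gate
    # Candidate memory and cell-state update: standard LSTM form (assumed).
    g_t = [math.tanh(a) for a in affine(p["w_c"], p["b_c"], z)]
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]  # h_t = o_t * tanh(C_t)
    return h_t, c_t

# Hidden size 1, input size 2, all parameters zero: every gate evaluates to 0.5.
zero = {k: [[0.0, 0.0, 0.0]] for k in ("w_i", "w_f", "w_o", "w_c")}
zero.update({k: [0.0] for k in ("b_i", "b_f", "b_o", "b_c")})
h_t, c_t = lstm_step([0.3, 0.7], h_prev=[0.0], c_prev=[1.0], p=zero)
```

With all parameters set to zero the forget gate keeps exactly half of the previous cell state, which makes the step easy to check by hand.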
the attention mechanism is implemented by adding a dense (fully connected) layer whose activation function is softmax after the LSTM network; this fully connected layer takes the LSTM network output as input and calculates the corresponding weight matrix through the softmax activation function, so that the weight parameters are learned automatically during training; the softmax function is an activation function that maps neuron outputs to the interval (0, 1), and the mapping result can be regarded as a probability; the probability formula is as follows:
$$\mathrm{softmax}(x_k) = \frac{e^{x_k}}{\sum_{i} e^{x_i}}$$
wherein $x_k$ is the element whose weight is currently being calculated, and $x_i$ are the historical data used in calculating the current prediction data; in the LSTM network, $\beta_t$ is the weight corresponding to the data $h_t$ output at time $t$, and multiplying the weight by the data gives the enhanced output data $h_t'$, which is used as the input data for the next calculation; because the result of LSTM network training is a two-dimensional matrix formed by a plurality of input sequences, a Flatten layer is added after the LSTM network to flatten the two-dimensional matrix into a one-dimensional sequence that can be received by the output layer.
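The softmax-based weighting of the LSTM outputs can be sketched in Python as follows. The dense-layer parameters `dense_w` and `dense_b` are hypothetical stand-ins for values that would be learned during training, and an identity matrix is used here only to keep the example easy to follow.

```python
import math

def softmax(scores):
    """Map scores to positive weights that sum to 1."""
    peak = max(scores)  # subtracting the maximum improves numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weighting(h, dense_w, dense_b):
    """Dense layer with softmax activation, then element-wise re-weighting.

    h: LSTM outputs, one value per time point.
    dense_w, dense_b: dense-layer parameters (hypothetical, learned in training).
    """
    scores = [sum(w * v for w, v in zip(row, h)) + b
              for row, b in zip(dense_w, dense_b)]
    beta = softmax(scores)                               # weight of each time point
    enhanced = [b_t * h_t for b_t, h_t in zip(beta, h)]  # h_t' = beta_t * h_t
    return beta, enhanced

h = [0.2, 0.5, 0.9]  # made-up LSTM outputs for three time points
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
beta, enhanced = attention_weighting(h, identity, [0.0, 0.0, 0.0])
```

Time points with larger scores receive larger weights, so their outputs contribute more to the values passed on to the Flatten layer and the output layer.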
5. The method for predicting the short-term passenger flow based on the CEEMDAN algorithm and the attention mechanism as claimed in claim 1, wherein the step S5 is as follows:
after each round of training is finished, the loss function value is first calculated and the parameters are updated through the model optimization algorithm, so that the loss function value, and with it the prediction error, decreases round by round; the optimization algorithm selected by the scheme is the Adam optimizer, which converges more effectively and can address problems such as gradient dip; the selected loss function is the mean square error (MSE), and the specific formula is as follows:
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
where $y_i$ is the real value, $\hat{y}_i$ is the predicted value, and $n$ is the number of samples.
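The training loop of step S5 can be illustrated with the following Python sketch, which pairs an MSE loss with the standard Adam update rule. The single-parameter toy model, the data and the hyperparameter values are assumptions for the example and are not the settings used in the scheme.

```python
import math

def mse(y_true, y_pred):
    """Mean square error between real and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def adam_step(theta, grad, m, v, step, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** step)           # bias correction
    v_hat = v / (1 - beta2 ** step)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Toy model y = w * x fitted to made-up data whose true slope is 2.
xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
w, m, v = 0.0, 0.0, 0.0
initial_loss = mse(ys, [w * x for x in xs])
for step in range(1, 201):
    preds = [w * x for x in xs]
    # Gradient of the MSE loss with respect to w.
    grad = sum(2 * (p - t) * x for p, t, x in zip(preds, ys, xs)) / len(xs)
    w, m, v = adam_step(w, grad, m, v, step)
final_loss = mse(ys, [w * x for x in xs])
```

Each iteration computes the loss gradient from the current predictions and lets Adam rescale the step by its running moment estimates, so the loss after training is lower than the loss at initialization.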
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210434929.XA CN114912666A (en) | 2022-04-24 | 2022-04-24 | Short-time passenger flow volume prediction method based on CEEMDAN algorithm and attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114912666A (en) | 2022-08-16 |
Family
ID=82764719
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210434929.XA Pending CN114912666A (en) | 2022-04-24 | 2022-04-24 | Short-time passenger flow volume prediction method based on CEEMDAN algorithm and attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114912666A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116050673A (en) * | 2023-03-31 | 2023-05-02 | 深圳市城市交通规划设计研究中心股份有限公司 | Urban public transport passenger flow short-time prediction method based on CNN-BiLSTM |
CN116050673B (en) * | 2023-03-31 | 2023-08-01 | 深圳市城市交通规划设计研究中心股份有限公司 | Urban public transport passenger flow short-time prediction method based on CNN-BiLSTM |
CN116842444A (en) * | 2023-07-03 | 2023-10-03 | 海南大学 | EEMD-CEEMDAN combined LSTM-based mixed time series data prediction method |
CN117313043A (en) * | 2023-10-25 | 2023-12-29 | 四川大学 | Wind power generation power prediction method |
CN117313043B (en) * | 2023-10-25 | 2024-04-30 | 四川大学 | Wind power generation power prediction method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109214575B (en) | Ultrashort-term wind power prediction method based on small-wavelength short-term memory network | |
CN114912666A (en) | Short-time passenger flow volume prediction method based on CEEMDAN algorithm and attention mechanism | |
CN109490814B (en) | Metering automation terminal fault diagnosis method based on deep learning and support vector data description | |
CN109583565B (en) | Flood prediction method based on attention model long-time and short-time memory network | |
CN111292525B (en) | Traffic flow prediction method based on neural network | |
CN111563706A (en) | Multivariable logistics freight volume prediction method based on LSTM network | |
CN113094357B (en) | Traffic missing data completion method based on space-time attention mechanism | |
CN113905391B (en) | Integrated learning network traffic prediction method, system, equipment, terminal and medium | |
CN110580543A (en) | Power load prediction method and system based on deep belief network | |
CN111193256A (en) | Power load prediction method based on variational modal decomposition and gated cyclic unit | |
CN112330951B (en) | Method for realizing road network traffic data restoration based on generation of countermeasure network | |
Han et al. | Network traffic prediction using variational mode decomposition and multi-reservoirs echo state network | |
CN109583588B (en) | Short-term wind speed prediction method and system | |
CN116316591A (en) | Short-term photovoltaic power prediction method and system based on hybrid bidirectional gating cycle | |
CN114548592A (en) | Non-stationary time series data prediction method based on CEMD and LSTM | |
CN111553510A (en) | Short-term wind speed prediction method | |
CN114548532A (en) | VMD-based TGCN-GRU ultra-short-term load prediction method and device and electronic equipment | |
CN111141879B (en) | Deep learning air quality monitoring method, device and equipment | |
CN115204035A (en) | Generator set operation parameter prediction method and device based on multi-scale time sequence data fusion model and storage medium | |
CN115423145A (en) | Photovoltaic power prediction method based on Kmeans-VMD-WT-LSTM method | |
CN115034430A (en) | Carbon emission prediction method, device, terminal and storage medium | |
CN116052254A (en) | Visual continuous emotion recognition method based on extended Kalman filtering neural network | |
CN113255366A (en) | Aspect-level text emotion analysis method based on heterogeneous graph neural network | |
CN112508286A (en) | Short-term load prediction method based on Kmeans-BilSTM-DMD model | |
CN116227716A (en) | Multi-factor energy demand prediction method and system based on Stacking |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||