AU2020102350A4 - A Spark-Based Deep Learning Method for Data-Driven Traffic Flow Forecasting - Google Patents
- Publication number
- AU2020102350A4
- Authority
- AU
- Australia
- Prior art keywords
- data
- rdd
- traffic flow
- bilstm
- spark
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/01—Detecting movement of traffic to be counted or controlled
- G08G1/0104—Measuring and analyzing of parameters relative to traffic conditions
- G08G1/0125—Traffic data processing
- G08G1/0129—Traffic data processing for creating historical data or processing based on historical data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
Abstract
Taking advantage of big data technology has become a new concept and practice to improve
the capability of traffic management and control in data-driven intelligent transportation
systems, and especially timely and accurate traffic flow forecasting (TFF) is significant for
mitigating traffic congestion. To solve the problems of calculation and storage in dealing with
traffic big data using the traditional centralized models on a single machine, this invention
presents a Spark-based Weighted Bidirectional Long Short-Term Memory (SW-BiLSTM)
model to improve the robustness, accuracy, and timeliness of TFF in real time. Specifically,
we utilize the resilient distributed dataset (RDD) to preprocess mobile trajectory big data
(e.g., large-scale GPS trajectories of taxicabs) based on the Spark parallel distributed computing platform and then employ the Kalman filter (KF) approach to eliminate abnormal
GPS points and achieve discrete smoothing of traffic flow data. Moreover, a distributed
SW-BiLSTM model on Spark is proposed to enhance the accuracy and efficiency of real-time TFF, combined with the normal distribution to weigh the interaction between adjacent
road segments and the time window for achieving the optimization of BiLSTM. Finally, the
SW-BiLSTM model is implemented on a Spark parallel computing framework to improve
the efficiency and scalability of TFF. The present invention has broad applications in big
data analytics.
Description
1. Technical Field
[0001] The present invention relates to a field of big-data-driven intelligent transportation systems (ITSs).
2. Background
[0002] With the rapid development of urbanization, increased vehicle ownership has resulted in an increasingly severe problem: traffic congestion. Therefore, traffic flow forecasting (TFF) has become one of the essential research fields of intelligent transportation systems (ITSs) and advanced traffic management systems (ATMSs). The underlying issue in this field is how to utilize past (historical) and current traffic flow big data to forecast traffic flow accurately at a future time interval. Accurate and timely TFF is increasingly crucial for alleviating traffic congestion, improving the urban environment, and helping drivers make better travel decisions. However, real-time TFF is uncertain owing to the stochastic and nonlinear characteristics of traffic flow. Meanwhile, the traffic flow on the targeted road segment (TRS) is often affected by the adjacent road segments, which poses a challenge to TFF, especially in complex transportation networks.
[0003] In recent years, data-driven TFF has become a hot research topic in the field of ITSs. The many cutting-edge models proposed by researchers to solve the TFF problem can be roughly divided into two groups: (1) parametric models, such as autoregressive integrated moving average (ARIMA), seasonal ARIMA (SARIMA), and the Kalman filter (KF); and (2) nonparametric models, such as k-nearest neighbor (KNN), support vector regression (SVR), and neural networks (NNs). Although these methods have achieved ideal forecasting performance in processing small-scale sample data through shallow-structure models, they still have limitations in handling large-scale sample data, especially complex and variable nonlinear traffic flow data. Recently, deep learning has been successfully utilized to deal with these problems.
[0004] As an emerging machine learning method, deep learning has achieved remarkable performance in image processing, voice recognition, medical diagnosis, and intelligent transportation. In comparison to conventional shallow learning architectures, a deep neural network can model complex nonlinear systems using distributed and hierarchical feature representations. Therefore, deep learning methods have become increasingly prominent in TFF in recent years. The LSTM neural network can capture features over a comparatively long time span and is widely employed in TFF, but in existing methods LSTM can only utilize past information rather than future information. The BiLSTM neural network, however, can process bidirectional data based on two different hidden layers to capture better information using both past and future data. Real-time TFF is a time series forecasting problem: traffic conditions are related not only to the state at the current time interval but also to the historical periods. Currently, when it comes to large-scale traffic flow data, there are still issues in processing big traffic data, such as low accuracy and the inability to use past and future traffic flow information effectively, which significantly reduces forecasting performance in accuracy and timeliness. BiLSTM is capable of taking full advantage of past and future information and capturing the nonlinear patterns of traffic flow, particularly over a more extended period, to enhance the forecasting accuracy of TFF. With these prominent advantages, existing studies demonstrate that BiLSTM can produce better performance than RNN, LSTM, ARIMA, SAE, SVM, CNN-LSTM, and DBN. However, only a few works have addressed real-time TFF with large-scale sample data on the Spark parallel distributed computing platform. This invention may be the first attempt to implement the optimization of BiLSTM based on the Spark framework for TFF. The optimized BiLSTM model implemented on Spark is combined with the normal distribution to weigh the influence of adjacent road segments, and real-time traffic flow on the TRS can be accurately forecasted by the time window.
[0005] In this invention, to improve the accuracy and timeliness of TFF, we develop a distributed Spark-based optimization model of the weighted BiLSTM neural network (SW-BiLSTM). More specifically, the traditional BiLSTM model is optimized by combining it with a normal distribution and a time window, and then the optimized BiLSTM model is implemented on a Spark parallel computing platform (SW-BiLSTM). Finally, SW-BiLSTM is employed to forecast real-time traffic flow with large-scale mobile trajectory data.
3. SW-BiLSTM model
[0006] In this invention, we put forward a distributed SW-BiLSTM model on Spark to improve the performance of traffic flow forecasting (TFF) in terms of accuracy, timeliness, and scalability, thereby addressing the real-time application problem in ITS.
3.1. Overview
[0007] In the SW-BiLSTM model, to improve the accuracy and timeliness of TFF with mobile trajectory big data (e.g., large-scale GPS trajectories of taxicabs), we employ the normal distribution and a time window to optimize the traditional BiLSTM model and then implement the optimized BiLSTM model on Spark for parallel forecasting and distributed computing of real-time traffic flow. As shown in Fig. 1, our method for real-time TFF based on SW-BiLSTM mainly consists of three steps. The transformations and actions of RDD are utilized for data preprocessing, which includes creating RDDs for data reading, converting RDDs for data computing, and starting RDDs for data storage. Furthermore, a distributed SW-BiLSTM model is proposed to address the TFF problem in real time. Specifically, in SW-BiLSTM, the normal distribution is adopted to weigh the influence of adjacent road segments on the traffic flow of the targeted road segment (TRS), and a time window of size eight is used for TFF. Finally, the SW-BiLSTM model is implemented on the Spark parallel distributed computing platform by different RDD partitions.
3.2. Preprocessing
[0008] We preprocess traffic flow data to reduce the impact on the accuracy of TFF caused by traffic abnormity, uncertain traffic conditions, and signal transmission errors. The original raw data are processed based on the Spark framework; namely, data preprocessing is performed by RDD (the core of Spark). An RDD is a fault-tolerant, parallel data structure that enables users to store intermediate results in memory. It also controls the partitioning of the data set to achieve optimal data storage and processing, and handles the data via a rich set of operators. Therefore, the two types of RDD operators (i.e., transformations and actions) are used for data preprocessing. Specifically, data are loaded into memory, the RDD is transformed by transformations, and finally the data are stored through actions. As illustrated in Fig. 2, the data preprocessing based on the Spark parallel distributed computing platform mainly includes three steps: RDD creation, RDD conversion, and RDD starting.
* [0009] Step 1: RDD creation.
[0010] Mobile trajectory data (e.g., GPS trajectory data) are extracted and uploaded to HDFS, using external storage as the way of creating RDDs. The data stored in HDFS are read through the textFile method of the SparkContext object and then loaded into the memory of the cluster. Finally, the RDDs are created.
* [0011] Step 2: RDD conversion.
[0012] Data processing and conversion on RDD are conducted by calling map, filter, flatMap, sortByKey, distinct, reduceByKey, and other operators of Spark, which mainly consists of three tasks.
[0013] Task 1: The vehicle information on the TRS at the current time interval (CTI) is extracted. We first convert the data on each node into <key1, value> pairs using the flatMap operator, and then employ the map operator to set key1 = time and vehicle ID, and value = the number of the TRS. Moreover, the filter operator is utilized to filter out the GPS trajectory data in the RDD that do not belong to the selected TRS. Finally, the sortByKey operator is used to sort RDD <key1, value> by key1, and the distinct operator is employed to remove duplicates of the same vehicle at the CTI in the RDD to obtain the vehicle information on the TRS.
[0014] Task 2: The data information extracted in Task 1 is read. We first convert the data on each node into <key2, value> pairs via the flatMap operator, and then the map operator is adopted to set key2 = time and area number, with value set to one. Finally, the reduceByKey operator is utilized to perform the reduce operation according to the value of key2. The number of vehicles at the CTI is counted, and the total number of vehicles on the selected TRS per time interval is obtained (i.e., the traffic flow).
[0015] Task 3: The total number of vehicles on the TRS at the CTI t is incorporated into a one-dimensional array Xt, composing a matrix X. We first convert the data distributed on each node into <key3, value> pairs by the flatMap operator, and then the map operator is employed to set key3 = time interval and value = the total number of vehicles on each TRS. Finally, key3 is sorted through the sortByKey operator, and the results are output.
* [0016] Step 3: RDD starting.
[0017] The data flow is transferred among different RDD partitions based on the transformations operators; then the actions operators are started, and the data are stored in HDFS by calling saveAsTextFile to achieve distributed storage of the data.
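As a local illustration of the three RDD conversion tasks, the sketch below mimics the flatMap/filter/distinct/reduceByKey/sortByKey pipeline with plain Python collections; the record layout and the names (vehicle IDs, segment IDs) are hypothetical, and a real deployment would run the equivalent operators on Spark RDDs.

```python
from collections import Counter

# Hypothetical GPS records as (time_interval, vehicle_id, segment_id)
# tuples; the IDs are made up for illustration.
records = [
    (0, "taxi_1", "seg_A"), (0, "taxi_1", "seg_A"),  # duplicate fix
    (0, "taxi_2", "seg_A"), (0, "taxi_3", "seg_B"),
    (1, "taxi_1", "seg_A"), (1, "taxi_2", "seg_B"),
]
targeted_segments = {"seg_A"}

# Task 1: keep only fixes on the targeted road segment (filter), then
# drop duplicates of the same vehicle in the same interval (distinct).
on_trs = [r for r in records if r[2] in targeted_segments]
distinct_vehicles = set(on_trs)

# Task 2: map each record to <(time, segment), 1> and reduce by key,
# yielding the traffic flow per interval.
flow = Counter((t, seg) for (t, _vid, seg) in distinct_vehicles)

# Task 3: sort by time interval to form the flow series, the raw
# material for the time-window matrix used later.
series = [flow[(t, "seg_A")] for t in sorted({t for (t, _s) in flow})]
```

On this toy input, the duplicate fix for taxi_1 is removed and the flow series counts two vehicles in interval 0 and one in interval 1.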
3.3. Model
[0018] LSTM can address the long-term dependency learning problem and thus has great potential to forecast traffic flow. The LSTM unit maintains a separate memory cell, which is the crucial element of the LSTM neural network structure. However, LSTM can only utilize past information rather than future information. BiLSTM, in contrast, can consider information from both the past and the future, and consists of two stacked unidirectional LSTMs. In this invention, we employ the past traffic flow with the forward state and the future traffic flow with the reverse state to forecast real-time traffic flow. Specifically, we first input the traffic flow into the forward LSTM network layer to get one output vector and then input the traffic flow in the opposite direction into the backward LSTM network layer to get another output vector. Finally, the final output is produced by combining the two output vectors.
[0019] In this invention, it is necessary to measure the weight of the influence of the adjacent road segments on the TRS, because the traffic flow on the TRS is affected by the traffic flow on the adjacent road segments and does not exist independently. The parameters σ and μ of the normal distribution determine the shape and position of its density curve. Thus, when the parameter σ is fixed, the closer a sample point is to the parameter μ, the higher its weighted value is. That is why the normal distribution is used to calculate the weights of the targeted road segment and the adjacent road segments. Moreover, to enhance the forecasting accuracy of TFF, this invention takes the variation of traffic flow between the adjacent road segment and the TRS into consideration, as shown in Formula (1).
f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) (d_i - d_m)   (1)

where μ of the normal distribution represents the traffic flow on the TRS, σ can be set as 0.6, x denotes the discrete value of the traffic flow on each road segment, d_i is the traffic flow on each adjacent road segment, and d_m represents the traffic flow on the TRS.
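This weighting can be sketched in a few lines. The reading below, in which the Gaussian density (μ = flow on the TRS, σ = 0.6 as stated above) scales the flow difference d_i − d_m between an adjacent segment and the TRS, is an assumption about how Formula (1) combines its two parts; the function name is illustrative.

```python
import math

def segment_weight(x, mu, d_i, d_m, sigma=0.6):
    # Gaussian density centered on the TRS flow mu: samples closer to
    # mu receive a larger weight, as described in the text.
    density = math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (
        math.sqrt(2 * math.pi) * sigma)
    # Scale by the flow variation between the adjacent segment (d_i)
    # and the TRS (d_m) -- an assumed composition of Formula (1).
    return density * (d_i - d_m)
```

Note that a sample at x = μ gets the maximal weight, and the weight vanishes when the adjacent segment carries the same flow as the TRS.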
[0020] With the traffic flow V_{i,t} on the TRS i at the CTI t, we extract the past (historical) traffic flows V_{i,t-7}, V_{i,t-6}, ..., V_{i,t-1}. Then, a time window with a size of 8 is employed for TFF, as shown in Formula (2).

X = \begin{bmatrix} V_1 & V_2 & V_3 & V_4 & V_5 & V_6 & V_7 & V_8 \\ V_2 & V_3 & V_4 & V_5 & V_6 & V_7 & V_8 & V_9 \\ \vdots & & & & & & & \vdots \\ V_{n-8} & V_{n-7} & V_{n-6} & V_{n-5} & V_{n-4} & V_{n-3} & V_{n-2} & V_{n-1} \end{bmatrix}   (2)
[0021] In this invention, according to the time series composed of the time intervals t-7, t-6, t-5, t-4, t-3, t-2, t-1, and t, the weighted traffic flow is input into the BiLSTM model to produce the forecasting result at the next time interval t+1.
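The time-window construction of Formula (2) amounts to sliding a window of size 8 over the flow series, with the value at the next interval as the forecasting target. A minimal sketch (the helper name and the toy series are illustrative):

```python
def time_window(series, size=8):
    # Each row holds the flows at intervals t-7..t; the target is the
    # flow at t+1, mirroring the rows of the matrix in Formula (2).
    rows = [series[i:i + size] for i in range(len(series) - size)]
    targets = [series[i + size] for i in range(len(series) - size)]
    return rows, targets

# Toy flow series V1..V10.
X, y = time_window(list(range(1, 11)))
```

For a series of length n this yields n − 8 rows, the first being (V1, ..., V8) with target V9, exactly as in the matrix above.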
[0022] Therefore, we train the BiLSTM model through the following formulas:

i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i),   (3)
f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f),   (4)

where W is a weight matrix and b is a bias vector. The cell state in the hidden layer of LSTM can be obtained by the following formulas:

c_t = f_t c_{t-1} + i_t g(W_{xc} x_t + W_{hc} h_{t-1} + b_c),   (5)
o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o),   (6)
h_t = o_t h(c_t),   (7)
\overrightarrow{h_t} = LSTM(h_{t-1}, x_t, c_{t-1}),   (8)
\overleftarrow{h_t} = LSTM(h_{t+1}, x_t, c_{t+1}),   (9)
H_t = [\overrightarrow{h_t}, \overleftarrow{h_t}],   (10)

where i_t represents the input gate at the current time step, f_t denotes the forget gate, c_t represents the cell state, o_t denotes the output gate, and h_t is the hidden layer output. \sigma(\cdot) represents the sigmoid activation function, g(\cdot) and h(\cdot) are the cell input and output activations, and W denotes the connection weight matrices. The hidden state H_t of BiLSTM at time interval t concatenates the forward hidden state \overrightarrow{h_t} and the backward hidden state \overleftarrow{h_t}.
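Formulas (3)-(10) can be traced with a toy NumPy forward pass. The dimensions and the random weights below are placeholders for illustration, not trained parameters of the invention, and tanh is assumed for the activations g and h.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # toy input/hidden size (hypothetical)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Random placeholder weights for each gate of Formulas (3)-(7).
W = {k: rng.normal(scale=0.1, size=(d, d)) for k in
     ("xi", "hi", "ci", "xf", "hf", "cf", "xc", "hc", "xo", "ho", "co")}
b = {k: np.zeros(d) for k in ("i", "f", "c", "o")}

def lstm_step(x, h_prev, c_prev):
    i = sigmoid(W["xi"] @ x + W["hi"] @ h_prev + W["ci"] @ c_prev + b["i"])  # (3)
    f = sigmoid(W["xf"] @ x + W["hf"] @ h_prev + W["cf"] @ c_prev + b["f"])  # (4)
    c = f * c_prev + i * np.tanh(W["xc"] @ x + W["hc"] @ h_prev + b["c"])    # (5)
    o = sigmoid(W["xo"] @ x + W["ho"] @ h_prev + W["co"] @ c + b["o"])       # (6)
    return o * np.tanh(c), c                                                 # (7)

# Formulas (8)-(10): run the sequence forward and backward, then
# concatenate the two hidden states at each step into H_t.
seq = [rng.normal(size=d) for _ in range(3)]
h = c = np.zeros(d)
forward = []
for x in seq:
    h, c = lstm_step(x, h, c)
    forward.append(h)
h = c = np.zeros(d)
backward = []
for x in reversed(seq):
    h, c = lstm_step(x, h, c)
    backward.append(h)
backward.reverse()
H = [np.concatenate([hf, hb]) for hf, hb in zip(forward, backward)]
```

A real implementation would share one weight set per direction but keep the two directions independent, exactly as the two loops above do.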
3.4. Implementation
[0023] For improving the timeliness and scalability of real-time TFF, we employ a Spark distributed parallel computing framework to implement the SW-BiLSTM model by reducing the computational cost and memory consumption. As illustrated in Fig. 3, SW-BiLSTM is divided into many RDD partitions based on the Spark framework, and different RDD partitions are executed via the following three steps in parallel.
[0024] Step 1: Initializing the RDD data set. The RDDs read the data sets of the TRS and the adjacent road segments in different RDD partitions to generate different <key, value> pairs. Next, the normal distribution of Formula (1) is applied to calculate the weight of the interaction between the adjacent road segments and the TRS in parallel, and the weighted data sets are obtained.
[0025] Step 2: Aggregating the intermediate results. After the data pass through Step 1, aggregation is performed at each node. The results of the RDD partitions are sorted, aggregated, and cached, and the intermediate results can be stored directly in memory for reading in the next step. Then, the distributed SW-BiLSTM model is established by the optimization of BiLSTM, which combines the weighted data sets of Formula (1) with the time window of Formula (2) that determines the input data set of the model.
[0026] Step 3: Producing the forecasting results. The determined data set is input into the model of Formulas (3)-(10) for model training by the SparkDl4jMultiLayer instance, using the distributed SW-BiLSTM model on Spark for real-time TFF, and then the forecasting results are output.
4. Innovation
[0027] (1) To improve the accuracy and robustness of real-time TFF, a distributed weighted bidirectional LSTM neural network model (SW-BiLSTM) on Spark is proposed, taking advantage of a normal distribution to weight the influence of adjacent road segments. Different from the traditional weighted method, our approach considers the degree of influence on the TRS. Moreover, a time window method is incorporated into TFF.
[0028] (2) To enhance the timeliness and scalability of real-time TFF, the SW-BiLSTM model is implemented on the Spark parallel computing framework to conduct parallel forecasting and distributed training of real-time traffic flow. In the data preprocessing, we employ the RDD on Spark to process traffic flow data in data cleaning. In addition, we use the KF method to eliminate abnormal GPS points, thereby achieving discrete smoothing for large-scale traffic flow data.
[0029] (3) The real-time traffic flow of Sanlihe East Road in Beijing is forecasted successfully using our SW-BiLSTM model on Spark with real-world GPS trajectories of taxicabs. In particular, the empirical results from extensive experiments demonstrate that, compared with several state-of-the-art models, the MAPE value of SW-BiLSTM is lower than that of ARIMA, LR, GNB, CNN, GRU, LSTM, and WND-LSTM.
5. Experimental evaluations
[0030] In this invention, we compare our SW-BiLSTM model with several state-of-the-art models to validate the performance of traffic flow forecasting, and then report the results and give the analyses in detail.
5.1. Experimental setup
[0031] This experiment adopts a fully distributed model to build a distributed parallel computing platform based on the Spark framework, including a cluster of 1 Master node and 3 Slave nodes; the hardware is a Lenovo host with an Intel i7-3550 CPU and 8.0 GB of ECC DDR3 memory. All experiments are conducted on Ubuntu 18.04 OS with Hadoop 3.1.1, Spark 2.4.3, IDEA 2018.2.2, and PyCharm 2019.2.5, using Java and Python.
[0032] Moreover, we select seven cutting-edge models as baselines in this experiment, i.e., ARIMA, LR, GNB, CNN, GRU, LSTM, and WND-LSTM, in addition to BiLSTM.
5.2. Experimental data
[0033] A real-world GPS trajectory data set, produced by 12,000 taxis in Beijing between Nov. 05 and Nov. 17, 2012, is employed in this case study. Furthermore, the trajectory data are divided into four groups (i.e., one day: Nov. 05; five days: Nov. 05 to Nov. 09; nine days: Nov. 05 to Nov. 13; thirteen days: Nov. 05 to Nov. 17) for performance evaluation. In the extensive experiments, 65% of the one-day data set is chosen as the training set, and the remaining 35% is the test set. In the other groups of data sets, the data of any one day are used as the test set, and the rest of the data are utilized as the training set.
[0034] In addition, the Sanlihe East Road of Beijing in China is selected as the targeted road segment (TRS), including three subsegments, i.e., Fuchengmen Outer Street - Yuetan North Street, Yuetan North Street - Yuetan South Street, Yuetan South Street - Fuxingmen Outer Street.
[0035] Each road segment exhibits randomness in its traffic flow changes, and noisy traffic flow data in particular affect the accuracy of TFF. To obtain a smooth curve for training, we adopt the KF to process the noisy data in the experiment.
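A minimal one-dimensional Kalman filter of the kind used here for smoothing can be sketched as follows. The constant-level process model and the noise variances q and r are illustrative assumptions, since the invention does not specify its KF parameterization.

```python
def kalman_smooth(series, q=1e-3, r=1.0):
    # 1-D Kalman filter with a constant-level model: the state is the
    # underlying flow level, observed with measurement noise r.
    x, p = series[0], 1.0  # initial state estimate and variance
    out = [x]
    for z in series[1:]:
        p = p + q              # predict: process noise inflates variance
        k = p / (p + r)        # Kalman gain
        x = x + k * (z - x)    # update toward the new measurement z
        p = (1 - k) * p
        out.append(x)
    return out
```

On an oscillating series the filter output stays within the range of the measurements while damping the oscillation, which is the discrete smoothing effect described above.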
5.3. Evaluation metrics
[0036] To validate the accuracy and robustness of SW-BiLSTM, MAPE, RMSE, MAE, and ME are taken as the evaluation metrics, i.e., the measures of effectiveness (MOEs), which are defined as:

MAPE = \frac{1}{n}\sum_{t=1}^{n}\frac{|X_t - \hat{X}_t|}{X_t} \times 100\%,   (11)
RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n}(X_t - \hat{X}_t)^2},   (12)
MAE = \frac{1}{n}\sum_{t=1}^{n}|X_t - \hat{X}_t|,   (13)
ME = \max_{t=1,\ldots,n}|X_t - \hat{X}_t|,   (14)

where X_t denotes the real value of the traffic flow on the TRS at the CTI t, \hat{X}_t represents the forecast value of the traffic flow on the same road segment at the same time interval, and n is the total number of traffic flow observations during the provided time intervals.
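The four MOEs of Formulas (11)-(14) translate directly into NumPy (the helper name is illustrative):

```python
import numpy as np

def moes(real, pred):
    # Formulas (11)-(14): MAPE, RMSE, MAE, and ME over the real and
    # forecast traffic flow series.
    real, pred = np.asarray(real, float), np.asarray(pred, float)
    err = real - pred
    mape = float(np.mean(np.abs(err) / real) * 100)  # (11)
    rmse = float(np.sqrt(np.mean(err ** 2)))         # (12)
    mae = float(np.mean(np.abs(err)))                # (13)
    me = float(np.max(np.abs(err)))                  # (14)
    return mape, rmse, mae, me
```

For example, real flows (100, 200) against forecasts (90, 220) give MAPE = 10%, MAE = 15, and ME = 20.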
5.4. Parameter tuning
[0037] Here, we describe how the parameters of SW-BiLSTM are determined automatically in this empirical study.
[0038] In this case study, experimental data are divided into several groups by batch-size, and parameters are updated by groups. A set of data in batch-size determines the direction of gradient descent and reduces the randomness and calculation amount during gradient descent. When batch-size increases, a local optimum may occur. When batch-size decreases, the introduced randomness is more significant, and it is not easy to achieve convergence. Meanwhile, as the batch-size increases, the number of required epochs ascends as well; as batch-size decreases, the number of required epochs also descends. Therefore, we need to adjust batch-size and epochs to improve the evaluation metrics for forecasting, but there are no rules about adjusting the parameters in the existing approaches. Therefore, to obtain optimal parameter combinations for our model, we select the Grid Search (GS) method to train the model repeatedly.
[0039] In this invention, batch-size and epochs are optimized using the GS method. That is, the tuning ranges are batch-size = range(0, 32, 2) and epochs = range(0, 500, 50), respectively.
[0040] The parameters batch-size and epochs are optimized in this invention. The parameter grid is initialized through the GridSearchCV function, and the GS model is returned by grid.fit(). The best_score_ attribute provides the best score observed during the optimization process, and best_params_ describes the parameter combination that achieved the best result. From the output (Best: 0.106227 using 'batch-size': 24, 'nb-epoch': 400), we can see that the best result is reached with the optimal parameter combination batch-size = 24 and epochs = 400. The experimental results are illustrated in Table 1.
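The exhaustive search over the two ranges can be sketched as a generic grid-search loop. Here `evaluate` is a hypothetical stand-in for training the model and returning its score; it is a made-up function that peaks at batch_size = 24 and epochs = 400, purely to mirror the reported optimum, not the invention's actual training routine.

```python
from itertools import product

def evaluate(batch_size, epochs):
    # Hypothetical score surface with a unique maximum at (24, 400).
    return -abs(batch_size - 24) - abs(epochs - 400) / 50.0

# The same ranges as the GS tuning above (step 50 for epochs, as in
# the columns of Table 1).
grid = {"batch_size": range(2, 32, 2), "epochs": range(0, 500, 50)}

# Exhaustively score every combination and keep the best one.
best = max(product(grid["batch_size"], grid["epochs"]),
           key=lambda p: evaluate(*p))
```

With a real scoring function this is exactly what GridSearchCV does internally, minus cross-validation and parallelism.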
Table 1: The parameter combinations of batch-size (rows) and epochs (columns), scores in %.

batch-size     0        50       100      150      200      250      300      350      400      450
2           0.0733   0.3663   0.2198   0.0733   0.0733   0.2198   0.0733   0.0733   0.0733   0.2930
4           0.1465   0.0000   0.0733   0.1465   0.2198   0.1465   0.1465   0.1465   0.1465   0.0733
6           0.5128   0.0733   0.5128   0.5128   0.4396   0.5128   0.2930   0.0733   0.5128   0.5128
8           0.0733   0.0733   0.0733   0.0733   0.0733   0.0733   0.0733   0.2198   0.0733   0.0733
10          0.2930   0.1465   0.2930   0.2930   0.2930   0.2930   0.2930   0.2930   0.2930   0.2930
12          0.0733   0.0733   0.0000   0.1465   0.0733   0.0000   0.0733   0.0000   0.0733   0.0000
14          0.0733   0.0733   0.0733   0.0733   0.0000   0.0733   0.0733   0.0733   0.0733   0.0733
16          0.0733   0.0733   0.0733   0.0733   0.0733   0.0733   0.0733   0.0733   0.0733   0.0733
18          0.0733   6.0073   0.0733   0.0733   0.0733   0.0733   1.1722   0.0733   0.0733   0.0733
20          0.6593  10.5495   1.0989   0.0733   1.0256   6.3004   5.9341   0.0733   1.1722   0.0733
22         10.4762   6.7399   6.1538   1.1722   6.8864   1.1722   6.7399   1.7582   0.0733  10.1832
24         10.3297   9.6703   1.0256  10.2564   9.9634   9.8168   9.9634  10.5495  10.6227   9.5238
26          9.9634  10.6227  10.6227  10.5495  10.0366  10.6227  10.6227  10.6227  10.6227  10.6227
28         10.5495  10.6227  10.6227  10.6227  10.6227  10.6227  10.6227  10.6227  10.6227  10.6227
30         10.6227  10.6227  10.6227  10.6227  10.6227  10.6227  10.6227  10.6227  10.6227  10.6227
5.5. Experimental results
[0041] In this empirical study, we train the model, which is composed of four parts, on a limited data set. The first is an input layer with eight dimensions; the second is two LSTM hidden layers with sixteen dimensions; the third is a dropout layer with sixteen dimensions, in which the dropout rate is 0.2; and the fourth is an output layer with one dimension, where batch-size is 24 and epochs is 400.
[0042] To validate the performance of the SW-BiLSTM model, we compare it with the traditional BiLSTM model. We use KF to smooth the experimental data set after data preprocessing in the SW-BiLSTM and BiLSTM models, and then plot the results in Fig. 4. Moreover, we compare the MOE values (i.e., MAPE, MAE, RMSE, and ME) of SW-BiLSTM and BiLSTM. Next, the real value and the forecasting values produced through BiLSTM and SW-BiLSTM are shown in Figs. 5 and 6, respectively.
[0043] The MOEs of SW-BiLSTM are better than those of the traditional BiLSTM model on the four different data sets. The MAPE value of SW-BiLSTM is 29.73% lower than that of BiLSTM on average. It is obvious that the proposed approach in SW-BiLSTM, which combines the weighted normal distribution and the time window, can produce better forecasting performance and achieve higher accuracy than BiLSTM. As shown in Figs. 5 and 6, SW-BiLSTM fits the real traffic flow better than BiLSTM.
[0044] Furthermore, to further validate the performance of SW-BiLSTM, we compare it with ARIMA, LR, GNB, CNN, GRU, LSTM, and WND-LSTM. The MOE values of SW-BiLSTM on a thirteen-day data set are compared with those of the other models mentioned above.
[0045] Based on the results, we can conclude that the MAPE of SW-BiLSTM is much lower than that of the other models in most cases. More specifically, the MAPE of SW-BiLSTM is 65.62%, 69.10%, 87.30%, 3.52%, 17.78%, 42.86%, and 1.23% lower than that of ARIMA, LR, GNB, CNN, GRU, LSTM, and WND-LSTM, respectively, and the accuracy improvement reaches 41.06% on average. Therefore, our SW-BiLSTM model can provide more accurate predictions than other cutting-edge models. The RMSE, MAE, and ME values of SW-BiLSTM are also lower than those of the other models, because the forecast peak value is close to the real peak value, although there is a time deviation between them. Moreover, the results demonstrate that the MAPE value of SW-BiLSTM is lower than that of WND-LSTM in most cases, which means that SW-BiLSTM obtains better forecasting performance owing to the use of the normal distribution and the time window on top of BiLSTM. The MAPE value of SW-BiLSTM is 1.23% lower than that of WND-LSTM, which indicates that BiLSTM specializes in mining traffic flow information in both the forward and reverse directions. Based on the aforementioned analysis, SW-BiLSTM can provide more accurate forecasting than ARIMA, LR, GNB, CNN, GRU, LSTM, and WND-LSTM.
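The MOEs compared above can be computed as follows (a minimal sketch of the standard definitions; ME is taken here as the maximum absolute error, an assumption since the source does not define it):

```python
import math

def mape(real, pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((r - p) / r) for r, p in zip(real, pred)) / len(real)

def mae(real, pred):
    """Mean absolute error."""
    return sum(abs(r - p) for r, p in zip(real, pred)) / len(real)

def rmse(real, pred):
    """Root mean squared error."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(real, pred)) / len(real))

def me(real, pred):
    """Maximum absolute error (assumed meaning of ME)."""
    return max(abs(r - p) for r, p in zip(real, pred))

# Toy example: three real flow values versus three forecasts
real = [10.0, 20.0, 30.0]
pred = [12.0, 18.0, 33.0]
print(mape(real, pred), mae(real, pred), rmse(real, pred), me(real, pred))
```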
[0046] From the experimental results mentioned above, it can be seen that SW-BiLSTM improves forecasting accuracy significantly and is superior to comparable models for addressing the TFF problem in real time.
6. Brief Description of the Drawings
[0047] Figure 1 is an overview of SW-BiLSTM.
[0048] Figure 2 is the process of data preprocessing on Spark.
[0049] Figure 3 is the implementation of SW-BiLSTM based on the Spark framework.
[0050] Figure 4 is the experimental data processed by KF for smoothing. (a) Before smoothing and (b) after smoothing with the data set on thirteen days.
[0051] Figure 5 is the forecasting results on the same data set with different models. (a) BiLSTM and (b) SW-BiLSTM.
[0052] Figure 6 is the forecasting results of BiLSTM and SW-BiLSTM with different data sets. (a) one day, (b) five days, (c) nine days, and (d) thirteen days.
[0053] Figure 7 is the MOEs of ARIMA, CNN, GNB, GRU, LR, LSTM, WND-LSTM, and SW-BiLSTM with different data sets. (a) one day, (b) five days, (c) nine days, and (d) thirteen days.
[0054] Figure 8 is the forecasting results on the same data set with different models. (a) ARIMA, (b) LR, (c) GNB, (d) CNN, (e) GRU, (f) LSTM, (g) WND-LSTM, and (h) SW-BiLSTM.
[0055] Figure 9 is the forecasting results produced by ARIMA, LR, GNB, CNN, GRU, LSTM, WND-LSTM, and SW-BiLSTM with different data sets. (a) one day, (b) five days, (c) nine days, and (d) thirteen days.
Claims (4)
1. SW-BiLSTM model
In this invention, we put forward a distributed SW-BiLSTM model on Spark to improve the performance of traffic flow forecasting (TFF) in terms of accuracy, timeliness, and scalability, thereby addressing the real-time application problem in intelligent transportation systems (ITS).
1.1. Overview
In the SW-BiLSTM model, to improve the accuracy and timeliness of TFF with mobile trajectory big data (e.g., large-scale GPS trajectories of taxicabs), we employ the normal distribution and a time window to optimize the traditional BiLSTM model, and then implement the optimized BiLSTM model on Spark for parallel forecasting and distributed computing of real-time traffic flow. As shown in Fig. 1, our method for real-time TFF based on SW-BiLSTM mainly consists of three steps. First, the transformations and actions of RDDs are utilized for data preprocessing, which includes creating RDDs for data reading, converting RDDs for data computing, and starting RDDs for data storage. Second, a distributed SW-BiLSTM model is proposed to address the TFF problem in real time. Specifically, in SW-BiLSTM, the normal distribution is adopted to weigh the influence of the adjacent road segments on the traffic flow of the targeted road segment (TRS), and a time window of size eight is used for TFF. Finally, the SW-BiLSTM model is implemented on the Spark parallel distributed computing platform using different RDD partitions.
1.2. Preprocessing
We preprocess traffic flow data to reduce the impact of traffic abnormity, uncertain traffic conditions, and signal transmission errors on the accuracy of TFF. The original raw data are processed on the Spark framework; that is, data preprocessing is performed by RDD (the core abstraction of Spark). An RDD is a fault-tolerant, parallel data structure that enables users to store intermediate results in memory. It also controls the partitioning of the data set to achieve optimal data storage and processing, and handles the data via a rich set of operators. Therefore, the two types of RDD operators (i.e., transformations and actions) are used for data preprocessing. Specifically, data are loaded into memory, the RDDs are transformed by transformations, and finally the data are stored through actions. As illustrated in Fig. 2, the data preprocessing based on the Spark parallel distributed computing platform mainly includes three steps: RDD creation, RDD conversion, and RDD starting.
* Step 1: RDD creation.
Mobile trajectory data (e.g., GPS trajectory data) are extracted and uploaded to HDFS, which serves as the external storage from which RDDs are created. The data stored in HDFS are read through the textFile method of the SparkContext object and loaded into the memory of the cluster. In this way, many RDDs are created.
* Step 2: RDD conversion.
Data processing and conversion on RDDs are conducted by calling map, filter, flatMap, sortByKey, distinct, reduceByKey, and other operators of Spark, which mainly consists of three tasks.
Task 1: The vehicle information on the TRS at the current time interval (CTI) is extracted. We first convert the data on each node into < key1, value > pairs using the flatMap operator, and then employ the map operator to set key1 = time and vehicle ID and value = the number of the TRS. Moreover, the filter operator is utilized to filter out the GPS trajectory data in the RDD that do not belong to the selected TRS. Finally, the sortByKey operator is used to sort the RDD < key1, value > pairs by key1, and the distinct operator is employed to remove duplicates of the same vehicle at the CTI in the RDD to obtain the vehicle information on the TRS.
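Without a Spark cluster at hand, the operator chain of Task 1 can be mimicked with plain Python built-ins (a sketch: the record layout and segment IDs are illustrative assumptions, and sorted/set stand in for sortByKey/distinct):

```python
# Each record: (time_interval, vehicle_id, road_segment)
records = [
    (0, "car1", "TRS"), (0, "car1", "TRS"),   # duplicate report of car1
    (0, "car2", "A1"),                        # not on the targeted segment
    (0, "car3", "TRS"), (1, "car1", "TRS"),
]

TARGET = "TRS"

# map: build <key1, value> with key1 = (time, vehicle ID), value = segment
pairs = [((t, vid), seg) for t, vid, seg in records]

# filter: drop records that do not belong to the selected TRS
on_trs = [(k, v) for k, v in pairs if v == TARGET]

# sortByKey + distinct: order by key1 and drop duplicate vehicles per CTI
vehicles = sorted(set(on_trs))
print(vehicles)
# → [((0, 'car1'), 'TRS'), ((0, 'car3'), 'TRS'), ((1, 'car1'), 'TRS')]
```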
Task 2: The vehicle information extracted in Task 1 is read. We first convert the data on each node into < key2, value > pairs via the flatMap operator, and then the map operator is adopted to set key2 = time and area number and value = 1. Finally, the reduceByKey operator is utilized to perform the reduce operation according to the value of key2. The number of vehicles at the CTI is thus counted, and the total number of vehicles on the selected TRS per time interval (i.e., the traffic flow) is obtained.
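Task 2 is essentially a word-count pattern; mimicked without Spark, reduceByKey amounts to summing the 1-values that share a key (a sketch with illustrative keys):

```python
from collections import defaultdict

# Output of Task 1: one entry per distinct vehicle per time interval
vehicles = [((0, "car1"), "TRS"), ((0, "car3"), "TRS"), ((1, "car1"), "TRS")]

# map: key2 = (time interval, area number), value = 1
pairs = [((t, seg), 1) for (t, _vid), seg in vehicles]

# reduceByKey: sum the values sharing the same key2
flow = defaultdict(int)
for key2, one in pairs:
    flow[key2] += one

print(dict(flow))
# → {(0, 'TRS'): 2, (1, 'TRS'): 1}
```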
Task 3: The total number of vehicles on the TRS at the CTI t is incorporated into a one-dimensional array Xt, and these arrays compose a matrix X. We first convert the data distributed on each node into < key3, value > pairs via the flatMap operator, and then the map operator is employed to set key3 = time interval and value = the total number of vehicles on each TRS. Finally, the pairs are sorted by key3 through the sortByKey operator, and the results are output.
* Step 3: RDD starting.
The data flow is transferred between different RDD partitions by the transformation operators; then the action operators are started, and the data are stored in HDFS by calling saveAsTextFile to achieve distributed storage of the data.
1.3. Model
LSTM can address the long-term dependency learning problem, and thus has great potential for forecasting traffic flow. The LSTM unit maintains a separate memory cell, which is the crucial element of the LSTM neural network structure. However, LSTM can only utilize past information rather than future information. BiLSTM, in contrast, can exploit information from both the past and the future, and consists of two stacked unidirectional LSTMs. In this invention, we employ the past traffic flow with the forward state and the future traffic flow with the reverse state to forecast real-time traffic flow. Specifically, we first input the traffic flow into the forward LSTM network layer to obtain one output vector, and then input the traffic flow in the opposite direction into the backward LSTM network layer to obtain another output vector. The final output is produced by combining the two output vectors. In this invention, it is necessary to measure the weight of the influence of the adjacent road segments on the TRS, because the traffic flow on the TRS is affected by the traffic flow on the adjacent road segments and does not exist independently. The parameters σ and u of the normal distribution determine the spread and location of its curve. Thus, when the parameter σ is fixed, the closer a sample point is to the parameter u, the higher its weighted value. That is why the normal distribution is used to calculate the weights of the targeted road segment and the adjacent road segments. Moreover, to enhance the forecasting accuracy of TFF, this invention takes the variation of traffic flow between each adjacent road segment and the TRS into consideration, as shown in Formula (1).
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-u)^2}{2\sigma^2}\right)\left(d_i - d_m\right), \qquad (1)$$

where u of the normal distribution represents the traffic flow on the TRS, σ can be set to 0.6, x denotes the discrete value of the traffic flow on each road segment, $d_i$ is the traffic flow on each road segment, and $d_m$ represents the traffic flow on the TRS. With the traffic flow $V_t$ on the TRS at the CTI t, we extract the past (historical) traffic flows $V_{t-7}, V_{t-6}, V_{t-5}, V_{t-4}, V_{t-3}, V_{t-2}, V_{t-1}$. Then, a time window with a size of 8 is employed for TFF, as shown in Formula (2).

$$X = \begin{bmatrix} V_1 & V_2 & V_3 & V_4 & V_5 & V_6 & V_7 & V_8 \\ V_2 & V_3 & V_4 & V_5 & V_6 & V_7 & V_8 & V_9 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ V_{n-8} & V_{n-7} & V_{n-6} & V_{n-5} & V_{n-4} & V_{n-3} & V_{n-2} & V_{n-1} \end{bmatrix} \qquad (2)$$
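Formula (1) and Formula (2) can be sketched in Python as follows (an illustrative implementation; the function and variable names are not from the source):

```python
import math

def segment_weight(x: float, u: float, d_i: float, d_m: float, sigma: float = 0.6) -> float:
    """Formula (1): normal-distribution weight of a road segment,
    scaled by the flow variation (d_i - d_m) between it and the TRS."""
    gauss = math.exp(-((x - u) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
    return gauss * (d_i - d_m)

def time_windows(flows, size: int = 8):
    """Formula (2): slide a window of the given size over the flow series,
    producing one input row per forecasting step (n - size rows in total)."""
    return [flows[i:i + size] for i in range(len(flows) - size)]

flows = list(range(1, 11))   # toy series V_1 .. V_10
X = time_windows(flows)
print(len(X), X[0])          # → 2 [1, 2, 3, 4, 5, 6, 7, 8]
```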
In this invention, according to the time series composed of the time intervals t-7, t-6, t-5, t-4, t-3, t-2, t-1, t, the weighted traffic flow is input into the BiLSTM model to produce the forecasting result at the next time interval t+1. Therefore, we train the BiLSTM model through the following formulas:
$$i_t = \sigma(W_{xi}x_t + W_{hi}h_{t-1} + W_{ci}c_{t-1} + b_i), \qquad (3)$$

$$f_t = \sigma(W_{xf}x_t + W_{hf}h_{t-1} + W_{cf}c_{t-1} + b_f), \qquad (4)$$

where W is a weight matrix and b is a bias vector. The cell state in the hidden layer of LSTM can be obtained by the following formulas:

$$c_t = f_t c_{t-1} + i_t\, g(W_{xc}x_t + W_{hc}h_{t-1} + b_c), \qquad (5)$$

$$o_t = \sigma(W_{xo}x_t + W_{ho}h_{t-1} + W_{co}c_t + b_o), \qquad (6)$$

$$h_t = o_t\, h(c_t), \qquad (7)$$

$$\overrightarrow{h_t} = \mathrm{LSTM}(h_{t-1}, x_t, c_{t-1}), \qquad (8)$$

$$\overleftarrow{h_t} = \mathrm{LSTM}(h_{t+1}, x_t, c_{t+1}), \qquad (9)$$

$$H_t = [\overrightarrow{h_t}, \overleftarrow{h_t}], \qquad (10)$$

where $i_t$ represents the input gate at the current time step, $f_t$ denotes the forget gate, $c_t$ represents the cell state, $o_t$ denotes the output gate, and $h_t$ is the hidden layer output. σ(·) represents the sigmoid activation function, and W denotes the connection weight matrices. The hidden state $H_t$ of BiLSTM at time interval t combines the forward hidden state $\overrightarrow{h_t}$ and the backward hidden state $\overleftarrow{h_t}$.
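A toy NumPy sketch of one step of Formulas (3)-(7) and the bidirectional combination of Formula (10) (an illustrative implementation with random weights, not the trained model; g and h are taken as tanh, a common choice the source does not state, and the peephole terms are omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 8, 16                          # input and hidden sizes from the text

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix (d_h x (d_in + d_h)) and bias per gate; peepholes omitted
W = {g: rng.normal(0, 0.1, (d_h, d_in + d_h)) for g in "ifco"}
b = {g: np.zeros(d_h) for g in "ifco"}

def lstm_step(x, h_prev, c_prev):
    z = np.concatenate([x, h_prev])
    i = sigmoid(W["i"] @ z + b["i"])                    # input gate, Formula (3)
    f = sigmoid(W["f"] @ z + b["f"])                    # forget gate, Formula (4)
    c = f * c_prev + i * np.tanh(W["c"] @ z + b["c"])   # cell state, Formula (5)
    o = sigmoid(W["o"] @ z + b["o"])                    # output gate, Formula (6)
    h = o * np.tanh(c)                                  # hidden output, Formula (7)
    return h, c

x_t = rng.normal(size=d_in)
h_fwd, _ = lstm_step(x_t, np.zeros(d_h), np.zeros(d_h))  # forward pass, Formula (8)
h_bwd, _ = lstm_step(x_t, np.zeros(d_h), np.zeros(d_h))  # backward pass, Formula (9)
H_t = np.concatenate([h_fwd, h_bwd])                     # combined state, Formula (10)
print(H_t.shape)                                         # → (32,)
```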
1.4. Implementation
To improve the timeliness and scalability of real-time TFF, we employ the Spark distributed parallel computing framework to implement the SW-BiLSTM model, reducing the computational cost and memory consumption. As illustrated in Fig. 3, SW-BiLSTM is divided into many RDD partitions based on the Spark framework, and the different RDD partitions are executed in parallel via the following three steps.

Step 1: Initializing the RDD data set. The RDDs read the data sets of the TRS and the adjacent road segments in different RDD partitions to generate different < key, value > pairs. Next, the normal distribution of Formula (1) is applied to calculate the weight of the interaction between the adjacent road segments and the TRS in parallel, and the weighted data sets are obtained.

Step 2: Aggregating the intermediate results. After the data pass through Step 1, aggregation is performed at each node. The results of the RDD partitions are sorted, aggregated, and cached, and the intermediate results can be stored directly in memory for reading in the next step. Then, the distributed SW-BiLSTM model is established by the optimization of BiLSTM, which is combined with the weighted data sets via Formula (1) and the time window that determines the input data set of the model through Formula (2).
Step 3: Producing the forecasting results. The determined data set is input into the model of Formulas (3)-(10) for model training by the SparkDl4jMultiLayer instance, using the distributed SW-BiLSTM model on Spark for real-time TFF, and the forecasting results are output.
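The three steps above can be mimicked on a single machine as follows (a structural sketch only: the partitions are processed sequentially here where Spark would run them on different executors, the weighting function is a placeholder for Formula (1), and the final forecast is a toy stand-in for the trained BiLSTM, not DL4J code):

```python
# Partition-parallel structure of Steps 1-3, mimicked sequentially
partitions = [[(0, 3.0), (1, 5.0)], [(2, 4.0), (3, 6.0)]]   # (interval, flow) pairs

def weight_partition(part):
    """Step 1 stand-in: weight each record (placeholder for Formula (1))."""
    return [(t, 0.9 * v) for t, v in part]

# Step 1: weight every partition independently
weighted = [weight_partition(p) for p in partitions]

# Step 2: aggregate and sort the intermediate results from all partitions
merged = sorted(rec for part in weighted for rec in part)

# Step 3: feed the aggregated series into the model (placeholder forecast)
series = [v for _t, v in merged]
forecast = sum(series[-3:]) / 3          # toy stand-in for the trained model
print(merged, round(forecast, 2))
```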
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2020102350A AU2020102350A4 (en) | 2020-09-21 | 2020-09-21 | A Spark-Based Deep Learning Method for Data-Driven Traffic Flow Forecasting |
Publications (1)
Publication Number | Publication Date |
---|---|
AU2020102350A4 true AU2020102350A4 (en) | 2020-10-29 |
Family
ID=72926587
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FGI | Letters patent sealed or granted (innovation patent) | ||
MK22 | Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry |