CN109920248A - A bus arrival time prediction method based on a GRU neural network - Google Patents


Info

Publication number
CN109920248A
Authority
CN
China
Prior art keywords
public transport
data
station
neural network
arrival time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910162263.5A
Other languages
Chinese (zh)
Other versions
CN109920248B (en)
Inventor
孙玲
陆俊天
施佺
曹阳
沈琴琴
朱森来
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nantong University
Original Assignee
Nantong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nantong University
Priority to CN201910162263.5A
Publication of CN109920248A
Application granted
Publication of CN109920248B
Legal status: Active
Anticipated expiration


Abstract

The invention discloses a bus arrival time prediction method based on a GRU neural network. The method comprises: exporting historical data from a database to a CSV file to obtain raw data, and analyzing and processing the raw data with the HBase distributed database and Spark in-memory processing to remove the heterogeneity, complexity and interdependence of the raw data; analyzing the processed raw data with feature correlation analysis, from both single-attribute and multi-factor perspectives, to obtain standard time-series data; performing variable selection on the standard time-series data with the Lasso method to reject weakly correlated feature vectors; and building a bus arrival prediction model on a GRU neural network, then inputting the standard time-series data, with the weakly correlated feature vectors rejected, into the arrival prediction model to predict the bus arrival time. The invention can effectively improve the accuracy of bus arrival time prediction.

Description

A bus arrival time prediction method based on a GRU neural network
Technical field
The present invention relates to urban bus monitoring and arrival time prediction, and in particular to a bus arrival time prediction method based on a GRU neural network.
Background
Public transport is important infrastructure that bears on the national economy and people's livelihood, and developing information-based, intelligent advanced public transport systems helps improve the management and service level of urban public transport. Vehicle scheduling management is the core of an advanced public transport system, and bus arrival time is the key parameter for dynamic vehicle scheduling. Traditional bus scheduling estimates arrival times from empirically judged travel-time intervals between fixed stations. In general, timetables estimated this way have large errors and a poor fit, and cannot reflect the real situation.
Arrival time prediction provides valuable help in improving bus punctuality, reducing passenger waiting time and helping passengers plan their travel time. Scholars in China and abroad have done a great deal of research on bus arrival time prediction; the main prediction models proposed include time series (TS) models, artificial neural network (ANN) models, support vector machine (SVM) models and Kalman filter models. Yang et al. applied differencing to the non-stationary data in a time series and tested it, established an autoregressive moving average time series model, and predicted arrival time through residual analysis and data fitting; however, white noise strongly affects the series in that model, so the final prediction accuracy is low. Xiong Wenhua et al. used a BP network with floating-car and loop-detector records as the network input and vehicle travel time as the output; that method needs a large amount of data for fitting, and its parameter tuning is complicated.
Summary of the invention
To address the problems in the prior art that bus arrival time predictions have large errors and fit poorly, and therefore cannot reflect actual bus arrivals, the present invention proposes a bus arrival time prediction method based on a GRU neural network. The specific technical solution is as follows:
A bus arrival time prediction method based on a GRU neural network, the method comprising:
S1, exporting historical data from a database to a CSV file to obtain raw data, and analyzing and processing the raw data with the HBase distributed database and Spark in-memory processing to remove the heterogeneity, complexity and interdependence of the raw data;
S2, analyzing the processed raw data with feature correlation analysis, from both single-attribute and multi-factor perspectives, to obtain standard time-series data;
S3, performing variable selection on the standard time-series data with the Lasso method, and rejecting the weakly correlated feature vectors in the standard time-series data;
S4, building a bus arrival prediction model on a GRU neural network, and inputting the standard time-series data, with the weakly correlated feature vectors rejected, into the arrival prediction model to predict the bus arrival time.
Further, step S1 comprises:
S11, reading the CSV file from HDFS with SparkSQL to form data with a Spark DataFrame structure;
S12, extracting the historical GPS track data of a specified bus with SparkSQL, and matching the historical GPS track data against bus station distances with the HBase distributed database.
Further, matching the historical GPS track data against bus station distances with the HBase distributed database comprises:
S121, setting a specific value used to judge whether the matched distance falls within range of the specified bus arrival location; if the matching result is less than the specific value, marking the bus arrival location corresponding to the match;
S122, taking, in chronological order, any two GPS positioning points in the match whose time interval is greater than t seconds, and judging from the slope of the line connecting the two points whether the bus is running in the up or down direction;
S123, selecting the positioning time in the match that is nearest to the station, and recording the arrival time based on the running speed and acceleration of the bus;
S124, sorting the raw data by arrival time and by the vehicle number of the bus, and using Spark to output and store the result in HDFS.
Further, the distance between the bus arrival location and the actually located position is calculated by the great-circle distance formula:
$$d = R \arccos\left(\sin A_w \sin B_w + \cos A_w \cos B_w \cos(A_j - B_j)\right)$$
where $R$ is the radius of the Earth; $A_j$, $A_w$ are the longitude and latitude of the actually located position; and $B_j$, $B_w$ are the longitude and latitude of the bus arrival location.
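The station matching of S121 can be illustrated with a minimal sketch, assuming the spherical law of cosines form of the great-circle distance. The function names, the Earth radius value, the example coordinates and the 50 m threshold are assumptions made for illustration; the patent text does not give them.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed value)


def great_circle_distance(a_lon, a_lat, b_lon, b_lat, radius=EARTH_RADIUS_M):
    """Great-circle distance in metres; inputs are decimal degrees."""
    a_lon, a_lat, b_lon, b_lat = map(math.radians, (a_lon, a_lat, b_lon, b_lat))
    cos_angle = (math.sin(a_lat) * math.sin(b_lat)
                 + math.cos(a_lat) * math.cos(b_lat) * math.cos(a_lon - b_lon))
    # Clamp so that rounding error cannot push acos outside its domain.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return radius * math.acos(cos_angle)


def is_at_station(fix, station, threshold_m=50.0):
    """True when a (lon, lat) GPS fix is closer to the station than threshold_m.

    threshold_m plays the role of the "specific value" in S121; 50 m is a
    hypothetical choice.
    """
    return great_circle_distance(*fix, *station) < threshold_m


station = (120.894, 32.010)    # hypothetical station coordinates (lon, lat)
near_fix = (120.894, 32.0102)  # roughly 22 m north of the station
far_fix = (120.900, 32.010)    # several hundred metres east
print(is_at_station(near_fix, station), is_at_station(far_fix, station))
```

The law of cosines loses precision for points that are only centimetres apart, which is harmless for a threshold of tens of metres; a haversine form could be swapped in if finer resolution were needed.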
Further, the calculating of the slope formula are as follows:
In formula, Dlon、DlatRepresent route uplink terminus longitude, latitude, Slon、SlatRepresent route uplink inception point warp Degree, latitude, Alon、AlatRepresent latter station longitude, the latitude of rear vehicle driving trace, Blon、BlatRepresent previous station longitude, latitude Degree;Wherein, if K > 0, then it represents that with it is in the same direction for uplink, i.e. uplink is on the contrary then be downlink.
Further, step S123 obtains the arrival time from the uniform-acceleration relation
$$s = \frac{v_0 + v_t}{2}\, t$$
where $s$ is the distance from the most recent positioning point to the station, $v_0$ is the running speed of the bus at that positioning point, $v_t$ is the speed on arrival, which defaults to 0, and $t$ is the time taken from the most recent positioning point to the bus station.
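As a worked example of the arrival-time step, the sketch below assumes the bus decelerates uniformly from the speed at its last GPS fix to the arrival speed, so that the remaining time is $t = 2s / (v_0 + v_t)$. This relation is inferred from the symbols the text defines ($s$, $v_0$, $v_t$, $t$); the function names and the sample values are hypothetical.

```python
from datetime import datetime, timedelta


def time_to_station(s, v0, vt=0.0):
    """Seconds needed to cover s metres while speed changes uniformly from v0 to vt (m/s)."""
    if v0 + vt <= 0:
        raise ValueError("the average speed must be positive")
    return 2.0 * s / (v0 + vt)


def arrival_time(fix_time, s, v0, vt=0.0):
    """Timestamp of arrival, given the timestamp of the last fix before the station."""
    return fix_time + timedelta(seconds=time_to_station(s, v0, vt))


# Last fix 20 m before the station at 8 m/s, stopping at the station (vt = 0).
last_fix = datetime(2019, 3, 1, 8, 0, 0)
print(arrival_time(last_fix, s=20.0, v0=8.0))  # 2019-03-01 08:00:05
```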
Further, the Lasso method defined formula are as follows:Its In, xijIt is row vector β for regression coefficient, y indicates training label for i-th group of j variable.
In the bus arrival time prediction method based on a GRU neural network of the invention, the raw data is first processed with Spark to obtain standard time-series data, which extracts the bus arrival data; the Lasso method is then used to reject weakly correlated feature vectors, which performs variable selection; finally, a bus arrival prediction model is built with a GRU neural network to predict the specific bus arrival time. Compared with the prior art, the method includes a process that screens and selects the data before the GRU neural network is applied; by screening and selecting the bus arrival data, the method can effectively improve the accuracy of bus arrival time prediction.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the bus arrival time prediction method based on a GRU neural network according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the GPS data processing flow in the embodiment of the present invention;
Fig. 3 is a schematic flow chart of the correlation processing applied to the collected source data in the embodiment of the present invention;
Fig. 4 is a schematic diagram of the GRU network model according to the embodiment of the present invention;
Fig. 5 is a schematic flow chart of the algorithm of the GRU network model according to the embodiment of the present invention;
Fig. 6 is a schematic comparison of the loss-function curves of the method of the invention and the LSTM method when predicting bus arrival time;
Fig. 7 is a schematic comparison of the bus arrival times predicted by the method of the invention and the actual bus arrival times;
Figs. 8 and 9 are schematic comparisons of training for bus arrival prediction with the LSTM network and the GRU network, respectively, in the embodiment of the present invention.
Detailed description of the embodiments
To enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings.
Referring to Fig. 1, an embodiment of the present invention provides a bus arrival time prediction method based on a GRU neural network, which comprises the following steps:
Step 1: Historical data is exported from a database to a CSV file to obtain raw data, and the raw data is analyzed and processed with the HBase distributed database and Spark in-memory processing to remove its heterogeneity, complexity and interdependence. Referring to Fig. 2, the database stores the historical records of real-time bus operation, and these historical records (i.e. the historical data) are collected by GPS. Because the raw data is recorded by GPS units mounted on the buses, reception precision and delay are a problem: during actual operation, the directly received data may be affected by GPS positioning accuracy and by the network, so some records do not conform to the standard format, are obviously wrong, or are duplicated. For this reason, the method first uses SparkSQL to read the original CSV file from HDFS into data in Spark DataFrame format, extracts redundant and abnormal records by matching on time series and vehicle number, and deletes redundant columns; it then sorts the data by time and vehicle number and stores the cleaned data in the database through the HBase Phoenix interface. Next, the historical GPS track data of the specified bus in HBase is matched against bus station distances using Spark resilient distributed datasets. The matching process comprises the following steps:
First, a specific value is set for judging whether the matched distance falls within range of the set arrival location of the specified bus; if the matching result is less than the specific value, the bus arrival location corresponding to the match is marked. The distance between the bus arrival location and the actually located position is calculated by the great-circle distance formula:
$$d = R \arccos\left(\sin A_w \sin B_w + \cos A_w \cos B_w \cos(A_j - B_j)\right)$$
where $R$ is the radius of the Earth; $A_j$, $A_w$ are the longitude and latitude of the actually located position; and $B_j$, $B_w$ are the longitude and latitude of the bus arrival location.
Next, in chronological order, the positioning points of any two matched GPS track records whose time interval is greater than t seconds are taken, and the up or down running direction of the bus is judged from the slope $K$ of the line connecting the two points. The slope is calculated from the following coordinates (the formula itself did not survive in this text): $D_{lon}$, $D_{lat}$ are the longitude and latitude of the route's up-direction terminus; $S_{lon}$, $S_{lat}$ are the longitude and latitude of the route's up-direction origin station; $A_{lon}$, $A_{lat}$ are the longitude and latitude of the later station on the vehicle's driving trajectory; and $B_{lon}$, $B_{lat}$ are the longitude and latitude of the earlier station. If the result satisfies $K > 0$, the vehicle is travelling in the same direction as the up direction, i.e. it is on an up run; otherwise it is on a down run.
Then, the positioning time in the match that is nearest to the station is selected, and the arrival time is recorded based on the running speed and acceleration of the bus, using the uniform-acceleration relation
$$s = \frac{v_0 + v_t}{2}\, t$$
where $s$ is the distance from the most recent positioning point to the station, $v_0$ is the running speed of the bus at that positioning point, $v_t$ is the speed on arrival, which defaults to 0, and $t$ is the time taken from the most recent positioning point to the bus station.
Finally, the raw data is sorted by arrival time and by the vehicle number of the bus, and is output and stored in HDFS with Spark in-memory processing. Meanwhile, for each station name and position in the station information table, the coordinate position corresponding to the GPS track data is looked up on a map, and the station spacing is analyzed along the operating route to form the station information table.
In a specific embodiment, if the positioning points of several historical GPS track records all match the distance to a bus station, they are screened with "nearest" and "earliest" as the basic conditions to select the best matching result. The format of the raw data used by the invention is shown in Table one; the bus arrival timetable is shown in Table two; the bus station information table is shown in Table three.
Table one
Table two
Table three
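The "nearest, earliest" screening of multiple matches can be sketched as a single ordered comparison. The record layout (`distance_m`, `time`) is a hypothetical one, and ranking distance ahead of time is an assumption; the text names both conditions without stating their priority.

```python
from datetime import datetime


def best_match(candidates):
    """Pick the fix closest to the station; ties are broken by the earliest time."""
    return min(candidates, key=lambda c: (c["distance_m"], c["time"]))


# Hypothetical GPS fixes that all fall within the matching threshold of one station.
candidates = [
    {"distance_m": 12.0, "time": datetime(2019, 3, 1, 8, 0, 10)},
    {"distance_m": 5.0, "time": datetime(2019, 3, 1, 8, 0, 20)},
    {"distance_m": 5.0, "time": datetime(2019, 3, 1, 8, 0, 15)},
]
chosen = best_match(candidates)
print(chosen["time"])  # 2019-03-01 08:00:15
```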
Step 2: The processed raw data is analyzed with feature correlation analysis, from both single-attribute and multi-factor perspectives, to obtain standard time-series data. In a specific embodiment, from the single-attribute perspective, the running time between stations on each trip necessarily affects the arrival time at the next station, and in actual driving different vehicles also show certain patterns of variation because their drivers differ. From the multi-factor perspective, considering the links between factors, operating characteristics such as station spacing, departure time and whether the day is a working day necessarily affect the operating efficiency of the whole route and therefore change the arrival time. Taking into account the time-series relationship already present in the data, the data is processed into a standard time-series form by feature correlation analysis. In the present invention, the method analyzes, along the two dimensions of time and space, how different time periods and different weather conditions affect bus arrival time; see Table four for details.
Table four
Step 3: Variable selection is performed on the standard time-series data with the Lasso method, and the weakly correlated feature vectors in the standard time-series data are rejected. In practice, predicting bus arrival is a regression problem. If the regression analysis has too many predictor vectors, subset selection becomes computationally impractical; moreover, subset selection is inherently discontinuous, which makes the selected subset highly variable. The present invention therefore uses the Lasso method for variable selection to reject the weakly correlated feature vectors. The Lasso method is defined as:
$$\hat{\beta} = \arg\min_{\beta} \left\{ \sum_{i} \Big( y_i - \sum_{j} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j} |\beta_j| \right\}$$
where $x_{ij}$ is the $j$-th variable of the $i$-th group, the vector $\beta$ holds the regression coefficients, $y$ denotes the training labels, and $\lambda \ge 0$ is the weight of the penalty term. Referring to Fig. 3, the detailed process of rejecting weakly correlated feature vectors with the Lasso method is as follows:
First, variable selection is carried out with the Lasso method in a chosen programming language to obtain the coefficient value of each attribute; the resulting variable-selection coefficients are shown in Table five. The program listing used in this embodiment is not reproduced in this text.
Table five
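Since the embodiment's own listing is not available here, the following is only a minimal sketch of Lasso variable selection, assuming cyclic coordinate descent with soft-thresholding on centred data without an intercept. It minimizes $\frac{1}{2n}\sum_i (y_i - \sum_j x_{ij}\beta_j)^2 + \lambda\sum_j|\beta_j|$, which differs from the unscaled form only in how $\lambda$ is scaled. The function names, the toy data and the feature names used in the example are illustrative assumptions.

```python
def soft_threshold(rho, lam):
    """Shrink rho towards zero by lam; values inside [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Return Lasso coefficients for rows X and targets y (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            beta[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return beta


# Toy data: the target depends only on the first feature.
names = ["DISTANCE", "NOISE"]  # hypothetical feature names
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [name for name, b in zip(names, beta) if b != 0.0]
print(beta, selected)
```

The penalty drives the coefficient of the uninformative feature to exactly zero, which is what allows weakly correlated features to be rejected by inspecting the coefficients.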
Then, according to the correlation coefficients, the specified attributes are selected as the input and output of the prediction model. Preferably, this embodiment selects the attributes BUSNO, STOP, WEEKDAY, DISTANCE, STARTTIME and WEATHER as the model inputs and the arrival time (STOPTIME) as the model output. This is only a preferred embodiment of the method; in other embodiments the selection can be made according to the actual situation, and the invention is not limited in this respect.
In practice, after the regression-analysis preprocessing, future growth in data volume may introduce inconsistent dimensions (units), so the data must be standardized, converting dimensional expressions into dimensionless ones. To this end, the invention uses class labels for categorical data; for example, 10 vehicle numbers are represented as 0 to 9. Zero-mean standardization is defined as
$$x^{*} = \frac{x - \mu}{\sigma}$$
where $x$ is the original value, $x^{*}$ is the new value, $\mu$ is the sample mean and $\sigma$ is the sample standard deviation. For the other data, min-max (deviation) standardization is used, defined as
$$y = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $y$ is the standardized value, $x$ is the original feature value, and $x_{\min}$, $x_{\max}$ are the minimum and maximum of that feature. Normalization thus turns dimensional data into dimensionless values, which effectively reduces the time needed to find the optimal solution, improves the convergence speed and prediction accuracy of the model, lets every feature contribute equally to the result, and resolves the problem of new data having different dimensions; this improves the prediction efficiency and prediction accuracy of the method.
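The three preprocessing operations described above (class labels, zero-mean standardization and min-max standardization) can be sketched with the standard library alone. The function names and sample values are assumptions, and the population standard deviation is used for $\sigma$, which is one possible reading of "sample standard deviation".

```python
from statistics import mean, pstdev


def label_encode(values):
    """Map each distinct category to 0, 1, 2, ... in order of first appearance."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values]


def z_score(values):
    """Zero-mean standardization: (x - mu) / sigma."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


def min_max(values):
    """Min-max standardization onto [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


print(label_encode(["bus-A", "bus-B", "bus-A"]))  # [0, 1, 0]
print(min_max([10, 20, 30]))                      # [0.0, 0.5, 1.0]
print(z_score([2, 4, 4, 4, 5, 5, 7, 9]))
```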
Step 4: The bus arrival prediction model is built on a GRU neural network, and the standard time-series data, with the weakly correlated feature vectors rejected, is input into the arrival prediction model to predict the bus arrival time. Referring to Fig. 4, the GRU neural network has two gates, a reset gate and an update gate, and it does not control or retain an internal memory $C_t$. The GRU neural network works as follows. First, at time step $t$, the update gate is computed by $z_t = \sigma(W^{(z)} x_t + U^{(z)} h_{t-1})$, where $x_t$ is the $t$-th component of the input sequence $x$, which is linearly transformed by multiplication with the weight matrix $W^{(z)}$, and $h_{t-1}$ holds the information of the previous time step, which is linearly transformed by the weight matrix $U^{(z)}$. The update gate adds these two parts and applies the Sigmoid activation function, compressing the activation result to between 0 and 1. The update gate determines how much historical information is passed on to the future, which reduces the risk of vanishing gradients. The reset gate determines how data is forgotten and is expressed by $r_t = \sigma(W^{(r)} x_t + U^{(r)} h_{t-1})$; as with the update gate, the component of the input sequence and the information saved from the previous step are linearly transformed and then passed through the Sigmoid activation function.
Then the reset gate is applied: the new memory content uses the reset gate to store the relevant information from past steps, and is computed by $h'_t = \tanh(W x_t + r_t \odot U h_{t-1})$, where the input $x_t$ and the previous information $h_{t-1}$ are first linearly transformed, i.e. multiplied by the matrices $W$ and $U$ respectively. Since the reset gate is a vector with entries between 0 and 1, its values measure how far the gate is open; when the gate value for an element is 0, that element is forgotten by the network at this step. The Hadamard product of the reset gate $r_t$ and $U h_{t-1}$ therefore determines how much previous information is retained or forgotten. Finally, the two parts are added and passed into the hyperbolic tangent activation function tanh.
Finally, the final memory $h_t$ of the GRU neural network at the current time step is computed by $h_t = z_t \odot h_{t-1} + (1 - z_t) \odot h'_t$. $h_t$ retains the information of the current unit and passes it on to the next unit. The activation result $z_t$ of the update gate is used here to decide what to collect from the current memory content $h'_t$ and from the previous information $h_{t-1}$: the Hadamard product of $z_t$ and $h_{t-1}$ is the information from the previous time step that is retained in the final memory, and adding to it the information from the current memory that is retained in the final memory gives the output of the gated recurrent unit.
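The gate equations above can be traced with a minimal sketch of a single GRU time step in plain Python. It follows the update gate, reset gate, candidate memory and final memory formulas in that order, omits bias terms as the formulas in the text do, and uses hypothetical parameter names (`Wz`, `Uz`, `Wr`, `Ur`, `W`, `U`).

```python
import math


def sigmoid_vec(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def tanh_vec(v):
    return [math.tanh(x) for x in v]


def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def hadamard(a, b):
    return [x * y for x, y in zip(a, b)]


def gru_step(x_t, h_prev, p):
    """One GRU time step; p maps parameter names to weight matrices."""
    z = sigmoid_vec(add(matvec(p["Wz"], x_t), matvec(p["Uz"], h_prev)))  # update gate
    r = sigmoid_vec(add(matvec(p["Wr"], x_t), matvec(p["Ur"], h_prev)))  # reset gate
    # Candidate memory: the reset gate scales the transformed previous state.
    h_cand = tanh_vec(add(matvec(p["W"], x_t), hadamard(r, matvec(p["U"], h_prev))))
    one_minus_z = [1.0 - zi for zi in z]
    # Final memory: blend of the previous state and the candidate memory.
    return add(hadamard(z, h_prev), hadamard(one_minus_z, h_cand))


# With all-zero weights both gates equal 0.5 and the candidate is 0,
# so the new state is exactly half of the previous one.
zeros_in = [[0.0], [0.0]]            # hidden size 2, input size 1
zeros_hid = [[0.0, 0.0], [0.0, 0.0]]
params = {"Wz": zeros_in, "Uz": zeros_hid, "Wr": zeros_in,
          "Ur": zeros_hid, "W": zeros_in, "U": zeros_hid}
h_next = gru_step([0.3], [1.0, -2.0], params)
print(h_next)  # [0.5, -1.0]
```

A production model would use a tensor library and learned weights; the list-based version is only meant to make each term of the equations visible.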
In a specific embodiment, each layer of the GRU neural network has built-in update gates that protect and control its state, which realizes parameter sharing and recurrent memory. Specifically, a function implementing an exponentially decaying learning rate is added and the Adam gradient descent method is used. Adam computes exponential moving averages of the first-order and second-order momentum:
$$m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t, \qquad v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^{2}$$
where $m_t$ is the first-order momentum, $v_t$ is the second-order momentum, $g_t$ is the gradient of the objective function at step $t$, and $\beta_1$, $\beta_2$ are the decay rates. In the early iterations both momenta are biased towards their initial values, $m_0 = 0$ and $v_0 = 0$. They are therefore bias-corrected by
$$\hat{m}_t = \frac{m_t}{1 - \beta_1^{t}}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^{t}}$$
and the parameters are updated by
$$\theta_{t+1} = \theta_t - \frac{\alpha}{\sqrt{\hat{v}_t} + \epsilon}\, \hat{m}_t$$
where $\alpha$ is the learning rate and $\epsilon$ is a small constant that prevents division by zero. Compared with a prediction model built on LSTM, the arrival prediction model of the method, built on a GRU neural network, has a simpler overall structure; when successive gradient directions agree, learning is accelerated, and when they disagree, oscillation is suppressed. The cost module computes the loss between the predicted value and the true value; the next optimization step of the GRU neural network and the direction in which the gradient is optimized are determined from the loss obtained. The save module stores the model parameters and safeguards the model: after training, the model can be saved in full. This preserves the data continuously and also allows the saved model to be reused during the next prediction, which optimizes the steps of the whole prediction process and helps improve the prediction efficiency of the method.
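For a concrete view of the Adam update with bias correction, the sketch below applies the standard Adam rule to a single scalar parameter. The default values of `beta1`, `beta2` and `eps` are the commonly used ones and are assumptions here, as is the toy objective $(\theta - 3)^2$; the patent text does not state them.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step index."""
    m = beta1 * m + (1 - beta1) * grad        # first-order momentum
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-order momentum
    m_hat = m / (1 - beta1 ** t)              # bias correction for m_0 = 0
    v_hat = v / (1 - beta2 ** t)              # bias correction for v_0 = 0
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Toy objective f(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
first_theta, _, _ = adam_step(0.0, 2.0 * (0.0 - 3.0), 0.0, 0.0, t=1)

theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 101):
    grad = 2.0 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.05)
print(first_theta, theta)
```

After bias correction the very first step has a magnitude close to the learning rate regardless of the gradient's scale, which is why `first_theta` is approximately `0.001`.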
Referring to Fig. 5, in an embodiment of the present invention, the prediction model is built with the GRU neural network as follows:
First, the hyperparameters are chosen. Preferably, this embodiment initializes the weights from a normal distribution with a standard deviation of 0.1, initializes the biases to 0.1, and uses an initial learning rate of 0.001, a decay coefficient of 0.9, a decay speed of 1000, a training batch size (Batch_size) of 800, 30 training passes over all samples (Epoch), and a time step (Timesteps) of 30.
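The learning-rate settings above can be read as an exponential decay schedule. The sketch below assumes the common form $\text{lr} = \text{lr}_0 \cdot r^{\,\text{step}/N}$, with the "decay speed" of 1000 interpreted as the number of steps $N$ per decay; both that interpretation and the `staircase` option are assumptions, not statements from the patent text.

```python
def decayed_learning_rate(step, initial_lr=0.001, decay_rate=0.9,
                          decay_steps=1000, staircase=False):
    """Exponentially decayed learning rate at a given global training step."""
    # With staircase=True the rate drops in discrete jumps every decay_steps steps.
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial_lr * decay_rate ** exponent


for step in (0, 500, 1000, 2000):
    print(step, decayed_learning_rate(step))
```

Under this reading the learning rate falls by 10% for every 1000 training steps.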
Then the model is trained. Specifically, this embodiment analyzes 14 days of historical arrival data for bus route No. 41 in the Nantong area. The first 10 days of data are taken as the training set of the arrival prediction model, with the quadratic loss function as the error function to be minimized in training, and the last 4 days of the 14 are used as test data to verify the training result. The quadratic loss can be expressed as
$$C = \frac{1}{2n} \sum_{x} \lVert y(x) - a \rVert^{2}$$
where $C$ is the value of the quadratic loss function, $x$ is the input value, $y(x)$ is the true arrival time, $a$ is the output obtained for input $x$, i.e. the predicted value, and $n$ is the total amount of data in one training pass. In practice, to prevent overfitting and to reduce the error further so that the model learns more thoroughly, an L2 regularization term is added to the loss function, where $\omega$ denotes the weights and $\lambda$ balances the relative importance of the quadratic loss and the weight term.
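A small worked example of the regularized loss follows. It assumes the L2 term takes the form $\frac{\lambda}{2n}\sum_{\omega}\omega^{2}$, a common convention that the text does not spell out; the function name and sample numbers are illustrative.

```python
def quadratic_loss(y_true, y_pred, weights=(), lam=0.0):
    """Quadratic loss C = sum((y - a)^2) / (2n), plus an optional L2 penalty."""
    n = len(y_true)
    data_term = sum((y - a) ** 2 for y, a in zip(y_true, y_pred)) / (2 * n)
    # Assumed form of the L2 term: lam / (2n) * sum of squared weights.
    l2_term = lam / (2 * n) * sum(w * w for w in weights)
    return data_term + l2_term


print(quadratic_loss([1.0, 2.0], [1.0, 4.0]))                               # 1.0
print(quadratic_loss([1.0, 2.0], [1.0, 4.0], weights=[1.0, 1.0], lam=2.0))  # 2.0
```

A larger `lam` penalizes large weights more strongly relative to the prediction error.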
The bus arrival prediction model built by the invention on a GRU neural network is compared with a conventional prediction model built on LSTM in terms of loss. Referring to Fig. 6, the loss of the method falls rapidly over the first four iterations and levels off after the fifth, showing that the arrival prediction model built by the method is adequately trained by that point. In other words, the invention can complete model training with few training iterations, which effectively improves the speed of the whole model and raises overall prediction efficiency.
Referring to Fig. 7, the bus arrival times predicted by the method of the invention are compared with the actual bus arrival times. Specifically, instead of the mean absolute percentage error (MAPE), the invention judges the result by the linear-regression goodness-of-fit index R-squared, defined as
$$R^{2} = 1 - \frac{\sum (y - y^{*})^{2}}{\sum (y - \bar{y})^{2}}$$
where $y$ is the actual arrival time, $y^{*}$ is the arrival time predicted by the arrival prediction model built on the GRU neural network, and $\bar{y}$ is the mean of the actual arrival times. The R-squared of the fit of the GRU-based arrival prediction model is computed with this formula for all trips over 3 days and then averaged, which gives a goodness of fit of 94.547%. Comparing the actual arrival times with the times predicted by the GRU-based arrival prediction model shows that the predictions of the method are close to the actual bus arrival times and that the error is small.
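The goodness-of-fit computation can be sketched as follows, using the standard definition of R-squared and averaging it over trips as the paragraph describes. The function names and the per-trip numbers are made up for illustration and are not data from the patent.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def mean_r_squared(trips):
    """Average R-squared over (actual, predicted) pairs, one pair per trip."""
    scores = [r_squared(actual, predicted) for actual, predicted in trips]
    return sum(scores) / len(scores)


# Hypothetical arrival times (minutes after departure) for two trips.
trips = [
    ([5.0, 11.0, 18.0, 26.0], [5.5, 10.5, 18.5, 25.5]),
    ([4.0, 9.0, 15.0, 22.0], [4.0, 9.0, 15.0, 22.0]),
]
print(mean_r_squared(trips))
```

`r_squared` is undefined when all actual values in a trip are identical, since the denominator is then zero; a trip with at least two distinct arrival times avoids this.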
The method of the invention is further compared with the prediction model built on LSTM in terms of goodness of fit and performance. As Table six shows, compared with the conventional prediction model built on LSTM, the GM11 algorithm and the SVM algorithm, the goodness of fit of the method is clearly improved, which indicates that the prediction accuracy of the method is higher than that of conventional bus arrival prediction. Referring to Figs. 8 and 9 together with Table seven, which compare ten training runs of the method with the conventional LSTM prediction model, the method takes less time than LSTM whatever epoch count and batch size are chosen. With 100 epochs and a batch size of 300, the average time taken by the LSTM network is 7.168% more than that of the GRU network; with 300 epochs and a batch size of 3000, the average time taken by the LSTM network is 14.1% higher than that of the GRU network. It follows that, as the data volume keeps growing, the arrival prediction model built on the GRU neural network saves more computing resources, reduces the time spent on model training and improves the operating efficiency of the model.
Table six
Table seven
In summary, the public transport arrival time prediction technique of the invention based on GRU neural network, passes through Spark first It handles to obtain standard time series categorical data to initial data process, realizes and arrive at a station the extractions of data to public transport;Then it utilizes Lasso method proposes that the weak feature vector of relevance realizes variables choice operation;Finally utilize the building public transport of GRU neural network Arrive at a station prediction model, realizes and operates to the specific time prediction that public transport is arrived at a station;Compared with prior art, GRU nerve net of the present invention Network has the operating process being screened and selected to data, by arriving at a station the screening and selection of data to public transport, side of the present invention Method can effectively promote the accuracy to the prediction of public transport arrival time.
The foregoing is merely a preferred embodiment of the present invention and does not limit the patent scope of the invention. Although the invention has been described in detail with reference to the foregoing embodiments, a person skilled in the art may still modify the technical solutions recorded in the foregoing specific embodiments, or make equivalent replacements of some of their technical features. Any equivalent structure made using the description and drawings of the present invention, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the present invention.

Claims (7)

1. A bus arrival time prediction method based on a GRU neural network, characterized in that the method comprises:
S1, exporting historical data from a database to a CSV file to obtain raw data, and analyzing and processing the raw data using the HBase distributed database and Spark in-memory processing technology to remove the heterogeneity, complexity and coefficient of the raw data;
S2, processing and analyzing the processed raw data from both single-attribute and multi-factor perspectives using a feature correlation analysis method, to obtain standard time-series data;
S3, performing variable selection on the standard time-series data using the Lasso method, rejecting the weakly correlated feature vectors in the standard time-series data;
S4, building a bus arrival prediction model based on a GRU neural network, and inputting the standard time-series data from which the weakly correlated feature vectors have been rejected into the arrival prediction model, to predict the time at which the bus arrives at a station.
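The recurrent unit named in step S4 can be illustrated with a single GRU time step. The following is a minimal pure-Python sketch of the standard GRU equations, not the patent's model: the parameter dictionary `p`, its key names and the function names are assumptions, and papers and libraries differ on whether the update gate `z` weights the old state or the candidate.

```python
import math


def _sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def _affine(W, x, U, h, b):
    """Row-wise W @ x + U @ h + b for plain Python lists."""
    return [
        sum(w * xi for w, xi in zip(W[k], x))
        + sum(u * hi for u, hi in zip(U[k], h))
        + b[k]
        for k in range(len(b))
    ]


def gru_step(x, h, p):
    """One GRU step: x is the input vector, h the previous hidden state.

    p holds hypothetical weights: Wz/Uz/bz (update gate),
    Wr/Ur/br (reset gate), Wh/Uh/bh (candidate state).
    """
    z = [_sigmoid(v) for v in _affine(p["Wz"], x, p["Uz"], h, p["bz"])]
    r = [_sigmoid(v) for v in _affine(p["Wr"], x, p["Ur"], h, p["br"])]
    # The reset gate scales the previous state before it enters the candidate.
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(v) for v in _affine(p["Wh"], x, p["Uh"], rh, p["bh"])]
    # Convention used here: z = 0 keeps the old state, z = 1 takes the candidate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
```

In a full model this step would be applied along the time axis of the selected time-series features, with a final regression layer producing the arrival time; those parts are outside this sketch.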
2. The bus arrival time prediction method based on a GRU neural network according to claim 1, characterized in that step S1 comprises:
S11, reading the CSV file from HDFS using SparkSQL to form structured Spark DataFrame data;
S12, extracting the historical GPS track data of a specified bus using SparkSQL, and matching the historical GPS track data against the bus station distance using the HBase distributed database.
3. The bus arrival time prediction method based on a GRU neural network according to claim 2, characterized in that matching the historical GPS track data against the bus station distance using the HBase distributed database comprises:
S121, setting a specific value used to judge whether the matched distance falls within the specified bus arrival location, and, if the matching result is less than the specific value, marking the bus arrival location corresponding to the match;
S122, taking from the matches, in chronological order, any two GPS positioning points whose time interval is greater than t seconds, and judging whether the bus is running in the uplink or downlink direction from the slope of the line connecting the two positioning points;
S123, selecting from the matches the positioning time nearest to the station, and recording the arrival time based on the running speed and acceleration of the bus;
S124, sorting the raw data by arrival time and by the vehicle number of the corresponding bus, and storing the output in HDFS using Spark.
4. The bus arrival time prediction method based on a GRU neural network according to claim 3, characterized in that the distance between the bus arrival location and the actual positioning location is calculated by the Great-Circle distance formula, the Great-Circle distance formula being:
where R is the Earth's radius; A_j and A_w are the longitude and latitude of the actual positioning location; and B_j and B_w are the longitude and latitude of the bus arrival location.
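The formula itself is not reproduced in this text. One standard way to compute a great-circle distance is the haversine form, sketched below together with the threshold test of step S121. The choice of haversine, the Earth radius value, the 50 m threshold and all function names are assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed value for R)


def great_circle_m(a_lon, a_lat, b_lon, b_lat, radius=EARTH_RADIUS_M):
    """Great-circle distance in metres between two (lon, lat) points in degrees.

    Uses the haversine form, which stays accurate for the short
    distances involved in matching a GPS fix to a bus station.
    """
    phi_a, phi_b = math.radians(a_lat), math.radians(b_lat)
    d_phi = phi_b - phi_a
    d_lam = math.radians(b_lon - a_lon)
    hav = (math.sin(d_phi / 2) ** 2
           + math.cos(phi_a) * math.cos(phi_b) * math.sin(d_lam / 2) ** 2)
    # Clamp against rounding so asin never sees a value above 1.
    return 2 * radius * math.asin(min(1.0, math.sqrt(hav)))


def is_at_station(fix, station, threshold_m=50.0):
    """Step S121 as a sketch: a fix matches when it lies within threshold_m.

    fix and station are (lon, lat) tuples; threshold_m is a hypothetical
    stand-in for the "specific value" in the claim.
    """
    return great_circle_m(*fix, *station) < threshold_m
```

For example, `is_at_station((120.0, 32.0), (120.0, 32.0))` is true, while a fix 0.01 degrees of longitude away at that latitude lies several hundred metres off and does not match.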
5. The bus arrival time prediction method based on a GRU neural network according to claim 3, characterized in that the slope is calculated by the formula:
where D_lon and D_lat are the longitude and latitude of the uplink terminal station of the route; S_lon and S_lat are the longitude and latitude of the uplink starting station; A_lon and A_lat are the longitude and latitude of the later position on the vehicle's driving track; and B_lon and B_lat are the longitude and latitude of the earlier position. If K > 0, the vehicle is travelling in the same direction as the uplink, i.e. it is running uplink; otherwise it is running downlink.
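The expression for K is not reproduced in this text, so the sketch below substitutes a different but related test: the sign of the dot product between the route vector (uplink start to uplink terminus) and the vehicle's displacement between two fixes. It is not the patent's slope formula. Longitude and latitude are treated as planar coordinates, which is only reasonable over short distances, and all names are hypothetical.

```python
def running_direction(route_start, route_end, earlier_fix, later_fix):
    """Classify a bus as running 'up' or 'down' its route.

    All arguments are (lon, lat) tuples in degrees. The route vector
    points from the uplink starting station to the uplink terminus;
    the motion vector points from the earlier fix to the later fix.
    """
    route_dx = route_end[0] - route_start[0]
    route_dy = route_end[1] - route_start[1]
    move_dx = later_fix[0] - earlier_fix[0]
    move_dy = later_fix[1] - earlier_fix[1]
    dot = route_dx * move_dx + route_dy * move_dy
    if dot > 0:
        return "up"
    if dot < 0:
        return "down"
    return "unknown"  # stationary, or moving perpendicular to the route
```

A route that bends sharply would need this test applied per segment rather than against the single start-to-terminus vector.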
6. The bus arrival time prediction method based on a GRU neural network according to claim 3, characterized in that step S123 is calculated by the formula:
where s is the distance from the last positioning point to the station, v_0 is the running speed of the bus at the bus arrival location, v_t is the arrival speed, which defaults to 0, and t is the time taken from the last positioning point to the bus station.
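The formula is not reproduced in this text. The symbols s, v_0, v_t and t are consistent with motion under constant acceleration, for which s = (v_0 + v_t) / 2 · t; the sketch below assumes that relation and solves it for t. The function name, parameter names and units are assumptions.

```python
def time_to_station(distance_m, speed_mps, arrival_speed_mps=0.0):
    """Seconds from the last fix to the station, assuming constant acceleration.

    distance_m        -- s, distance from the last positioning point to the station
    speed_mps         -- v_0, bus speed at the last positioning point
    arrival_speed_mps -- v_t, speed on arrival (defaults to 0, as in the claim)
    """
    mean_speed = (speed_mps + arrival_speed_mps) / 2.0
    if mean_speed <= 0:
        raise ValueError("mean speed must be positive to reach the station")
    return distance_m / mean_speed
```

Adding the returned value to the timestamp of the last fix would give an estimated arrival time for the station.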
7. The bus arrival time prediction method based on a GRU neural network according to claim 1, characterized in that the Lasso method is defined by the formula:
where x_ij is the j-th variable of the i-th group, β is the row vector of regression coefficients, and y denotes the training label.
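The defining formula is not reproduced in this text. For reference, the standard textbook form of the Lasso estimator, written with the symbols named in the claim, is given below; the sample count N, the variable count p and the penalty weight λ are assumed symbols that the claim does not define, and the patent's own expression may differ in scaling.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}
\left\{
  \sum_{i=1}^{N} \Bigl( y_i - \sum_{j=1}^{p} x_{ij}\,\beta_j \Bigr)^{2}
  \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\right\}
```

The L1 penalty drives some coefficients β_j exactly to zero, and the variables with zero coefficients are the weakly correlated features that step S3 rejects.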
CN201910162263.5A 2019-03-05 2019-03-05 Bus arrival time prediction method based on GRU neural network Active CN109920248B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910162263.5A CN109920248B (en) 2019-03-05 2019-03-05 Bus arrival time prediction method based on GRU neural network

Publications (2)

Publication Number Publication Date
CN109920248A true CN109920248A (en) 2019-06-21
CN109920248B CN109920248B (en) 2021-09-17

Family

ID=66963255

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910162263.5A Active CN109920248B (en) 2019-03-05 2019-03-05 Bus arrival time prediction method based on GRU neural network

Country Status (1)

Country Link
CN (1) CN109920248B (en)

Patent Citations (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI417798B (en) * 2008-11-21 2013-12-01 Nat Taipei University of Technology High - speed reverse transfer neural network system with elastic structure and learning function
CN102081859A (en) * 2009-11-26 2011-06-01 上海遥薇实业有限公司 Control method of bus arrival time prediction model
CN102157075A (en) * 2011-03-15 2011-08-17 上海交通大学 Method for predicting bus arrivals
US20140309977A1 (en) * 2013-04-16 2014-10-16 International Business Machines Corporation Performing-time-series based predictions with projection thresholds using secondary time-series-based information stream
CN103310651A (en) * 2013-05-24 2013-09-18 北京市交通信息中心 Bus arrival prediction method based on real-time traffic status information
US20170220524A1 (en) * 2013-12-20 2017-08-03 Intel Corporation Processing device for performing convolution operations
CN103871246A (en) * 2014-02-10 2014-06-18 南京大学 Short-term traffic flow forecasting method based on road network space relation constraint Lasso
CN104715630A (en) * 2014-10-06 2015-06-17 中华电信股份有限公司 Arrival time prediction system and method
US20160180838A1 (en) * 2014-12-22 2016-06-23 Google Inc. User specified keyword spotting using long short term memory neural network feature extractor
CN105741554A (en) * 2016-04-29 2016-07-06 肖峰 Traffic flow congestion determining method and traffic flow congestion determining device based on mobile phone motion sensor data
CN106127344A (en) * 2016-06-28 2016-11-16 合肥酷睿网络科技有限公司 A kind of network bus arrival time Forecasting Methodology
CN106251642A (en) * 2016-09-18 2016-12-21 北京航空航天大学 A kind of public transport road based on real-time bus gps data chain speed calculation method
CN106570160A (en) * 2016-11-04 2017-04-19 北方工业大学 Mass spatio-temporal data cleaning method and mass spatio-temporal data cleaning device
CN106652534A (en) * 2016-12-14 2017-05-10 北京工业大学 Method for predicting arrival time of bus
CN106934752A (en) * 2017-03-07 2017-07-07 高剑 A kind of KXG based on bus
CN106710218A (en) * 2017-03-09 2017-05-24 北京公共交通控股(集团)有限公司 Method for predicting arrival time of bus
US10002322B1 (en) * 2017-04-06 2018-06-19 The Boston Consulting Group, Inc. Systems and methods for predicting transactions
CN106997669A (en) * 2017-05-31 2017-08-01 青岛大学 A kind of method of the judgement traffic congestion origin cause of formation of feature based importance
KR20170003769U (en) * 2017-10-14 2017-11-01 김재식 An Error Correction Model for Bus Arrival Information Prediction System Using Machine Learning
CN108154698A (en) * 2018-01-05 2018-06-12 上海元卓信息科技有限公司 A kind of public transport based on GPS track big data is to precise time computational methods leaving from station
CN108802776A (en) * 2018-07-02 2018-11-13 武汉蓝泰源信息技术有限公司 Public transport GPS method for correcting error based on abnormity point elimination and trace compression algorithm
CN109191845A (en) * 2018-09-28 2019-01-11 吉林大学 A kind of public transit vehicle arrival time prediction technique
CN109345832A (en) * 2018-11-13 2019-02-15 上海应用技术大学 A kind of urban road based on depth recurrent neural network is overtaken other vehicles prediction technique

Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
Title
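CHIEN S I J: "Dynamic bus arrival time prediction with artificial neural networks", 《JOURNAL OF TRANSPORTATION ENGINEERING》 *
GONG J, LIU M, ZHANG S: "Hybrid dynamic prediction model of bus arrival time based on weighted of historical and real-time GPS data", 《2013 IEEE 25TH CONTROL AND DECISION CONFERENCE (CCDC)》 *
K. Y. CHAN: "Neural-network-based models for short-term traffic flow forecasting using a hybrid exponential smoothing and Levenberg-Marquardt algorithm", 《IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS》 *
TINGTING YIN, GANG ZHONG: "A prediction model of bus arrival time at stops with multi-routes", 《WORLD CONFERENCE ON TRANSPORT RESEARCH》 *
LIU Yang (刘洋): "Research on time series prediction based on GRU neural network", 《CNKI Master's Theses Database》 *
LIU Jing (刘靖): "Bus arrival time prediction system based on Spark and particle filter algorithm", 《Journal of Computer Applications》 *
LI Xiao (李晓): "Using big data to improve traffic management capability", 《Technological Development of Enterprise》 *
JIANG Jie (江颉): "Distributed evenly partitioned Lasso algorithm for the Internet of Things", 《Journal of Zhejiang University of Technology》 *
FAN Guangpeng (范光鹏): "Bus arrival time prediction based on LSTM and Kalman filtering", 《Computer Applications and Software》 *
JIANG Shizheng (蒋士正): "Short-term traffic flow prediction for complex road networks based on a variable selection-neural network model", 《Journal of Shanghai Jiao Tong University》 *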

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110459056A (en) * 2019-08-26 2019-11-15 南通大学 A kind of public transport arrival time prediction technique based on LSTM neural network
CN111263326A (en) * 2020-01-09 2020-06-09 中国人民解放军国防科技大学 Vehicle position prediction method based on multiple fusion convolution GRU
CN111263326B (en) * 2020-01-09 2022-11-04 中国人民解放军国防科技大学 Vehicle position prediction method based on multiple fusion convolution GRU
CN111489064A (en) * 2020-03-27 2020-08-04 湖南大学 Df-PBS system-oriented public bicycle station dynamic planning method and system
CN111489064B (en) * 2020-03-27 2023-05-12 湖南大学 Df-PBS (direct-flow-coupled system) -oriented public bicycle station dynamic planning method and system
CN111726351A (en) * 2020-06-16 2020-09-29 桂林电子科技大学 Bagging-improved GRU parallel network flow abnormity detection method
CN112149919A (en) * 2020-10-15 2020-12-29 武汉译码当先科技有限公司 Bus operation line evaluation method, device, equipment and storage medium
CN112149919B (en) * 2020-10-15 2024-01-16 武汉市公用电子工程有限责任公司 Bus operation line evaluating method, device, equipment and storage medium
CN112907953A (en) * 2021-01-27 2021-06-04 吉林大学 Bus travel time prediction method based on sparse GPS data
CN112907953B (en) * 2021-01-27 2022-01-28 吉林大学 Bus travel time prediction method based on sparse GPS data
CN113672687A (en) * 2021-10-25 2021-11-19 北京值得买科技股份有限公司 E-commerce big data processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN109920248B (en) 2021-09-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
Inventor after: Shi Quan, Lu Juntian, Sun Ling, Cao Yang, Shen Qinqin, Zhu Senlai
Inventor before: Sun Ling, Lu Juntian, Shi Quan, Cao Yang, Shen Qinqin, Zhu Senlai
GR01 Patent grant