CN107563540A - A random-forest-based method for short-term prediction of bus boarding passenger flow
- Publication number
- CN107563540A (application number CN201710609933.4A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Traffic Control Systems (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides a random-forest-based method for short-term prediction of bus boarding passenger flow, comprising: obtaining passenger ride information and bus position information within a study area; inferring each passenger's boarding stop from the obtained ride information and bus position information; dividing the area into regional bus stops and the day into time windows; training a random forest classifier to establish a regression prediction model; and constructing a prediction sample and feeding it into the regression prediction model to obtain the predicted boarding passenger flow of a target regional bus stop in a target time window. By introducing the concept of a regional bus stop and using the random forest algorithm, the invention obtains high-accuracy predictions and has practical guiding significance.
Description
Technical field
The invention relates to the technical field of transportation, and in particular to a random-forest-based method for short-term prediction of bus boarding passenger flow.
Background art
Public transport plays a leading role in a city's overall transport capacity, yet most cities in China currently lack sufficient public transport capacity, particularly bus capacity during peak periods. Research on short-term prediction of boarding passenger flow at each bus stop is therefore especially important. Predicting the short-term boarding passenger flow at each bus stop gives the bus management system more reliable predicted boarding flows, allows bus capacity to be adjusted in time, and relieves crowding for bus passengers. However, current research concentrates mainly on the planning and design optimization of urban bus networks and on the optimization of bus management, and has the following problems:
1. Because little data is available to support these areas, the work often rests on qualitative analysis.
2. Short-term prediction of traffic flow has been studied fairly widely, but there are few studies on short-term prediction of bus boarding passenger flow.
3. Existing predictions are based on single bus stops; because the boarding passenger flow at a single stop fluctuates strongly, the prediction results are poor.
Summary of the invention
The invention provides a random-forest-based method for short-term prediction of bus boarding passenger flow that can solve the above problems of the prior art.
The method comprises:
Step S1: Obtain passenger ride information and bus position information within the study area.
Step S2: From the passenger ride information and bus position information obtained in step S1, infer each passenger's boarding stop.
Step S3: Divide the study area into regional bus stops and the day into time windows.
The study area is divided into grid cells of equal size and the cells are numbered; the bus stops contained in the same cell are aggregated into one regional bus stop; the daily study period is divided into time windows of equal length; and the boarding passenger flow of each regional bus stop in each time window is counted.
Step S4: Train a random forest classifier to establish a regression prediction model.
A target regional bus stop and a target time window are determined. The boarding passenger flows of the target regional bus stop in (d+1) time windows on each of the n days before the prediction day containing the target time window are used as training samples, and the training samples are fed into the random forest classifier for training to establish the regression prediction model.
Here, the boarding passenger flows of the (d+1) time windows of each day form one sample, and n and d are integers.
Step S5: Construct a prediction sample and feed it into the regression prediction model to obtain the predicted boarding passenger flow of the target regional bus stop in the target time window.
The boarding passenger flows of the target regional bus stop in the d time windows preceding the target time window on the same day are chosen as the prediction sample; the prediction sample is fed into the regression prediction model to obtain the predicted boarding passenger flow of the target regional bus stop in the target time window, where d is an integer.
In the prior art, because the boarding passenger flow of a single bus stop fluctuates strongly, boarding flow predictions based on single stops are poor and have no practical guiding value. The invention proposes the concept of a "regional bus stop" (step S3): the bus stops within a certain area are treated as one set, and the total boarding passenger flow of all bus stops in that area is counted and predicted as a whole. This better reflects the travel behavior of residents in the area and is more predictable. The grid size of a "regional bus stop" can be set flexibly according to the actual size of the whole study area and the locations and number of the bus stops it contains. In addition, because conventional bus passenger flow is sparser than subway passenger flow, the day is divided into multiple equal time windows to obtain better statistics and predictions; the boarding flow within each time window is counted and predicted instead of the flow at a single point in time, which has more practical guiding value.
Further, step S1 specifically comprises:
Step S1.1: Obtain the bus IC card swipe information of passengers in the study area from the on-board POS terminals of the buses; the swipe information includes the passenger's identification number, the boarding time, and the license plate number of the bus boarded.
Step S1.2: Obtain the trajectory position information of each bus during its operating period from the on-board positioning device; the trajectory position information includes the bus license plate number and, for each trajectory point, its time, longitude, and latitude.
Further, step S2 specifically comprises:
Step S2.1: Compare the bus position information obtained in step S1 with the actual bus line data, and find in the bus line data the location points that match the bus position information; the time associated with each matched location point is the specific time at which the bus arrived at the corresponding bus stop.
The bus line data include the line number, stop name, stop sequence number, stop longitude, and stop latitude.
Step S2.2: Compare the passenger ride information obtained in step S1 with the inferred times at which the bus arrived at each bus stop, and infer the passenger's boarding stop.
Specifically, the line number is determined from the bus license plate number. The longitude and latitude of each trajectory point are then compared with the stop longitudes and latitudes; where the two are close or identical, the time of that trajectory point is paired one-to-one with the stop name. This yields the specific time at which each bus arrived at each bus stop on its line.
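The matching of trajectory points to stops could be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the 50 m proximity threshold, and the assumption that the track covers a single trip (so each stop is passed once) are all hypothetical, since the text only says the coordinates must be "close or identical".

```python
import math

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (longitude, latitude) points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def stop_arrival_times(track, stops, max_dist_m=50.0):
    """track: list of (time_s, lon, lat) for one bus on one trip.
    stops: list of (stop_name, lon, lat) for the line that bus serves.
    Returns {stop_name: time of the closest track point within max_dist_m}.
    max_dist_m is an assumed threshold, not a value from the source."""
    arrivals = {}
    for name, s_lon, s_lat in stops:
        best = None  # (distance, time) of the closest qualifying track point
        for t, lon, lat in track:
            dist = haversine_m(lon, lat, s_lon, s_lat)
            if dist <= max_dist_m and (best is None or dist < best[0]):
                best = (dist, t)
        if best is not None:
            arrivals[name] = best[1]
    return arrivals
```

Stops with no track point inside the threshold are simply left out of the result, so a caller can see which stops could not be matched.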
The bus is identified from the license plate number in the passenger ride information. The passenger's boarding time is then compared with the times, obtained above, at which that bus arrived at each stop on its line, which gives the passenger's boarding stop and boarding time.
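One way to carry out this comparison is sketched below. The matching rule is an assumption: the text does not say how a swipe time is paired with an arrival time, so this sketch takes the most recent stop the bus reached at or before the swipe, and falls back to the first stop when the swipe precedes every recorded arrival. The function name is hypothetical.

```python
import bisect

def infer_boarding_stop(swipe_time, arrivals):
    """arrivals: list of (arrival_time_s, stop_name) for the bus the passenger
    rode, sorted by arrival time.
    Returns the stop with the latest arrival time not after swipe_time
    (assumed rule), or the first stop if the swipe precedes all arrivals."""
    times = [t for t, _ in arrivals]
    idx = bisect.bisect_right(times, swipe_time) - 1
    return arrivals[max(idx, 0)][1]
```

A nearest-in-time rule would be an equally plausible reading; the choice matters mainly for swipes recorded while the bus is between stops.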
Further, step S4 specifically comprises:
Step S4.1: Determine the target regional bus stop and the target time window, and obtain the boarding passenger flows of the target regional bus stop in (d+1) time windows on each of the n days before the prediction day containing the target time window, that is, the total boarding passenger flow of all bus stops in the grid cell corresponding to the target regional bus stop in those (d+1) time windows on each of the n days; n and d are integers.
Step S4.2: Take the boarding passenger flows of the target regional bus stop in the d time windows preceding the period of the target time window on each of the n days as the first input parameter x of the training samples, and the boarding passenger flow of the target regional bus stop in the same period as the target time window on each of the n days as the second input parameter y, and construct the training sample set $D=\{(x_1,y_1),(x_2,y_2),\dots,(x_i,y_i),\dots,(x_n,y_n)\}$, $x_i\in\mathbb{R}^d$, $y_i\in\mathbb{R}$. Each sample $(x_i,y_i)$ contains both input parameters and holds the boarding passenger flows of the (d+1) time windows of one day. Each $x_i$ has feature dimension d, that is, it holds d values: the boarding passenger flows of one day in the d time windows preceding the period of the target time window. The $x_i$ form the matrix $X=(x_1,x_2,\dots,x_i,\dots,x_n)^T$ with n rows and d columns, representing the boarding passenger flows in those d time windows on the n days. Each $y_i$ has feature dimension 1, that is, it holds a single value: the boarding passenger flow of one day in the same period as the target time window. The $y_i$ form the column matrix $Y=(y_1,y_2,\dots,y_n)^T$ with n rows and 1 column, representing the known boarding passenger flows in that period on the n days. Each $y_i$ corresponds to one $x_i$.
Step S4.3: Train the random forest classifier with the training sample set $D=\{(x_1,y_1),(x_2,y_2),\dots,(x_i,y_i),\dots,(x_n,y_n)\}$, $x_i\in\mathbb{R}^d$, $y_i\in\mathbb{R}$, to establish the regression prediction model.
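The construction of X and Y from per-window counts can be illustrated with a short sketch. The function name and the nested-list input layout are hypothetical; only the shapes (n rows of d features, n targets) come from the text.

```python
def build_training_set(counts_by_day, target_window, d):
    """counts_by_day: list of n per-day lists, where counts_by_day[i][w] is the
    boarding count of the target regional bus stop in window w on day i.
    Returns (X, Y): X has n rows of d features, Y has n target values."""
    if target_window < d:
        raise ValueError("need at least d windows before the target window")
    X, Y = [], []
    for day in counts_by_day:
        X.append(day[target_window - d:target_window])  # the d preceding windows
        Y.append(day[target_window])                    # the target window itself
    return X, Y
```

The prediction sample for the prediction day would be built the same way as one row of X, using that day's counts for the d preceding windows.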
Further, step S4.3 is specifically:
The training sample set $D=\{(x_1,y_1),(x_2,y_2),\dots,(x_i,y_i),\dots,(x_n,y_n)\}$ is used as the input to the random forest algorithm; the number t of CART classification and regression trees and the depth deep of each tree are set; each node uses f features; and the model is trained.
Here t, deep, and f are integers, and f takes the value d, the square root of d, or the base-2 logarithm of d.
Step S4.3.1: Sample with replacement from the training sample set D to form the bootstrap sample set of the j-th of the t CART classification and regression trees; for each tree in turn, starting from the root node, the nodes of the tree partition the bootstrap sample set corresponding to that tree.
Here j ranges from 1 to t.
Step S4.3.2: At each node of each tree, randomly select f features without replacement from the d features of the training sample set, and find among the f features the k-th feature with the best classification effect; use that k-th feature as the split feature and its corresponding feature value as the threshold, and split the current node if it does not meet the termination condition.
Samples in the current node whose k-th feature is below the threshold go to the left child node, and the remaining samples in the current node go to the right child node; k ranges from 1 to f.
Step S4.3.3: A current node that meets the termination condition becomes a leaf node, whose predicted output is the mean of the sample values contained in the current node.
The termination condition is that the number of samples contained in the current node is at its minimum and the information gain is at its minimum; the current node then stops splitting.
Step S4.3.4: Repeat steps S4.3.1 to S4.3.3 until every node has been trained or marked as a leaf node.
Step S4.3.5: Repeat steps S4.3.1 to S4.3.4 until all CART classification and regression trees have been trained.
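The two sources of randomness in these steps, bootstrap sampling per tree and feature subsampling per node, might look like the sketch below. The function names are hypothetical, and rounding sqrt(d) and log2(d) down to an integer (with a floor of 1) is an assumption, since the text only says f is an integer.

```python
import math
import random

def feature_subset_size(d, rule="all"):
    """Number of features f tried per node: d, sqrt(d) or log2(d).
    Truncating to an integer and clamping to at least 1 is an assumption."""
    if rule == "all":
        return d
    if rule == "sqrt":
        return max(1, int(math.sqrt(d)))
    if rule == "log2":
        return max(1, int(math.log2(d)))
    raise ValueError(f"unknown rule: {rule}")

def bootstrap_sample(X, Y, rng):
    """Draw len(X) (x, y) pairs with replacement to train one tree."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [Y[i] for i in idx]

def candidate_features(d, f, rng):
    """Pick f distinct feature indices out of d (sampling without replacement)."""
    return sorted(rng.sample(range(d), f))
```

Passing an explicit `random.Random(seed)` instance as `rng` keeps the forest reproducible across runs.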
Further, the prediction process of the regression prediction model in step S5 is specifically:
Select the j-th CART classification and regression tree and route the prediction sample through it starting from the root node: according to the split feature and threshold of each node, a prediction sample below the threshold goes to the left child node and otherwise to the right child node, until a leaf node of the tree is reached and its predicted value is output; j ranges from 1 to t.
Repeat this operation until all CART classification and regression trees have output a predicted value; the mean of the predicted values output by all trees is the output of the regression prediction model.
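The traversal and averaging can be written compactly. The nested-dictionary tree representation below is purely illustrative (the text does not prescribe a data structure); the routing rule, left when the feature value is below the threshold and right otherwise, follows the description.

```python
def predict_tree(node, x):
    """node is either a leaf {"value": v} or an internal node
    {"feature": k, "threshold": th, "left": ..., "right": ...}."""
    while "value" not in node:
        if x[node["feature"]] < node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["value"]

def predict_forest(trees, x):
    """Average the predictions of all trees for one prediction sample x."""
    outputs = [predict_tree(tree, x) for tree in trees]
    return sum(outputs) / len(outputs)
```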
Beneficial effects
The invention addresses the problem that overcrowded buses during peak periods leave waiting passengers unable to board, that is, insufficient urban bus capacity during peak periods. It proposes a random-forest-based method for short-term prediction of bus boarding passenger flow: by introducing the concept of a regional bus stop and using a machine learning algorithm to mine and learn from historical data, it predicts the boarding passenger flow of regional bus stops. The prediction accuracy is high and has practical guiding significance; the method provides the bus management system with more reliable predicted boarding flows, allows bus capacity to be adjusted in time, and improves the service level of public transport.
Brief description of the drawings
Fig. 1 is a flowchart of the random-forest-based method for short-term prediction of bus boarding passenger flow provided by an embodiment of the invention.
Figs. 2 and 3 show the results of using the method to predict short-term bus boarding passenger flow in Shenzhen. Fig. 2 shows the fit between the observed values for 18:00-19:00 and the observed values for 19:00-20:00 on October 30, 2014; Fig. 3 shows the fit between the predicted and observed values for all regions for 19:00-20:00 on October 30, 2014.
Detailed description of the embodiments
To make the random-forest-based method for short-term prediction of bus boarding passenger flow provided by the invention easier to understand, it is described in detail below with reference to a specific embodiment.
An embodiment of the invention provides a random-forest-based method for short-term prediction of bus boarding passenger flow, whose steps are shown in Fig. 1. The embodiment uses Shenzhen bus IC card swipe data and on-board bus GPS data from October 11 to October 31, 2014. The embodiment comprises the following steps:
Step S1: Obtain the Shenzhen bus IC card swipe data from October 11 to October 31, 2014; each swipe record contains information such as "passenger ID", "boarding time", and "bus ID". Obtain the Shenzhen on-board bus GPS data from October 11 to October 31, 2014; the data contain information such as "bus ID", "GPS point time", "GPS point longitude", and "GPS point latitude".
Step S2: From the above information, infer each passenger's boarding point and boarding time, that is, the bus stop at which each swipe record occurred. To do this, the GPS data are first fused with the bus line and stop data to infer the specific time at which each bus arrived at each bus stop; the fused data are then further fused with the passenger IC card swipe data to infer each passenger's boarding stop and boarding time. The bus line data contain "line number", "stop name", "stop sequence number", "stop longitude", and "stop latitude". The data are fused by retaining all fields of the different data sets. The result contains the boarding stop information of Shenzhen passengers for the 21 days from October 11 to October 31, 2014, comprising 14109 trip records of 293 passengers after screening.
Step S3: Divide the whole study area, Shenzhen (53914 bus stops in total), into grid cells of equal size (each cell is 1 km × 1 km) and number the cells; then aggregate the bus stops contained in each cell to form regional bus stops. The embodiment uses the swipe data generated in the daily period 6:00-22:00 and counts the bus boarding passenger flow in each cell in each time window of each day; the time window length is 1 h, giving 16 time windows per day.
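A sketch of this aggregation is given below, using the 1 km cell size and the 6:00-22:00 period from the embodiment. Several details are assumptions: the grid origin (`lon0`, `lat0`), the flat-earth conversion from degrees to kilometres, the function names, and labelling each window by its starting hour (which is one reading of the embodiment's TW16-TW19 naming for 16:00-20:00).

```python
import math
from collections import Counter

def grid_cell(lon, lat, lon0, lat0, cell_km=1.0):
    """(col, row) of the grid cell containing a point, using a flat-earth
    approximation around the assumed grid origin (lon0, lat0)."""
    km_per_deg_lat = 111.32
    km_per_deg_lon = 111.32 * math.cos(math.radians(lat0))
    col = int((lon - lon0) * km_per_deg_lon // cell_km)
    row = int((lat - lat0) * km_per_deg_lat // cell_km)
    return col, row

def time_window(seconds_since_midnight, start_hour=6, end_hour=22):
    """Starting hour of the 1 h window, or None outside the study period."""
    hour = seconds_since_midnight // 3600
    return hour if start_hour <= hour < end_hour else None

def count_boardings(boardings, lon0, lat0):
    """boardings: list of (stop_lon, stop_lat, seconds_since_midnight) for one day.
    Returns a Counter keyed by ((col, row), window)."""
    counts = Counter()
    for lon, lat, sec in boardings:
        window = time_window(sec)
        if window is not None:
            counts[(grid_cell(lon, lat, lon0, lat0), window)] += 1
    return counts
```

Keying the counter by cell rather than by stop is what turns individual stops into a "regional bus stop": every stop falling in the same cell contributes to the same count.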
Step S4: Determine the target regional bus stop (the cell to be predicted) and the target time window (the time window to be predicted), and count the boarding passenger flows of the target regional bus stop in (d+1) time windows on each of the n days before the prediction day containing the target time window.
Take the boarding passenger flows of the target regional bus stop in the d time windows preceding the period of the target time window on each of the n days as the first input parameter x of the training samples, and the boarding passenger flow of the target regional bus stop in the same period as the target time window on each of the n days as the second input parameter y, and construct the training sample set $D=\{(x_1,y_1),(x_2,y_2),\dots,(x_i,y_i),\dots,(x_n,y_n)\}$, $x_i\in\mathbb{R}^d$, $y_i\in\mathbb{R}$. Each sample $(x_i,y_i)$ contains both input parameters and holds the boarding passenger flows of the (d+1) time windows of one day. Each $x_i$ has feature dimension d, that is, it holds d values: the boarding passenger flows of one day in the d time windows preceding the period of the target time window. The $x_i$ form the matrix $X=(x_1,x_2,\dots,x_i,\dots,x_n)^T$ with n rows and d columns, representing the boarding passenger flows in those d time windows on the n days. Each $y_i$ has feature dimension 1, that is, it holds a single value: the boarding passenger flow of one day in the same period as the target time window. The $y_i$ form the column matrix $Y=(y_1,y_2,\dots,y_n)^T$ with n rows and 1 column, representing the known boarding passenger flows in that period on the n days. Each $y_i$ corresponds to one $x_i$.
Train the random forest classifier with the training sample set $D=\{(x_1,y_1),(x_2,y_2),\dots,(x_i,y_i),\dots,(x_n,y_n)\}$, $x_i\in\mathbb{R}^d$, $y_i\in\mathbb{R}$, to establish the regression prediction model.
Step S5: Obtain the boarding passenger flows of the target regional bus stop in the d time windows preceding the target time window on the day of the target time window, and use them to construct the prediction sample $x^*$, $x^*\in\mathbb{R}^d$. The prediction sample $x^*$ has the same feature dimension d as each sample of the first input parameter x, that is, the prediction sample and the first input parameter contain boarding passenger flows for the same number of time windows. Feed $x^*$ into the regression prediction model to obtain the predicted boarding passenger flow of the target regional bus stop in the target time window.
Specifically, in this embodiment, the boarding passenger flow of the target cell in the period 16:00-20:00 (time windows TW16-TW19) from October 27 to October 30, 2014 is selected for analysis; the data of the 27th to the 29th serve as the training set and the data of the 30th as the prediction set. In the corresponding training feature set D, d=3, covering the three time windows TW16-TW18, and n=3, covering the three days from the 27th to the 29th. The prediction set $x^*$ holds the boarding passenger flow of the target regional bus stop for 16:00-19:00 (TW16-TW18) on the 30th. The machine learning algorithm used in the embodiment is random forest; its implementation comprises a training process and a prediction process, as follows:
Training process:
1. The training set is the training feature set $D=\{(x_1,y_1),(x_2,y_2),\dots,(x_n,y_n)\}$, $x_i\in\mathbb{R}^d$, $y_i\in\mathbb{R}$; the test set is the prediction set $x^*\in\mathbb{R}^d$. Both the training feature set and the prediction set have d features. A total of t CART classification and regression trees are generated, each of depth deep, and each node uses f features; a node stops splitting when the number of samples it contains is at its minimum and the information gain is at its minimum. In this embodiment t is 10, deep is left at its initial value, which is empty, and f=d.
2. Sample with replacement from the training feature set D to form Train(j), the training set of the j-th CART classification and regression tree, where j=1,2,3,…,10; training starts from the root node.
3. If the current node meets the termination condition, it becomes a leaf node whose predicted output is the mean of the sample values of the sample set contained in the current node, and training continues with the other nodes. If the current node does not meet the termination condition, f features are randomly selected without replacement, in a certain proportion, from the d features (f is typically d, sqrt(d), or log2(d); this embodiment takes f=d), and the one feature with the best classification effect is found, that is, the one for which the variance Var of the current node's sample set minus the variance VarLeft of the left child node minus the variance VarRight of the right child node is largest. This feature is denoted the k-th feature (1 ≤ k ≤ f), and its corresponding feature value is the threshold Threshold. Samples in the current node whose k-th feature is below Threshold go to the left child node and the remaining samples go to the right child node; training then continues with the other nodes.
4. Repeat steps 2 and 3 until every node has been trained or marked as a leaf node.
5. Repeat steps 2, 3, and 4 until all CART classification and regression trees have been trained.
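The split search in step 3 of the training process can be sketched as below. It implements the criterion as literally stated, Var − VarLeft − VarRight, with every observed feature value tried as a threshold; many library implementations instead weight each child's variance by its share of the samples, so this is an illustration of the text rather than of any particular library. The function names are hypothetical.

```python
def variance(values):
    """Population variance of a list of numbers (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_split(X, Y, features):
    """Search the given feature indices for the split maximising
    Var - VarLeft - VarRight. Samples with x[k] < threshold go left.
    Returns (gain, feature, threshold), or None if no split separates
    the samples into two non-empty groups."""
    parent = variance(Y)
    best = None
    for k in features:
        for th in sorted({x[k] for x in X}):  # candidate thresholds
            left = [y for x, y in zip(X, Y) if x[k] < th]
            right = [y for x, y in zip(X, Y) if x[k] >= th]
            if not left or not right:
                continue
            gain = parent - variance(left) - variance(right)
            if best is None or gain > best[0]:
                best = (gain, k, th)
    return best
```

Returning `None` when no valid split exists gives the caller a natural point at which to turn the node into a leaf.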
Prediction process:
1. For the j-th CART tree, starting from the root node of the tree, route the sample according to the split feature k and threshold Threshold of the current node: a sample below Threshold goes to the left child node and otherwise to the right child node, until a leaf node is reached and its predicted value is output.
2. Repeat the previous step until all t CART trees have output a predicted value; the predicted value of the random forest model is the mean of the outputs of all CART trees.
Specifically, in this embodiment, the boarding passenger flow of all cells for 19:00-20:00 on October 30 is predicted and the predicted values are compared with the observed values; the results are shown in Figs. 2 and 3. Fig. 2 shows the fit between the observed values for 18:00-19:00 and the observed values for 19:00-20:00 on October 30, 2014; Fig. 3 shows the fit between the predicted and observed values for all regions for 19:00-20:00 on October 30, 2014. In the figures, R² is the coefficient of determination (goodness of fit): the larger it is, the more densely the points cluster around the regression line and the better the prediction. RMSE is the root-mean-square error: the smaller it is, the better the prediction. As the figures show, the method predicts very well.
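For reference, the two metrics can be computed as in the sketch below. The text does not say which definition of R² the figures use, so the common form 1 − SS_res/SS_tot is assumed here; the function names are illustrative.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(observed, predicted):
    """Coefficient of determination, assumed definition 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot
```

Note that `r_squared` is undefined when all observed values are equal, because SS_tot is then zero.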
In summary, the random-forest-based method for short-term prediction of bus boarding passenger flow provided by the invention addresses the problem that overcrowded buses during peak periods leave waiting passengers unable to board, that is, insufficient urban bus capacity during peak periods. By introducing the concept of a regional bus stop and using a machine learning algorithm to mine and learn from historical data, it predicts the boarding passenger flow of regional bus stops, provides the bus management system with more reliable predicted boarding flows, allows bus capacity to be adjusted in time, and improves the service level of public transport.
The above is only an embodiment of the invention and is not intended to limit it; any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (6)
- 1. A random-forest-based method for short-term prediction of bus boarding passenger flow, characterized by comprising:
  Step S1: obtaining passenger ride information and bus position information within a study area;
  Step S2: inferring each passenger's boarding stop from the passenger ride information and bus position information obtained in step S1;
  Step S3: dividing the study area into regional bus stops and the day into time windows; the study area is divided into grid cells of equal size and the cells are numbered, the bus stops contained in the same cell are aggregated into one regional bus stop, the daily study period is divided into time windows of equal length, and the boarding passenger flow of each regional bus stop in each time window is counted;
  Step S4: training a random forest classifier to establish a regression prediction model; a target regional bus stop and a target time window are determined, the boarding passenger flows of the target regional bus stop in (d+1) time windows on each of the n days before the prediction day containing the target time window are used as training samples, and the training samples are fed into the random forest classifier for training to establish the regression prediction model; the boarding passenger flows of the (d+1) time windows of each day form one sample, and n and d are integers;
  Step S5: constructing a prediction sample and feeding it into the regression prediction model to obtain the predicted boarding passenger flow of the target regional bus stop in the target time window; the boarding passenger flows of the target regional bus stop in the d time windows preceding the target time window on the same day are chosen as the prediction sample, the prediction sample is fed into the regression prediction model, and the predicted boarding passenger flow of the target regional bus stop in the target time window is obtained, where d is an integer.
- 2. The prediction method according to claim 1, characterized in that step S1 specifically comprises:
  Step S1.1: obtaining the bus IC card swipe information of passengers in the study area from the on-board POS terminals of the buses, the swipe information including the passenger's identification number, the boarding time, and the license plate number of the bus boarded;
  Step S1.2: obtaining the trajectory position information of each bus during its operating period from the on-board positioning device, the trajectory position information including the bus license plate number and, for each trajectory point, its time, longitude, and latitude.
- 3. The prediction method according to claim 2, characterized in that step S2 specifically comprises:
  Step S2.1: comparing the bus position information obtained in step S1 with the actual bus line data, and finding in the bus line data the location points that match the bus position information, the time associated with each matched location point being the specific time at which the bus arrived at the corresponding bus stop; the bus line data include the line number, stop name, stop sequence number, stop longitude, and stop latitude;
  Step S2.2: comparing the passenger ride information obtained in step S1 with the inferred times at which the bus arrived at each bus stop, and inferring the passenger's boarding stop.
- 4. The prediction method according to claim 3, characterized in that step S4 specifically comprises:
  Step S4.1: determining the target regional bus stop and the target time window, and obtaining the boarding passenger flows of the target regional bus stop in (d+1) time windows on each of the n days before the prediction day containing the target time window, n and d being integers;
  Step S4.2: taking the boarding passenger flows of the target regional bus stop in the d time windows preceding the period of the target time window on each of the n days as the first input parameter x of the training samples, and the boarding passenger flow of the target regional bus stop in the same period as the target time window on each of the n days as the second input parameter y, and constructing the training sample set $D=\{(x_1,y_1),(x_2,y_2),\dots,(x_i,y_i),\dots,(x_n,y_n)\}$, $x_i\in\mathbb{R}^d$, $y_i\in\mathbb{R}$, each sample $(x_i,y_i)$ containing both input parameters, where $x_i$ has feature dimension d and d is an integer;
  Step S4.3: training the random forest classifier with the training sample set $D=\{(x_1,y_1),(x_2,y_2),\dots,(x_i,y_i),\dots,(x_n,y_n)\}$, $x_i\in\mathbb{R}^d$, $y_i\in\mathbb{R}$, to establish the regression prediction model.
- 5. The prediction method according to claim 4, characterized in that step S4.3 is specifically: taking the training sample set $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_i, y_i), \ldots, (x_n, y_n)\}$ as the input of the random forest algorithm, setting the number t of CART classification and regression trees and the depth deep of each tree, using f features at each node, and training the model; wherein t, deep and f are integers, and f takes the value d, the square root of d, or the base-2 logarithm of d;
  Step S4.3.1: sampling with replacement from the training sample set D to form the bootstrap sample set of the j-th of the t CART classification and regression trees; for each tree, starting from the root node, the nodes of the tree successively split the bootstrap sample set corresponding to that tree; wherein j ranges from 1 to t;
  Step S4.3.2: at each node of each tree, randomly selecting, without replacement, f features from the d features of the training sample set, and finding among the f features the k-th feature with the best classification effect; taking the k-th feature as the splitting feature and the feature value corresponding to the k-th feature as the threshold, and splitting each current node that does not satisfy the termination condition; wherein the samples in the current node whose k-th feature is less than the threshold are assigned to the left node and the remaining samples in the current node are assigned to the right node, and k ranges from 1 to f;
  Step S4.3.3: setting each current node that satisfies the termination condition as a leaf node, the prediction output of the leaf node being the average of the sample values contained in the current node; wherein the termination condition is that the number of samples contained in the current node is at its minimum and the information gain is at its minimum, at which point the current node stops splitting;
  Step S4.3.4: repeating steps S4.3.1 to S4.3.3 until all nodes have been trained or marked as leaf nodes;
  Step S4.3.5: repeating steps S4.3.1 to S4.3.4 until all CART classification and regression trees have been trained.
- 6. The prediction method according to claim 5, characterized in that the prediction process of the regression prediction model in step S5 is specifically: selecting the j-th CART classification and regression tree and splitting the prediction sample starting from the root node of the current tree; according to the splitting feature and threshold of each node, a prediction sample whose feature value is less than the threshold is assigned to the left node and any other sample is assigned to the right node, until a leaf node of the current tree is reached and a predicted value is output; wherein j ranges from 1 to t; repeating the above operations until all CART classification and regression trees have output predicted values; the average of the predicted values output by all CART classification and regression trees is the output value of the regression prediction model.
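The training and prediction steps of claims 4 to 6 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. All function names and the toy boarding counts are hypothetical. The split criterion is assumed to be the sum of squared errors, since the claims refer to the "best classification effect" and information gain without giving a formula. The sketch takes f as the rounded square root of d, one of the three options named in claim 5, and uses a maximum depth and a minimum sample count as the termination condition.

```python
import math
import random
from statistics import mean


def build_samples(daily_counts, target_idx, d):
    """Step S4.2: x_i holds the d window counts before the target window
    on day i, y_i is the count in the target window on the same day."""
    X = [day[target_idx - d:target_idx] for day in daily_counts]
    y = [day[target_idx] for day in daily_counts]
    return X, y


def sse(values):
    """Sum of squared deviations from the mean (assumed split criterion)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def grow_tree(X, y, f, max_depth, min_samples, rng, depth=0):
    """Steps S4.3.2 and S4.3.3: a leaf is stored as the mean of its y values,
    an internal node as (feature index, threshold, left subtree, right subtree)."""
    if depth >= max_depth or len(y) <= min_samples or len(set(y)) == 1:
        return mean(y)
    best = None
    for k in rng.sample(range(len(X[0])), f):  # f features, without replacement
        # Skipping the smallest value keeps both sides of the split non-empty.
        for thr in sorted({row[k] for row in X})[1:]:
            left = [i for i, row in enumerate(X) if row[k] < thr]
            right = [i for i, row in enumerate(X) if row[k] >= thr]
            score = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or score < best[0]:
                best = (score, k, thr, left, right)
    if best is None:  # every candidate feature is constant at this node
        return mean(y)
    _, k, thr, left, right = best
    grow = lambda idx: grow_tree([X[i] for i in idx], [y[i] for i in idx],
                                 f, max_depth, min_samples, rng, depth + 1)
    return (k, thr, grow(left), grow(right))


def train_forest(X, y, t=20, max_depth=5, min_samples=2, seed=0):
    """Steps S4.3.1 and S4.3.5: one bootstrap sample set per tree, t trees."""
    rng = random.Random(seed)
    f = max(1, round(math.sqrt(len(X[0]))))  # assumed choice: f = sqrt(d)
    forest = []
    for _ in range(t):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # with replacement
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx],
                                f, max_depth, min_samples, rng))
    return forest


def predict_tree(node, x):
    """Claim 6: go left when the feature is below the threshold, else right."""
    while isinstance(node, tuple):
        k, thr, left, right = node
        node = left if x[k] < thr else right
    return node


def predict_forest(forest, x):
    """Claim 6: the model output is the average of all tree outputs."""
    return mean(predict_tree(tree, x) for tree in forest)


# Hypothetical boarding counts: 4 days, 4 time windows per day.
days = [[5, 8, 12, 20], [6, 9, 14, 22], [4, 7, 11, 18], [7, 10, 15, 25]]
X, y = build_samples(days, target_idx=3, d=3)
forest = train_forest(X, y, t=10)
forecast = predict_forest(forest, [6, 9, 13])
```

Because every leaf stores the mean of observed boarding counts and the forest averages the leaves, the forecast always lies between the smallest and largest target-window counts seen in training. In practice a library implementation such as a random forest regressor from a machine learning package would normally replace the hand-written trees.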
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710609933.4A CN107563540B (en) | 2017-07-25 | 2017-07-25 | Method for predicting short-time bus boarding passenger flow based on random forest |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710609933.4A CN107563540B (en) | 2017-07-25 | 2017-07-25 | Method for predicting short-time bus boarding passenger flow based on random forest |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107563540A (en) | 2018-01-09 |
CN107563540B (en) | 2021-03-30 |
Family
ID=60974256
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710609933.4A Active CN107563540B (en) | 2017-07-25 | 2017-07-25 | Method for predicting short-time bus boarding passenger flow based on random forest |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107563540B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108415885A (en) * | 2018-02-08 | 2018-08-17 | 武汉蓝泰源信息技术有限公司 | The real-time bus passenger flow prediction technique returned based on neighbour |
CN108877223A (en) * | 2018-07-13 | 2018-11-23 | 南京理工大学 | A kind of Short-time Traffic Flow Forecasting Methods based on temporal correlation |
CN109035770A (en) * | 2018-07-31 | 2018-12-18 | 上海世脉信息科技有限公司 | The real-time analyzing and predicting method of public transport passenger capacity under a kind of big data environment |
CN109711428A (en) * | 2018-11-20 | 2019-05-03 | 佛山科学技术学院 | A kind of saturated gas pipeline internal corrosion speed predicting method and device |
CN109741597A (en) * | 2018-12-11 | 2019-05-10 | 大连理工大学 | A kind of bus section runing time prediction technique based on improvement depth forest |
CN111105070A (en) * | 2019-11-20 | 2020-05-05 | 深圳市北斗智能科技有限公司 | Passenger flow early warning method and system |
CN112235362A (en) * | 2020-09-28 | 2021-01-15 | 北京百度网讯科技有限公司 | Position determination method, device, equipment and storage medium |
CN112949939A (en) * | 2021-03-30 | 2021-06-11 | 福州市电子信息集团有限公司 | Taxi passenger carrying hotspot prediction method based on random forest model |
CN113392880A (en) * | 2021-05-27 | 2021-09-14 | 扬州大学 | Traffic flow short-time prediction method based on deviation correction random forest |
WO2021189950A1 (en) * | 2020-10-29 | 2021-09-30 | 平安科技(深圳)有限公司 | Short-time bus station passenger flow prediction method and apparatus, and computer device and storage medium |
CN113570099A (en) * | 2020-04-28 | 2021-10-29 | 百度在线网络技术(北京)有限公司 | Departure interval prediction method, prediction model training method, device and equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102169606A (en) * | 2010-02-26 | 2011-08-31 | 同济大学 | Method for predicting influence of heavy passenger flow of urban rail transit network |
CN102436603A (en) * | 2011-08-29 | 2012-05-02 | 北京航空航天大学 | Rail transit full-road-network passenger flow prediction method based on probability tree destination (D) prediction |
CN105095993A (en) * | 2015-07-22 | 2015-11-25 | 济南市市政工程设计研究院(集团)有限责任公司 | System and method for predicting passenger flow volume of railway stations |
US20150356458A1 (en) * | 2014-06-10 | 2015-12-10 | Jose Oriol Lopez Berengueres | Method And System For Forecasting Future Events |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108415885A (en) * | 2018-02-08 | 2018-08-17 | 武汉蓝泰源信息技术有限公司 | The real-time bus passenger flow prediction technique returned based on neighbour |
CN108877223A (en) * | 2018-07-13 | 2018-11-23 | 南京理工大学 | A kind of Short-time Traffic Flow Forecasting Methods based on temporal correlation |
CN109035770B (en) * | 2018-07-31 | 2022-01-04 | 上海世脉信息科技有限公司 | Real-time analysis and prediction method for bus passenger capacity in big data environment |
CN109035770A (en) * | 2018-07-31 | 2018-12-18 | 上海世脉信息科技有限公司 | The real-time analyzing and predicting method of public transport passenger capacity under a kind of big data environment |
CN109711428A (en) * | 2018-11-20 | 2019-05-03 | 佛山科学技术学院 | A kind of saturated gas pipeline internal corrosion speed predicting method and device |
CN109741597B (en) * | 2018-12-11 | 2020-09-29 | 大连理工大学 | Bus section operation time prediction method based on improved deep forest |
CN109741597A (en) * | 2018-12-11 | 2019-05-10 | 大连理工大学 | A kind of bus section runing time prediction technique based on improvement depth forest |
CN111105070A (en) * | 2019-11-20 | 2020-05-05 | 深圳市北斗智能科技有限公司 | Passenger flow early warning method and system |
CN111105070B (en) * | 2019-11-20 | 2024-04-16 | 深圳市北斗智能科技有限公司 | Passenger flow early warning method and system |
CN113570099A (en) * | 2020-04-28 | 2021-10-29 | 百度在线网络技术(北京)有限公司 | Departure interval prediction method, prediction model training method, device and equipment |
CN112235362A (en) * | 2020-09-28 | 2021-01-15 | 北京百度网讯科技有限公司 | Position determination method, device, equipment and storage medium |
CN112235362B (en) * | 2020-09-28 | 2022-08-30 | 北京百度网讯科技有限公司 | Position determination method, device, equipment and storage medium |
WO2021189950A1 (en) * | 2020-10-29 | 2021-09-30 | 平安科技(深圳)有限公司 | Short-time bus station passenger flow prediction method and apparatus, and computer device and storage medium |
CN112949939A (en) * | 2021-03-30 | 2021-06-11 | 福州市电子信息集团有限公司 | Taxi passenger carrying hotspot prediction method based on random forest model |
CN113392880B (en) * | 2021-05-27 | 2021-11-23 | 扬州大学 | Traffic flow short-time prediction method based on deviation correction random forest |
CN113392880A (en) * | 2021-05-27 | 2021-09-14 | 扬州大学 | Traffic flow short-time prediction method based on deviation correction random forest |
Also Published As
Publication number | Publication date |
---|---|
CN107563540B (en) | 2021-03-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107563540A (en) | | A kind of public transport in short-term based on random forest is got on the bus the Forecasting Methodology of the volume of the flow of passengers |
CN108133302B (en) | | Public bicycle potential demand prediction method based on big data |
CN105788260B (en) | | A kind of bus passenger OD projectional techniques based on intelligent public transportation system data |
CN103177575B (en) | | System and method for dynamically optimizing online dispatching of urban taxies |
CN110264709A (en) | | The prediction technique of the magnitude of traffic flow of road based on figure convolutional network |
CN103984994B (en) | | Method for predicting urban rail transit passenger flow peak duration |
CN110836675B (en) | | Decision tree-based automatic driving search decision method |
CN105206048A (en) | | Urban resident traffic transfer mode discovery system and method based on urban traffic OD data |
CN103310287A (en) | | Rail transit passenger flow predicting method for predicting passenger travel probability and based on support vector machine (SVM) |
CN102324128A (en) | | Method for predicting OD (Origin-Destination) passenger flow among bus stations on basis of IC (Integrated Circuit)-card record and device |
CN109544690A (en) | | Shared bicycle trip influence factor recognition methods, system and storage medium |
CN106898142B (en) | | A kind of path forms time reliability degree calculation method considering section correlation |
CN107919014A (en) | | Taxi towards more carrying kilometres takes in efficiency optimization method |
CN109840272B (en) | | Method for predicting user demand of shared electric automobile station |
CN106327867B (en) | | Bus punctuation prediction method based on GPS data |
CN106777169A (en) | | A kind of user's trip hobby analysis method based on car networking data |
JP6307376B2 (en) | | Traffic analysis system, traffic analysis program, and traffic analysis method |
CN109800903A (en) | | A kind of profit route planning method based on taxi track data |
CN113642757A (en) | | Internet of things charging pile construction planning method and system based on artificial intelligence |
CN111008730B (en) | | Crowd concentration prediction model construction method and device based on urban space structure |
Tan et al. | | Statistical analysis and prediction of regional bus passenger flows |
CN115994787A (en) | | Car pooling demand prediction matching method based on neural network |
CN116090785A (en) | | Custom bus planning method for two stages of large-scale movable loose scene |
CN109886746A (en) | | A kind of trip purpose recognition methods based on passenger getting off car when and where |
CN114239929A (en) | | Taxi traffic demand characteristic prediction method based on random forest |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||