CN110490365A - A method for predicting online ride-hailing order volume based on multi-source data fusion
- Publication number
- CN110490365A, CN201910630258.2A
- Authority
- CN
- China
- Prior art keywords
- order
- data
- order volume
- scaling matrices
- timeslice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/251—Fusion techniques of input or preprocessed data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0633—Lists, e.g. purchase orders, compilation or processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
Abstract
The invention discloses a method for predicting online ride-hailing order volume based on multi-source data fusion. OD (origin-destination) order volume is predicted with a hierarchical prediction model based on a weighted average of proportion matrices (PMWA). The proportion matrix of a future time slice is predicted as a weighted average of proportion matrices, with weights determined by similarity measurement functions over features such as time and weather, so the algorithm can effectively fuse these multi-source data. Finally, the city-wide total order volume is allocated according to the corresponding values of the resulting proportion matrix to obtain the order volume of each OD pair. The invention uses the gradient boosting regression tree algorithm to predict the city-wide total order volume of the future time slice, predicts the proportion matrix of that time slice by weighted averaging of proportion matrices, and fuses the multi-source data through the PMWA algorithm to obtain the order volume of each OD pair. It effectively solves the "multi-line prediction" problem and achieves relatively high prediction accuracy.
Description
Technical field
The present invention relates to the field of intelligent transportation systems, and in particular to a method for predicting online ride-hailing order volume based on multi-source data fusion.
Background art
With the continuing development of the economy and of urbanization, urban travel demand has grown rapidly, passengers have more travel options to choose from, and online ride-hailing has become the preferred way to travel for many people. However, traffic congestion, unreasonable road planning, and inadequate infrastructure have made urban traffic problems increasingly severe and have led to the "difficulty hailing a ride" problem. Fortunately, with continuing progress in information technology, computing, automatic control, and artificial intelligence, intelligent transportation systems can effectively address these problems.
The method for predicting online ride-hailing order volume based on multi-source data fusion is part of an intelligent transportation system. Its purpose is to predict ride-hailing OD (Origin-Destination) order volume by analyzing multiple factors together and to allocate the city-wide orders reasonably, thereby reducing resource consumption and alleviating the "difficulty hailing a ride" problem. Existing ride-hailing demand forecasting fixes only the origin of the prediction target and not the destination. In practical applications, however, predicting the order volume of OD pairs with both origin and destination fixed is necessary. Such predictions serve as a reference for vehicle dispatching, alleviate the "difficulty hailing a ride" problem, and also help build intelligent transportation systems and track how people move. In addition, because each OD sequence follows a different pattern of change, common forecasting methods such as ARIMA and LSTM need different parameters or model structures for different OD sequences. They can therefore solve only the "single-line prediction" problem, in which the order volume of one OD pair is predicted from that pair's own sequence.
Summary of the invention
To overcome the above shortcomings of the prior art, the present invention provides a method for predicting online ride-hailing order volume based on multi-source data fusion. The invention proposes a hierarchical prediction model based on Proportion-Matrix-based Weighted Average (PMWA) to predict OD order volume. The model has two parts: prediction of the city-wide total order volume and prediction of the OD proportion matrix. The Gradient Boosting Regression Tree (GBRT) algorithm predicts the city-wide total order volume of the future time slice. The proportion matrix is the matrix formed by the share of each OD pair's order volume in the city-wide total order volume. The proportion matrix of the future time slice is predicted as a weighted average of proportion matrices, with weights determined by similarity measurement functions over features such as time and weather, so the algorithm can effectively fuse these multi-source data. Finally, the city-wide total order volume is allocated according to the corresponding values of the resulting proportion matrix to obtain the order volume of each OD pair. By fusing the various influencing factors, the method can effectively solve the "multi-line prediction" problem, that is, predicting the order volume of all OD pairs from multiple OD sequences.
To further describe the method, the symbols and parameters used in this document are explained as follows:
[symbol table not reproduced in the source text]
A method for predicting online ride-hailing order volume based on multi-source data fusion comprises the following steps:
S1. Collect and organize the historical order data set, the weather data set, and/or the traffic congestion data set by date.
The historical order data set is collected and organized as follows:
(1) Read the raw data files by date.
(2) Convert the hash values of the regions in the data read in step (1) into corresponding numeric IDs, and obtain the ID range.
(3) Extract the date and the time slice from the order-placement timestamp field of each record.
(4) Sort the data set in ascending order by date and time slice.
(5) Group all records by date, time slice, origin, and destination.
(6) Count the order records in each group; the count is the order volume of that OD pair in that time slice of that day.
(7) Check whether all data files have been read. If not, repeat steps (1) to (6) until every data file has been read, which yields the order volume of each OD pair in each time slice.
The raw data files include one or more of the DiDi, Shenzhou, and Cao Cao online ride-hailing order data sets.
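The order-counting steps (1) to (7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (origin hash, destination hash, timestamp string), the function names, and the 30-minute slice length are hypothetical choices, not details given in the text.

```python
from collections import Counter
from datetime import datetime

SLOT_MINUTES = 30  # assumed slice length, giving 48 slices per day


def time_slice(ts: datetime) -> int:
    """Index of the time slice within the day (0..47 for 30-minute slices)."""
    return (ts.hour * 60 + ts.minute) // SLOT_MINUTES


def count_od_orders(records):
    """Count orders per (date, slice, origin ID, destination ID).

    records: iterable of (origin_hash, dest_hash, 'YYYY-MM-DD HH:MM:SS').
    Region hashes are mapped to integer IDs in order of first appearance.
    """
    region_ids = {}
    counts = Counter()
    for origin, dest, stamp in records:
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        o = region_ids.setdefault(origin, len(region_ids) + 1)
        d = region_ids.setdefault(dest, len(region_ids) + 1)
        counts[(ts.date().isoformat(), time_slice(ts), o, d)] += 1
    # Sorting the keys gives ascending order by date and time slice.
    return dict(sorted(counts.items())), region_ids
```

Each key of the returned dictionary identifies one OD pair in one time slice of one day, and the value is its order volume.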
The weather data set in S1 is collected and organized as follows: because the weather data field takes discrete values, the mode of the weather data field within a time slice is taken as the weather data of that time slice.
In the data set, weather descriptions are divided into categories by number, for example 1 for sunny, 2 for cloudy, 3 for overcast, and 4 for rain. If these numbers were fed directly to a model during training, the model could mistakenly treat rain as carrying the highest weight and sunny weather the lowest, so this categorical data needs One-Hot Encoding.
For a discrete feature such as the weather description, the numeric value has no ordinal meaning, so One-Hot Encoding is used. One-Hot Encoding is also known as one-bit-effective encoding. The main idea is to encode n discrete values with n binary status bits. Each status bit is independent of the others, and each discrete value corresponds to exactly one status bit. Because the n discrete values are mutually exclusive, exactly one of the n status bits is 1 at any time.
The traffic congestion data set in S1 is collected and organized as follows: because traffic congestion data is continuous, the congestion values within a time slice are averaged to obtain the traffic congestion data of that time slice.
Not all features are discrete; continuous features also need processing. In practice, different features can differ greatly in magnitude. Training directly on the raw values can produce an unreasonable model, and if gradient descent is used to find the optimum, it also makes the descent take longer. The Min-Max Scaler maps all features in a vector into the interval [0, 1]; scaling features in this way speeds up parameter updates. The Min-Max Scaler formula is as follows:
$x' = \dfrac{x - x_{\min}}{x_{\max} - x_{\min}}$
where $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the feature.
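A minimal sketch of the two operations on continuous features, assuming the usual min-max definition x' = (x - min) / (max - min). The function names are hypothetical, and the handling of a constant feature (all zeros) is a choice made here, not something the text specifies.

```python
def slice_mean(values):
    """Average of the congestion readings that fall in one time slice."""
    return sum(values) / len(values)


def min_max_scale(values):
    """Map a feature vector into [0, 1] with min-max scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: no spread to scale, so return zeros (assumption).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


congestion_per_slice = [slice_mean([2, 2]), slice_mean([3, 5]), slice_mean([6, 6])]
scaled = min_max_scale(congestion_per_slice)
```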
S2. Predict the city-wide total order volume of the future time slice using the gradient boosting regression tree algorithm.
The gradient boosting regression tree algorithm in S2 proceeds as follows:
(1) Initialize $G_0(x) = \arg\min_c \sum_{t=1}^{N} L(y_t, c)$, that is, a tree with only a root node.
(2) Iteratively build M boosted trees, taking the training data set $D = \{(x_t, y_t)\}_{t=1}^{N}$, $x \in R^K$, $y \in R$, as input. For m = 1, 2, ..., M, compute the value of the negative gradient of the loss function at the current model, $r_{mt} = -\left[\partial L(y_t, G(x_t)) / \partial G(x_t)\right]_{G = G_{m-1}}$, where t = 1, 2, ..., N, $\partial$ is the partial-derivative symbol, and N is the number of training samples, and use it as the estimate of the residual.
(3) Fit a regression tree to $r_{mt}$ to obtain the leaf-node regions $R_{mj}$ of the m-th tree, j = 1, 2, ..., J, where J is the number of leaf nodes. Loop over each leaf node.
(4) Recompute the output value of each region, $c_{mj} = \arg\min_c \sum_{x_t \in R_{mj}} L(y_t, G_{m-1}(x_t) + c)$, so that the loss function is minimized, and update $G_m(x) = G_{m-1}(x) + \sum_{j=1}^{J} c_{mj} I(x \in R_{mj})$, where j = 1, 2, ..., J and $R_{mj}$ denotes the leaf-node regions of the m-th tree.
(5) $G_M(x)$ is the final model.
Each regression tree $g_i$ is computed as:
$g_i = \arg\min_g \sum_{t=1}^{N} L(y_t, G_{i-1}(x_t) + g(x_t))$
where L is the loss function, $\{(x_t, y_t)\}_{t=1}^{N}$ is the training set, $x_t$ is the set of features that influence the order volume in time slice t, $y_t$ is the city-wide total order volume of time slice t, and $G_{i-1}$ is the ensemble of the first i-1 trees:
$G_{i-1}(x) = g_1(x) + g_2(x) + \dots + g_{i-1}(x)$
The final model G(x) is the sum of all trees $g_i$:
$G(x) = g_1(x) + g_2(x) + \dots + g_M(x)$
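The boosting loop above can be illustrated with a deliberately small sketch. It assumes a squared-error loss (so the negative gradient is simply the residual and the best leaf value is the mean residual), a single numeric feature, and one-split trees ("stumps"). These simplifications and all names are assumptions made for illustration, not the configuration used in the patent.

```python
def fit_stump(xs, residuals):
    """Best one-split regression tree on a 1-D feature under squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    if best is None:  # all x identical: fall back to a constant leaf
        mean = sum(residuals) / len(residuals)
        return lambda x: mean
    _, threshold, lmean, rmean = best
    return lambda x: lmean if x <= threshold else rmean


def fit_gbrt(xs, ys, n_trees=10):
    """Gradient boosting with squared loss: each tree fits the residuals."""
    g0 = sum(ys) / len(ys)            # constant model minimising squared loss
    trees, preds = [], [g0] * len(ys)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + tree(x) for p, x in zip(preds, xs)]
    return lambda x: g0 + sum(t(x) for t in trees)


model = fit_gbrt([1, 2, 3, 4], [10.0, 10.0, 20.0, 20.0], n_trees=5)
```

In practice a library implementation with multi-feature trees and a shrinkage factor would normally be used; the sketch only shows how the residual-fitting iteration fits together.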
S3. Predict the proportion matrix of the future time slice by weighted averaging of proportion matrices.
The proportion matrices are extracted as follows:
(1) Take the historical order records $T_H$ as input.
(2) Process $T_H$ as required to obtain the city-wide total order volume and the OD order volume, respectively.
(3) Obtain the city-wide total order volume $E_t$ of time slice t and the order volume $O_{ij,t}$ of $OD_{ij}$ in time slice t, and compute the proportion $O_{ij,t} / E_t$, where i = 1, 2, ..., n and j = 1, 2, ..., n.
(4) Obtain the proportion matrix of time slice t.
(5) Return the set of proportion matrices.
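The extraction of one proportion matrix can be sketched as follows, assuming the OD order volumes of a time slice are held in a dictionary keyed by (origin ID, destination ID) with regions numbered 1..n. The function name and data layout are hypothetical.

```python
def proportion_matrix(od_counts, n):
    """Return (city-wide total, n x n matrix of OD shares) for one time slice.

    od_counts: {(i, j): orders} with region IDs in 1..n.
    """
    total = sum(od_counts.values())           # city-wide total order volume
    matrix = [[0.0] * n for _ in range(n)]
    if total == 0:
        return total, matrix                  # no orders: all shares stay zero
    for (i, j), orders in od_counts.items():
        matrix[i - 1][j - 1] = orders / total # share of OD pair (i, j)
    return total, matrix


total, shares = proportion_matrix({(1, 2): 3, (2, 1): 1}, n=2)
```

The entries of each matrix sum to 1 whenever the slice contains at least one order.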
In S3, the weights used in the weighted average of proportion matrices are determined by similarity measurement functions of time, weather, and/or traffic congestion.
For time similarity, the temporal feature similarity is defined as:
[formula not reproduced in the source text]
where $\Delta h(t_1, t_2) = \min\{dist(t_1, t_2),\ 48 - dist(t_1, t_2)\}$, with 48 being the number of 30-minute time slices in a day;
$dist(t_1, t_2)$ denotes the difference between time slices $t_1$ and $t_2$: $dist(t_1, t_2) = \mathrm{mod}(|t_1 - t_2|, 48)$;
and the difference caused by the distance in days between the time slices is given by:
[formula not reproduced in the source text]
For weather similarity, the weather feature similarity is defined as:
[formula not reproduced in the source text]
For temperature similarity, the temperature feature similarity is defined as:
[formula not reproduced in the source text]
For traffic similarity, the traffic congestion similarity is defined as:
[formula not reproduced in the source text]
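Of the similarity components, only the within-day time-slice distance is fully specified in the text, so the sketch below covers just that part: dist(t1, t2) as the absolute difference modulo 48, and Δh as the shorter way around the 48-slice day. The function names are hypothetical, and the similarity functions built on top of Δh are not shown because their formulas are not available.

```python
SLICES_PER_DAY = 48  # 30-minute time slices


def dist(t1: int, t2: int) -> int:
    """Difference between two time-slice indices, modulo one day."""
    return abs(t1 - t2) % SLICES_PER_DAY


def delta_h(t1: int, t2: int) -> int:
    """Circular within-day distance: late-night and early-morning slices are close."""
    d = dist(t1, t2)
    return min(d, SLICES_PER_DAY - d)
```

For example, slices 1 and 47 are only 2 slices apart under this measure, although their indices differ by 46.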
S4. Based on the OD order volume prediction algorithm PMWA, allocate the city-wide total order volume obtained in S2 according to the corresponding values of the proportion matrix obtained in S3, to obtain the order volume of each OD pair.
The OD order volume prediction algorithm PMWA proceeds as follows:
(1) Take as input the historical order records, the parameters $\theta_1, \theta_2, \theta_3, \theta_4$ of the feature similarity functions, and the learning step size $\alpha$.
(2) Learn the regression tree model $G_M(x)$ with GBRT.
(3) Use $G_M(x)$ to predict the city-wide total order volume of time slice t.
(4) Obtain the proportion matrices of all OD pairs with the proportion matrix extraction method.
(5) Use the sum of squares of the corresponding elements in the matrices as the loss function.
(6) Update the parameters by gradient descent to find the optimal solution.
(7) Compute the feature similarity functions between each reference time slice i and the target time slice t for all features.
(8) Obtain the proportion matrix of time slice t as the weighted average of proportion matrices.
(9) Obtain the OD order volume matrix of time slice t.
(10) Return the order volume of each OD pair.
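Steps (8) and (9) can be sketched as below. The similarity weights are passed in as plain numbers because the similarity formulas are not available here; the function names, the normalisation by the sum of weights, and the example values are all assumptions for illustration.

```python
def weighted_average_matrix(matrices, weights):
    """Element-wise weighted average of equally sized square matrices."""
    total_w = sum(weights)
    n = len(matrices[0])
    return [[sum(w * m[i][j] for w, m in zip(weights, matrices)) / total_w
             for j in range(n)]
            for i in range(n)]


def allocate(total_orders, proportions):
    """Split the predicted city-wide total across OD pairs by their shares."""
    return [[total_orders * p for p in row] for row in proportions]


# Two historical proportion matrices and hypothetical similarity weights.
p1 = [[0.0, 1.0], [0.0, 0.0]]
p2 = [[0.0, 0.5], [0.5, 0.0]]
predicted_shares = weighted_average_matrix([p1, p2], weights=[1.0, 3.0])
od_orders = allocate(80, predicted_shares)   # 80 = predicted city-wide total
```

Reference slices that are more similar to the target slice receive larger weights and therefore pull the predicted shares toward their own proportion matrix.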
The beneficial effects of the invention are as follows. The Gradient Boosting Regression Tree (GBRT) algorithm predicts the city-wide total order volume of the future time slice, and weighted averaging of proportion matrices predicts the proportion matrix of that time slice. The PMWA algorithm then fuses the multi-source data to obtain the order volume of each OD pair. The method makes full use of the regularity of the time-series data and of the internal relationships among OD pairs, effectively solves the "multi-line prediction" problem, and achieves relatively high prediction accuracy.
Brief description of the drawings
Fig. 1 is a flow chart of the algorithm of the invention;
Fig. 2 is a flow chart of order volume extraction in the invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings.
This embodiment implements the algorithm in the Python programming language.
The data set of this embodiment comes from the open competition data set of Di-Tech2016. It contains DiDi order data for Hangzhou from February 23, 2016 to March 17, 2016, 24 days in total, amounting to 7,291,683 order records. The first 17 days are used as the training set and the last 7 days as the test set. Because most ride-hailing demand is for ordinary rides, only the OD order volume of ordinary rides is predicted here, and carpooling is not considered. Furthermore, order volume is predicted only for OD pairs with a fixed origin and destination; only the order volume itself is of interest, not the specific driving route.
Embodiment
A method for predicting online ride-hailing order volume based on multi-source data fusion comprises the following steps:
S1. Collect and organize the historical order data set, the weather data set, and/or the traffic congestion data set by date.
The historical order data set is collected and organized as follows:
(1) Read the raw data files by date.
(2) Convert the hash values of the regions into corresponding numeric IDs, and obtain the ID range.
(3) Extract the date and the time slice from the order-placement timestamp field of each record.
(4) Sort the data set in ascending order by date and time slice.
(5) Group all records by date, time slice, origin, and destination.
(6) Count the order records in each group; the count is the order volume of that OD pair in that time slice of that day.
(7) Check whether all data files have been read. If not, repeat steps (1) to (6) until every data file has been read, which yields the order volume of each OD pair in each time slice.
S2. Predict the city-wide total order volume of the future time slice using the gradient boosting regression tree algorithm.
The gradient boosting regression tree algorithm in S2 proceeds as follows:
(1) Initialize $G_0(x) = \arg\min_c \sum_{t=1}^{N} L(y_t, c)$, that is, a tree with only a root node.
(2) Iteratively build M boosted trees, taking the training data set $D = \{(x_t, y_t)\}_{t=1}^{N}$, $x \in R^K$, $y \in R$, as input. For m = 1, 2, ..., M, compute the value of the negative gradient of the loss function at the current model, $r_{mt} = -\left[\partial L(y_t, G(x_t)) / \partial G(x_t)\right]_{G = G_{m-1}}$, where t = 1, 2, ..., N, $\partial$ is the partial-derivative symbol, and N is the number of training samples, and use it as the estimate of the residual.
(3) Fit a regression tree to $r_{mt}$ to obtain the leaf-node regions $R_{mj}$ of the m-th tree, j = 1, 2, ..., J, where J is the number of leaf nodes. Loop over each leaf node.
(4) Recompute the output value of each region, $c_{mj} = \arg\min_c \sum_{x_t \in R_{mj}} L(y_t, G_{m-1}(x_t) + c)$, so that the loss function is minimized, and update $G_m(x) = G_{m-1}(x) + \sum_{j=1}^{J} c_{mj} I(x \in R_{mj})$, where j = 1, 2, ..., J and $R_{mj}$ denotes the leaf-node regions of the m-th tree.
(5) $G_M(x)$ is the final model.
S3. Predict the proportion matrix of the future time slice by weighted averaging of proportion matrices.
The proportion matrices are extracted as follows:
(1) Take the historical order records $T_H$ as input.
(2) Process $T_H$ as required to obtain the city-wide total order volume and the OD order volume, respectively.
(3) Obtain the city-wide total order volume $E_t$ of time slice t and the order volume $O_{ij,t}$ of $OD_{ij}$ in time slice t, and compute the proportion $O_{ij,t} / E_t$, where i = 1, 2, ..., n and j = 1, 2, ..., n.
(4) Obtain the proportion matrix of time slice t.
(5) Return the set of proportion matrices.
S4. Based on the OD order volume prediction algorithm PMWA, allocate the city-wide total order volume obtained in S2 according to the corresponding values of the proportion matrix obtained in S3, to obtain the order volume of each OD pair.
The OD order volume prediction algorithm PMWA proceeds as follows:
(1) Take as input the historical order records, the parameters $\theta_1, \theta_2, \theta_3, \theta_4$ of the feature similarity functions, and the learning step size $\alpha$.
(2) Learn the regression tree model $G_M(x)$ with GBRT.
(3) Use $G_M(x)$ to predict the city-wide total order volume of time slice t.
(4) Obtain the proportion matrices of all OD pairs with the proportion matrix extraction method.
(5) Use the sum of squares of the corresponding elements in the matrices as the loss function.
(6) Update the parameters by gradient descent to find the optimal solution.
(7) Compute the feature similarity functions between each reference time slice i and the target time slice t for all features.
(8) Obtain the proportion matrix of time slice t as the weighted average of proportion matrices.
(9) Obtain the OD order volume matrix of time slice t.
(10) Return the order volume of each OD pair.
Comparative example 1
The historical average (HA) algorithm is used. For a given OD pair, historical data is selected according to whether the day is a working day or a weekend, and the average order volume of the corresponding time slice is used as the predicted value. For example, to predict the order volume of OD 8-9 from 8:30 am to 9:00 am on a Friday, the order volumes of OD 8-9 from 8:30 am to 9:00 am on all working days in the historical data are selected and averaged to give the predicted value.
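The historical average baseline can be sketched as below for a single OD pair. The data layout (a dictionary keyed by date and time-slice index), the function names, and the treatment of Saturday and Sunday as the weekend are assumptions made for illustration.

```python
from datetime import date


def is_weekend(day: date) -> bool:
    """Saturday and Sunday count as weekend (assumption)."""
    return day.weekday() >= 5


def historical_average(history, target_day, time_slice):
    """Mean order volume of the same slice on days of the same type.

    history: {(date, slice_index): orders} for one OD pair.
    """
    same_type = [orders for (day, s), orders in history.items()
                 if s == time_slice and is_weekend(day) == is_weekend(target_day)]
    return sum(same_type) / len(same_type) if same_type else 0.0


history = {
    (date(2016, 2, 23), 17): 10,   # Tuesday, 8:30-9:00
    (date(2016, 2, 24), 17): 20,   # Wednesday, 8:30-9:00
    (date(2016, 2, 27), 17): 100,  # Saturday, excluded for a weekday target
    (date(2016, 2, 24), 18): 7,    # different slice, excluded
}
prediction = historical_average(history, date(2016, 2, 26), 17)  # a Friday
```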
Comparative example 2
The GBRT algorithm is used. As in the prediction of the city-wide total order volume, features are constructed and the order volume of each OD pair is predicted directly with GBRT.
Comparative example 3
The STP-KNN algorithm is used. STP-KNN is a non-parametric, data-driven short-term flow prediction algorithm. The model defines the trend of a flow sequence. Given the flow sequence of the n time slices before the target time slice, it finds the K flow sequences most similar to that sequence, assigns each a weight according to its similarity rank, and then obtains the predicted value as a weighted linear combination of the last values of these K sequences.
Comparative example 4
The NMF-AR algorithm is used. NMF-AR is a recently proposed hybrid model for OD order volume prediction. The model uses non-negative matrix factorization (NMF) to decompose the OD matrix formed from historical OD order volumes into two matrices, called the basis matrix and the coefficient matrix of the OD matrix. An autoregressive (AR) model then predicts the coefficient matrix to obtain the coefficients of the target time slice. Finally, the OD order volume matrix is obtained from the basis matrix and the predicted coefficients.
MAE (mean absolute error), RMSE (root mean square error), ME (maximum error), RMSLE (root mean square logarithmic error), and R-Squared (coefficient of determination) are used to evaluate the performance of the model of the above embodiment and of comparative examples 1-4 on the test set. The results are shown in Table 1, "Global prediction error".
Table 1
Method | MAE | RMSE | ME | RMSLE | R-Squared |
HA | 3.8691 | 10.2694 | 538 | 0.4145 | 0.9174 |
GBRT | 3.8179 | 9.9931 | 429 | 0.3368 | 0.9383 |
STP-KNN | 3.4994 | 8.3019 | 273 | 0.389 | 0.9574 |
NMF-AR | 3.4849 | 8.2596 | 328 | 0.3727 | 0.9579 |
PMWA | 3.2488 | 7.6516 | 245 | 0.3348 | 0.9638 |
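For reference, the five metrics in Table 1 can be computed as in the sketch below. Two points are assumptions rather than statements from the text: ME is computed as the maximum absolute error (its values in Table 1 are far larger than the MAE, which a mean error could not be), and RMSLE uses log(1 + x) so that zero order volumes are handled. The function name is hypothetical.

```python
import math


def evaluate(y_true, y_pred):
    """Return MAE, RMSE, ME (max abs error), RMSLE and R-squared."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(ss_res / n),
        "ME": max(abs(e) for e in errors),
        "RMSLE": math.sqrt(sum((math.log1p(p) - math.log1p(t)) ** 2
                               for t, p in zip(y_true, y_pred)) / n),
        "R2": 1.0 - ss_res / ss_tot,
    }


scores = evaluate([2, 4, 6], [3, 4, 5])
```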
As the data in Table 1 show, the PMWA model of the embodiment has the smallest error and outperforms the four comparative models overall.
The HA algorithm of comparative example 1 is simple, but it neither makes full use of the patterns in the time-series data nor considers any influencing factors, so it performs worst. The GBRT model of comparative example 2 builds regression trees from different features, and it may be strongly affected by some features and only weakly by others. Although the model makes full use of external influencing factors, it does not sufficiently mine the features of the time-series data itself and does not consider the interactions between OD pairs, so predicting with GBRT does not give good results. The STP-KNN model of comparative example 3 makes full use of the patterns of change in the time-series data, but it likewise ignores the interactions between OD pairs and uses no external influencing factors. It improves on GBRT but still falls short of the PMWA algorithm of the embodiment. The NMF-AR model of comparative example 4 captures the relationships between OD pairs through an OD order volume matrix and mines the patterns of change in the time-series data with an autoregressive model, but it does not consider the influence of external factors on order volume. It achieves good results but still does not match the PMWA algorithm of the embodiment.
In summary, the PMWA model provided by the invention makes full use of the regularity of the time-series data and the internal relationships among OD pairs, and predicts OD order volume by fusing multi-source data. It is a solution to the "multi-line prediction" problem and also achieves relatively high prediction accuracy.
The basic principles, main features, and advantages of the invention have been shown and described above. Those skilled in the art should understand that the invention is not limited to the above embodiments. The embodiments and the description only illustrate the principles of the invention; various changes and improvements may be made without departing from its spirit and scope, and all such changes and improvements fall within the claimed scope of protection. The claimed scope of protection is defined by the appended claims and their equivalents.
Claims (8)
1. A method for predicting online ride-hailing order volume based on multi-source data fusion, characterized by comprising the following steps:
S1. collecting and organizing the historical order data set, the weather data set, and/or the traffic congestion data set by date;
S2. predicting the city-wide total order volume of the future time slice using the gradient boosting regression tree algorithm;
S3. predicting the proportion matrix of the future time slice by weighted averaging of proportion matrices;
S4. based on the OD order volume prediction algorithm PMWA, allocating the city-wide total order volume obtained in S2 according to the corresponding values of the proportion matrix obtained in S3, to obtain the order volume of each OD pair;
wherein the time slice interval is 10 to 30 minutes; preferably, the time slice interval is 30 minutes.
2. The method for predicting online ride-hailing order volume based on multi-source data fusion according to claim 1, characterized in that the historical order data set in S1 is collected and organized as follows:
(1) reading the raw data files by date to form a data set;
(2) converting the hash values of the regions in the data set of step (1) into corresponding numeric IDs, and obtaining the ID range;
(3) extracting the date and the time slice from the order-placement timestamp field of each record in the data set;
(4) sorting the data set in ascending order by date and time slice;
(5) grouping all records by date, time slice, origin, and destination;
(6) counting the order records in each group, the count being the order volume of that OD pair in that time slice of that day;
(7) checking whether all data files have been read and, if not, repeating steps (1) to (6) until every data file has been read, thereby obtaining the order volume of each OD pair in each time slice;
wherein the raw data files include one or more of the DiDi, Shenzhou, and Cao Cao online ride-hailing order data sets.
3. The method for predicting online car-hailing order volume based on multi-source data fusion according to claim 1, characterized in that collecting and organizing the weather data set in S1 comprises: taking the mode of the weather data field to obtain the weather data set of the time slice.
4. The method for predicting online car-hailing order volume based on multi-source data fusion according to claim 1, wherein collecting and organizing the traffic congestion data set in S1 comprises: averaging the traffic congestion data to obtain the traffic congestion data set of the time slice.
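The per-slice aggregation in claims 3 and 4 (mode for weather, mean for congestion) can be illustrated as follows. The input layout of `(time slice, value)` pairs, the sample values and the helper name `aggregate_by_slice` are assumptions for the sketch.

```python
from collections import defaultdict
from statistics import mean, mode


def aggregate_by_slice(readings, reducer):
    """Group (time_slice, value) readings by slice and reduce each group."""
    groups = defaultdict(list)
    for time_slice, value in readings:
        groups[time_slice].append(value)
    return {s: reducer(values) for s, values in sorted(groups.items())}


# Hypothetical readings: several observations per time slice.
weather = [(16, "rain"), (16, "rain"), (16, "cloudy"), (17, "cloudy")]
congestion = [(16, 2.0), (16, 3.0), (17, 4.0)]

weather_by_slice = aggregate_by_slice(weather, mode)        # claim 3: mode
congestion_by_slice = aggregate_by_slice(congestion, mean)  # claim 4: mean
```

The mode suits a categorical weather field, where an average has no meaning, while the mean suits a numeric congestion level.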
5. The method for predicting online car-hailing order volume based on multi-source data fusion according to claim 1, characterized in that the gradient boosting regression tree algorithm in S2 proceeds as follows:
(1) initialize G_0(x); at this point G_0(x) is a tree with only a root node;
(2) iteratively build M boosted trees, taking the training data set as input; compute the value r_mt of the negative gradient of the loss function at the current model, where t = 1, 2, ..., N, ∂ is the derivative symbol and N is the number of iterations, and use it as the estimate of the residual;
(3) fit r_mt to obtain the leaf node regions R_mj of the m-th tree, j = 1, 2, ..., J, where J is the number of leaf nodes; loop over each leaf node;
(4) estimate the value of each leaf node region by line search so as to minimize the loss function, and update the model, where j = 1, 2, ..., J and R_mj denotes a leaf node region of the m-th tree;
(5) G_M(x) is the final model.
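The five steps of claim 5 follow the usual gradient boosting scheme. The sketch below is a minimal textbook version under stated assumptions, not the patent's implementation: it uses a single numeric feature, squared loss, and one-split trees (J = 2), and the names `fit_stump` and `fit_gbrt` are hypothetical. With squared loss the negative gradient is the plain residual, and the line-search optimum in each leaf is the mean residual of that leaf. The input needs at least two distinct feature values.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (J = 2 leaves) to the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        c_left, c_right = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - c_left) ** 2 for r in left)
               + sum((r - c_right) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, c_left, c_right)
    return best[1:]


def fit_gbrt(xs, ys, n_trees=50):
    """Gradient boosting with squared loss L = (y - G(x))^2 / 2."""
    g0 = sum(ys) / len(ys)      # step (1): root-only tree, here the mean
    preds = [g0] * len(ys)
    stumps = []
    for _ in range(n_trees):    # step (2): build M trees
        # Negative gradient of the squared loss is the residual y - G(x).
        residuals = [y - p for y, p in zip(ys, preds)]
        # Steps (3)-(4): fit the leaf regions and their values, then update.
        threshold, c_left, c_right = fit_stump(xs, residuals)
        stumps.append((threshold, c_left, c_right))
        preds = [p + (c_left if x <= threshold else c_right)
                 for x, p in zip(xs, preds)]

    def predict(x):             # step (5): the final model G_M(x)
        return g0 + sum(cl if x <= th else cr for th, cl, cr in stumps)

    return predict


model = fit_gbrt([1, 2, 3, 4], [10, 10, 20, 20], n_trees=5)
```

In practice the inputs would be feature vectors built from time, weather and congestion data, and a library implementation with deeper trees would normally replace these hand-written stumps.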
6. The method for predicting online car-hailing order volume based on multi-source data fusion according to claim 1, characterized in that the scaling matrix extraction in S3 proceeds as follows:
(1) take the historical order records as input;
(2) derive from T_H, as required, the city-wide total order volumes and the OD order volumes respectively;
(3) obtain the city-wide total order volume E_t of time slice t and the order volume O_ij,t of OD_ij in time slice t, and compute the ratio, where i = 1, 2, ..., n and j = 1, 2, ..., n;
(4) obtain the scaling matrix of time slice t;
(5) return the set of scaling matrices.
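Steps (3) and (4) of claim 6 can be sketched as below, assuming the ratio is the OD order volume divided by the city-wide total, O_ij,t / E_t. That reading of "compute the ratio" is an inference, and the function name `scaling_matrix` is hypothetical.

```python
def scaling_matrix(od_orders):
    """Return (E_t, P_t) for one time slice, with P_t[i][j] = O_ij,t / E_t.

    `od_orders` is an n x n list of lists holding O_ij,t.
    """
    total = sum(sum(row) for row in od_orders)  # E_t, the city-wide total
    if total == 0:
        # No orders in this slice: the ratio is undefined, so return zeros.
        return 0, [[0.0] * len(row) for row in od_orders]
    return total, [[o / total for o in row] for row in od_orders]


total, matrix = scaling_matrix([[2, 1], [1, 0]])
```

Under this definition the entries of each non-empty scaling matrix sum to 1, so multiplying the matrix by a predicted total distributes that total without losing or adding orders.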
7. The method for predicting online car-hailing order volume based on multi-source data fusion according to claim 1 or 4, wherein the weights used in the weighted average of scaling matrices in S3 are determined by similarity measure functions of time, weather and/or traffic congestion.
8. The method for predicting online car-hailing order volume based on multi-source data fusion according to claim 1, characterized in that the OD order volume prediction algorithm PMWA in S4 proceeds as follows:
(1) take as input the historical order records, the parameters θ1, θ2, θ3, θ4 of the feature similarity measure function, and the learning step size α;
(2) learn the regression tree G_M(x) using the gradient boosting regression tree algorithm;
(3) use G_M(x) to predict the city-wide total order volume of time slice t;
(4) obtain the scaling matrices of all ODs using the scaling matrix extraction method;
(5) take the sum of squares over corresponding matrix elements as the loss function;
(6) update the parameters by gradient descent to find the optimal solution;
(7) compute the feature similarity function between reference time slice i and target time slice t over all features;
(8) obtain the scaling matrix of time slice t by weighted averaging of scaling matrices;
(9) obtain the OD order volume matrix of time slice t;
(10) return the order volume of each OD.
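Steps (8) to (10) of claim 8 can be sketched as follows. The similarity function and its parameters θ1 to θ4 are not spelled out here, so the sketch takes the similarity weights as given numbers; the function names and sample values are hypothetical.

```python
def weighted_average_matrix(matrices, weights):
    """Average equally shaped matrices using normalized weights (step 8)."""
    w_sum = sum(weights)
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(w * m[i][j] for m, w in zip(matrices, weights)) / w_sum
             for j in range(n_cols)] for i in range(n_rows)]


def allocate_orders(total_pred, matrix):
    """Split the predicted city-wide total across ODs (steps 9 and 10)."""
    return [[total_pred * p for p in row] for row in matrix]


# Two reference time slices; the first is assumed three times as similar
# to the target slice as the second.
reference = [
    [[0.5, 0.5], [0.0, 0.0]],
    [[0.0, 0.0], [0.5, 0.5]],
]
target_matrix = weighted_average_matrix(reference, weights=[3, 1])
od_orders = allocate_orders(80, target_matrix)
```

Normalizing by the weight sum keeps the averaged matrix summing to 1 whenever every reference matrix does, so the allocated OD volumes add up to the predicted total.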
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910630258.2A CN110490365B (en) | 2019-07-12 | 2019-07-12 | Method for predicting network car booking order quantity based on multi-source data fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110490365A (en) | 2019-11-22 |
CN110490365B CN110490365B (en) | 2022-04-05 |
Family ID=68546887
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910630258.2A Expired - Fee Related CN110490365B (en) | 2019-07-12 | 2019-07-12 | Method for predicting network car booking order quantity based on multi-source data fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110490365B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113379455A (en) * | 2021-06-10 | 2021-09-10 | 中国铁道科学研究院集团有限公司电子计算技术研究所 | Order quantity prediction method and apparatus |
CN113469739A (en) * | 2021-06-25 | 2021-10-01 | 广州宸祺出行科技有限公司 | Method and system for predicting taxi taking demand for network taxi appointment |
CN115394084A (en) * | 2022-08-29 | 2022-11-25 | 郑州轻工业大学 | NMF-BiLSTM-based urban road network short-term traffic flow prediction method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104599088A (en) * | 2015-02-13 | 2015-05-06 | 北京嘀嘀无限科技发展有限公司 | Dispatching method and dispatching system based on orders |
CN107103142A (en) * | 2017-07-11 | 2017-08-29 | 交通运输部公路科学研究所 | Deduction and simulation technology for the operation situation of a comprehensive transportation network covering highway and railway networks |
US20180018572A1 (en) * | 2016-07-12 | 2018-01-18 | Alibaba Group Holding Limited | Method, apparatus, device, and system for predicting future travel volumes of geographic regions based on historical transportation network data |
CN109117973A (en) * | 2017-06-26 | 2019-01-01 | 北京嘀嘀无限科技发展有限公司 | Online car-hailing order volume prediction method and device |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104599088A (en) * | 2015-02-13 | 2015-05-06 | 北京嘀嘀无限科技发展有限公司 | Dispatching method and dispatching system based on orders |
US20180018572A1 (en) * | 2016-07-12 | 2018-01-18 | Alibaba Group Holding Limited | Method, apparatus, device, and system for predicting future travel volumes of geographic regions based on historical transportation network data |
CN107633680A (en) * | 2016-07-12 | 2018-01-26 | 阿里巴巴集团控股有限公司 | Trip data acquisition method, apparatus, device and system |
CN109117973A (en) * | 2017-06-26 | 2019-01-01 | 北京嘀嘀无限科技发展有限公司 | Online car-hailing order volume prediction method and device |
CN107103142A (en) * | 2017-07-11 | 2017-08-29 | 交通运输部公路科学研究所 | Deduction and simulation technology for the operation situation of a comprehensive transportation network covering highway and railway networks |
Non-Patent Citations (1)
Title |
---|
Zhou Xin (周鑫): "Research on a GBRT-based traffic flow prediction algorithm" (基于GBRT的交通流量预测算法研究), Modern Computer (《现代计算机》) * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113379455A (en) * | 2021-06-10 | 2021-09-10 | 中国铁道科学研究院集团有限公司电子计算技术研究所 | Order quantity prediction method and apparatus |
CN113379455B (en) * | 2021-06-10 | 2024-02-09 | 中国铁道科学研究院集团有限公司电子计算技术研究所 | Order quantity prediction method and equipment |
CN113469739A (en) * | 2021-06-25 | 2021-10-01 | 广州宸祺出行科技有限公司 | Method and system for predicting taxi taking demand for network taxi appointment |
CN113469739B (en) * | 2021-06-25 | 2024-05-28 | 广州宸祺出行科技有限公司 | Prediction method and system for taxi taking demand of network taxi taking |
CN115394084A (en) * | 2022-08-29 | 2022-11-25 | 郑州轻工业大学 | NMF-BiLSTM-based urban road network short-term traffic flow prediction method |
CN115394084B (en) * | 2022-08-29 | 2023-07-25 | 郑州轻工业大学 | Urban road network short-time traffic flow prediction method based on NMF-BiLSTM |
Also Published As
Publication number | Publication date |
---|---|
CN110490365B (en) | 2022-04-05 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110570651B (en) | Road network traffic situation prediction method and system based on deep learning | |
CN111653088B (en) | Vehicle driving quantity prediction model construction method, prediction method and system | |
CN107610469B (en) | Day-dimension area traffic index prediction method considering multi-factor influence | |
CN107045788B (en) | Traffic road condition prediction method and device | |
Zhao et al. | Truck traffic speed prediction under non-recurrent congestion: Based on optimized deep learning algorithms and GPS data | |
CN113487066B (en) | Long-time-sequence freight volume prediction method based on multi-attribute enhanced graph convolution-Informer model | |
CN104834977B (en) | Traffic alert level prediction method based on distance metric learning | |
CN103177570B (en) | Method for predicting traffic jam indexes for rush hours in morning and evening | |
CN110458336B (en) | Online appointment vehicle supply and demand prediction method based on deep learning | |
CN110490365A (en) | Method for predicting online car-hailing order volume based on multi-source data fusion | |
CN109272157A (en) | Freeway traffic flow parameter prediction method and system based on a gated neural network | |
CN110503104B (en) | Short-time remaining parking space quantity prediction method based on convolutional neural network | |
CN108648445B (en) | Dynamic traffic situation prediction method based on traffic big data | |
CN111860989B (en) | LSTM neural network short-time traffic flow prediction method based on ant colony optimization | |
CN110164129B (en) | Single-intersection multi-lane traffic flow prediction method based on GERNN | |
CN115440032A (en) | Long-term and short-term public traffic flow prediction method | |
CN113051811B (en) | Multi-mode short-term traffic jam prediction method based on GRU network | |
CN112270355A (en) | Active safety prediction method based on big data technology and SAE-GRU | |
CN112037539B (en) | Method and system for recommending signal control scheme for saturated urban traffic network | |
Chen et al. | A multiscale-grid-based stacked bidirectional GRU neural network model for predicting traffic speeds of urban expressways | |
CN113821547B (en) | Rapid and efficient short-time prediction method, system and storage medium for occupancy of parking lot | |
CN116434538A (en) | Urban traffic flow prediction model construction method based on heterogeneous data fusion | |
CN115269758A (en) | Passenger-guidance-oriented road network passenger flow state deduction method and system | |
CN113537569B (en) | Short-term bus passenger flow prediction method and system based on weight stacking decision tree | |
CN108053646A (en) | Traffic characteristic acquisition method, prediction method and system based on time-sensitive features | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20220405 |