CN110490365A - Method for predicting online ride-hailing order volume based on multi-source data fusion - Google Patents

Method for predicting online ride-hailing order volume based on multi-source data fusion

Info

Publication number
CN110490365A
CN110490365A CN201910630258.2A CN201910630258A CN110490365A CN 110490365 A CN110490365 A CN 110490365A CN 201910630258 A CN201910630258 A CN 201910630258A CN 110490365 A CN110490365 A CN 110490365A
Authority
CN
China
Prior art keywords
order
data
order volume
proportion matrix
time slice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910630258.2A
Other languages
Chinese (zh)
Other versions
CN110490365B (en)
Inventor
周鑫
彭舰
黄飞虎
李梦诗
徐文政
刘唐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University
Priority to CN201910630258.2A
Publication of CN110490365A
Application granted
Publication of CN110490365B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/251 Fusion techniques of input or preprocessed data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0633 Lists, e.g. purchase orders, compilation or processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Tourism & Hospitality (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Data Mining & Analysis (AREA)
  • Accounting & Taxation (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for predicting online ride-hailing order volume based on multi-source data fusion. It proposes a hierarchical prediction model based on the weighted average of proportion matrices to predict OD order volume. The proportion matrix of a future time slice is predicted as a weighted average of proportion matrices, with weights determined by similarity measure functions of features such as time and weather, so the algorithm can fuse these multi-source data effectively. Finally, the city-wide total order volume is allocated according to the corresponding values in the resulting proportion matrix to obtain the order volume of each OD. The invention uses the gradient boosting regression tree algorithm to predict the city-wide total order volume of a future time slice, predicts the proportion matrix of that time slice by the weighted average of proportion matrices, and fuses the multi-source data through the PMWA algorithm to obtain the order volume of each OD. This effectively solves the "multi-line prediction" problem and achieves high prediction accuracy.

Description

Method for predicting online ride-hailing order volume based on multi-source data fusion
Technical field
The present invention relates to the field of intelligent transportation systems, and in particular to a method for predicting online ride-hailing order volume based on multi-source data fusion.
Background art
With the continuous development of the economy and of urbanization, urban travel demand has grown rapidly, passengers have more route options, and online ride-hailing has become the preferred way to travel for most people. However, traffic congestion, unreasonable road planning, and inadequate infrastructure have made urban traffic problems increasingly serious and given rise to the "difficulty of hailing a ride" problem. Fortunately, with continuous progress in information technology, computing, automatic control, and artificial intelligence, intelligent transportation systems can address these problems effectively.
The method for predicting online ride-hailing order volume based on multi-source data fusion is part of an intelligent transportation system. Its purpose is to predict ride-hailing OD (Origin-Destination) order volume by jointly analysing multiple factors and to allocate the city-wide total order volume reasonably, thereby reducing resource consumption and alleviating the ride-hailing difficulty for passengers. Existing ride-hailing demand forecasting fixes only the origin of the prediction target, not the destination. In practical applications, however, predicting OD order volume with both origin and destination fixed is necessary: it serves as a reference for vehicle dispatching, alleviates the ride-hailing difficulty, and helps build intelligent transportation systems and track population flows. Moreover, because each OD sequence follows a different pattern of change, common forecasting methods such as ARIMA and LSTM must determine different parameters or model structures for different OD sequences. They can therefore only solve the "single-line prediction" problem, that is, predicting the order volume of one OD from that OD's own sequence data.
Summary of the invention
To overcome the above shortcomings of the prior art, the present invention provides a method for predicting online ride-hailing order volume based on multi-source data fusion. The invention proposes a hierarchical prediction model based on the Proportion-Matrix-based Weighted Average (PMWA) to predict OD order volume. The model has two parts: prediction of the city-wide total order volume and prediction of the OD proportion matrix. The Gradient Boosting Regression Tree (GBRT) algorithm is used to predict the city-wide total order volume of a future time slice. The proportion matrix is the matrix formed by the share of each OD's order volume in the city-wide total order volume. The proportion matrix of the future time slice is predicted as a weighted average of proportion matrices, with weights determined by similarity measure functions of features such as time and weather, so the algorithm can fuse these multi-source data effectively. Finally, the city-wide total order volume is allocated according to the corresponding values in the resulting proportion matrix to obtain the order volume of each OD. By fusing the various influencing factors, the method effectively solves the "multi-line prediction" problem, that is, predicting the order volume of all ODs from multiple OD sequences.
To further describe the method, the symbols and parameters used in this document are explained as follows:
A method for predicting online ride-hailing order volume based on multi-source data fusion comprises the following steps:
S1. Collect and organize the historical order dataset, the weather dataset, and/or the traffic congestion dataset by date;
The historical order dataset is collected and organized as follows:
(1) read the raw data files by date;
(2) convert the hash values of the regions in the dataset of step (1) into corresponding integer IDs and obtain the ID range;
(3) extract the date and the time slice from the order-placement timestamp field of each record;
(4) sort the dataset in ascending order by date and time slice;
(5) group all data by date, time slice, origin, and destination;
(6) count the order records in each group; this count is the order volume of that OD in that time slice on that day;
(7) check whether all data files have been read; if not, repeat steps (1) to (6) until all data files have been read, which yields the order volume of each OD in each time slice.
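Steps (3) to (6) above can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the patent's implementation: the record layout (origin hash, destination hash, timestamp string), the names `time_slice` and `count_od_orders`, and the 30-minute slice length are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime

SLICE_MINUTES = 30  # assumed slice length for this sketch


def time_slice(ts: datetime, slice_minutes: int = SLICE_MINUTES) -> int:
    """Return the 1-based index of the time slice within the day."""
    return (ts.hour * 60 + ts.minute) // slice_minutes + 1


def count_od_orders(records, region_ids):
    """Count orders per (date, time slice, origin ID, destination ID).

    records: iterable of (origin_hash, dest_hash, 'YYYY-MM-DD HH:MM:SS').
    region_ids: dict mapping a region hash to its integer ID (step 2).
    """
    counts = Counter()
    for origin_hash, dest_hash, stamp in records:
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")  # step 3
        key = (ts.date().isoformat(), time_slice(ts),
               region_ids[origin_hash], region_ids[dest_hash])  # step 5
        counts[key] += 1  # step 6
    # Step 4: return the groups sorted by date and time slice.
    return dict(sorted(counts.items()))
```

Grouping with a `Counter` keyed by the tuple (date, slice, origin, destination) performs the grouping and counting in one pass; a data-frame library would work equally well.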
The raw data files include one or more of the Didi, Shenzhou, and Caocao ride-hailing order datasets.
In S1, the weather dataset is collected and organized as follows: because the weather data fields take discrete values, the mode of each weather field is taken to obtain the weather dataset for the time slice.
In the dataset, weather descriptions are coded as numeric categories, for example 1 for sunny, 2 for cloudy, 3 for overcast, and 4 for rain. If these codes were fed to a model directly, the model would mistakenly treat rain as having the highest weight and sunny weather the lowest, so this categorical data must be one-hot encoded.
For discrete features such as the weather description, the values carry no magnitude, so One-Hot Encoding is used. One-hot encoding, also called one-of-N encoding, represents n discrete values with n binary status bits. Each status bit is independent of the others, and each discrete value corresponds to exactly one status bit. Because the n discrete values are mutually exclusive, exactly one of the n status bits is 1 at any time.
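The weather handling described above (take the mode within a time slice, then one-hot encode the category) might look like the following. This is a minimal sketch; the helper names `slice_mode` and `one_hot` and the category list are assumptions for illustration.

```python
from collections import Counter


def slice_mode(codes):
    """Most frequent weather code observed within one time slice."""
    return Counter(codes).most_common(1)[0][0]


def one_hot(code, categories):
    """Return a list with a single 1 at the position of `code`."""
    if code not in categories:
        raise ValueError(f"unknown category: {code!r}")
    return [1 if c == code else 0 for c in categories]


# Example categories from the text: 1 sunny, 2 cloudy, 3 overcast, 4 rain.
WEATHER_CODES = [1, 2, 3, 4]
encoded = one_hot(slice_mode([1, 4, 4, 2]), WEATHER_CODES)
```

After encoding, rain is no longer "four times" sunny weather; each category simply activates its own bit.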
In S1, the traffic congestion dataset is collected and organized as follows: because the traffic congestion data take continuous values, they are averaged to obtain the traffic congestion dataset for the time slice.
Not all features are discrete; continuous features also need processing. In practice, different features may differ greatly in magnitude. If the raw values are used as input without processing, the trained model will be unreasonable, and if gradient descent is used to find the optimal solution, the descent will also take longer. The Min-Max Scaler maps all features in a vector to the interval [0, 1]; after this feature scaling, parameter updates are faster. The Min-Max Scaler formula is as follows:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
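Min-max scaling as described above can be sketched as follows. The function name and the handling of a constant feature (all values mapped to 0.0) are assumptions of this example, not something the source specifies.

```python
def min_max_scale(values):
    """Map a list of numbers linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed convention: a constant feature carries no information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Averaged congestion levels for three time slices (made-up numbers).
scaled = min_max_scale([10, 20, 30])
```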
S2. Predict the city-wide total order volume of the future time slice with the gradient boosting regression tree algorithm;
The gradient boosting regression tree algorithm in S2 proceeds as follows:
(1) initialize $G_0(x) = \arg\min_c \sum_{t=1}^{N} L(y_t, c)$, i.e. a tree with only a root node;
(2) build M boosted trees iteratively, taking the training dataset $\{(x_t, y_t)\}_{t=1}^{N}$, $x \in R^K$, $y \in R$, as input; for the m-th tree, compute the value of the negative gradient of the loss function at the current model, $r_{mt} = -\left[\frac{\partial L(y_t, G(x_t))}{\partial G(x_t)}\right]_{G = G_{m-1}}$, where $t = 1, 2, \ldots, N$, $\partial$ denotes differentiation, and N is the number of training samples, and use it as the estimate of the residual;
(3) fit a regression tree to $r_{mt}$ to obtain the leaf node regions $R_{mj}$ of the m-th tree, $j = 1, 2, \ldots, J$, where J is the number of leaf nodes; loop over each leaf node;
(4) recompute the output value of each region, $c_{mj} = \arg\min_c \sum_{x_t \in R_{mj}} L(y_t, G_{m-1}(x_t) + c)$, so that the loss function is minimized, and update $G_m(x) = G_{m-1}(x) + \sum_{j=1}^{J} c_{mj} I(x \in R_{mj})$, where $j = 1, 2, \ldots, J$ and $R_{mj}$ denotes the leaf node regions of the m-th tree;
(5) obtain $G_M(x)$, the final model.
Each regression tree $g_i$ is computed as:
$$g_i = \arg\min_{g} \sum_{t=1}^{N} L\big(y_t,\ G_{i-1}(x_t) + g(x_t)\big)$$
where L is the loss function, $\{(x_t, y_t)\}_{t=1}^{N}$ is the training set, $x_t$ is the feature vector that influences the order volume in time slice t, $y_t$ is the city-wide total order volume of time slice t, and $G_{i-1}$ is the ensemble of the first i-1 trees:
$$G_{i-1}(x) = g_1(x) + g_2(x) + \ldots + g_{i-1}(x)$$
The final model G(x) is the sum of all trees $g_i$:
$$G(x) = g_1(x) + g_2(x) + \ldots + g_M(x)$$
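To make the boosting loop concrete, here is a deliberately small sketch under simplifying assumptions that are not in the source: a single scalar feature instead of a vector in $R^K$, one-split trees (stumps), and the squared loss $L = (y - G(x))^2 / 2$, for which the negative gradient is exactly the residual and the optimal leaf value is the leaf mean. The names `fit_stump` and `fit_gbrt` are made up for the example.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (two leaf regions) to the residuals.

    Requires at least two distinct feature values in xs.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        # For squared loss the best leaf output is the mean of the leaf.
        c_left, c_right = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - c_left) ** 2 for r in left)
               + sum((r - c_right) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, c_left, c_right)
    _, threshold, c_left, c_right = best
    return lambda x: c_left if x <= threshold else c_right


def fit_gbrt(xs, ys, n_trees=20):
    """Gradient boosting with squared loss; returns a prediction function."""
    g0 = sum(ys) / len(ys)  # step (1): constant, root-only model
    trees = []

    def predict(x):
        return g0 + sum(tree(x) for tree in trees)

    for _ in range(n_trees):  # step (2): build the trees one by one
        # Negative gradient of the squared loss = current residual.
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        trees.append(fit_stump(xs, residuals))  # steps (3) and (4)
    return predict  # step (5): the sum of all trees


model = fit_gbrt([1, 2, 3, 4], [10, 10, 20, 20])
```

In practice a library implementation with multi-feature trees and a shrinkage factor would be used; the sketch only mirrors the five steps listed above.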
S3. Predict the proportion matrix of the future time slice by the weighted average of proportion matrices;
The proportion matrices are extracted as follows:
(1) take the historical order records $T_H$ as input;
(2) from $T_H$, obtain the city-wide total order volume and the OD order volume as required;
(3) obtain the city-wide total order volume $E_t$ of time slice t and the order volume $O_{ij,t}$ of $OD_{ij}$ in time slice t, and compute the proportion $O_{ij,t} / E_t$, where $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, n$;
(4) obtain the proportion matrix of time slice t;
(5) return the set of proportion matrices.
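The extraction of a proportion matrix for one time slice can be sketched as below. The nested-list representation and the convention of returning an all-zero matrix for a slice without orders are assumptions of this example.

```python
def proportion_matrix(od_orders):
    """Divide each OD order volume by the city-wide total of the slice.

    od_orders[i][j] is the order volume from region i to region j.
    """
    total = sum(sum(row) for row in od_orders)  # city-wide total E_t
    if total == 0:
        # Assumed convention for an empty slice.
        return [[0.0 for _ in row] for row in od_orders]
    return [[v / total for v in row] for row in od_orders]


# Two regions, four orders in the slice.
shares = proportion_matrix([[2, 1], [1, 0]])
```

By construction the entries of a non-empty slice sum to 1, which is what later allows the predicted city-wide total to be distributed over the ODs.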
In S3, the weights of the weighted average of proportion matrices are determined by similarity measure functions of time, weather, and/or traffic congestion.
For time similarity:
The time feature similarity is defined as:
where $\Delta h(t_1, t_2) = \min\{\mathrm{dist}(t_1, t_2),\ 48 - \mathrm{dist}(t_1, t_2)\}$;
where $\mathrm{dist}(t_1, t_2)$ denotes the difference between time slices $t_1$ and $t_2$: $\mathrm{dist}(t_1, t_2) = \mathrm{mod}(|t_1 - t_2|, 48)$;
and where the remaining term denotes the difference caused by the number of days between the time slices:
For weather similarity:
The weather feature similarity is defined as:
For temperature similarity:
The temperature feature similarity is defined as:
For traffic similarity:
The traffic congestion information similarity is defined as:
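The two time-of-day quantities whose formulas are given in the text, dist and Δh, can be written directly in code. The sketch assumes 48 half-hour slices per day and integer slice indices; it does not attempt the similarity functions themselves, whose formulas are not reproduced in this text.

```python
SLICES_PER_DAY = 48  # 30-minute time slices


def dist(t1, t2):
    """Difference between two slice indices, reduced modulo one day."""
    return abs(t1 - t2) % SLICES_PER_DAY


def delta_h(t1, t2):
    """Circular time-of-day distance: the shorter way around the clock."""
    d = dist(t1, t2)
    return min(d, SLICES_PER_DAY - d)
```

The `min` makes the distance wrap around midnight, so the last slice of one day and the first slice of the next are treated as neighbours rather than as 47 slices apart.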
S4. Using the OD order volume prediction algorithm PMWA, allocate the city-wide total order volume obtained in S2 according to the corresponding values of the proportion matrix obtained in S3 to obtain the order volume of each OD;
The OD order volume prediction algorithm PMWA proceeds as follows:
(1) take as input the historical order records, the parameters $\theta_1, \theta_2, \theta_3, \theta_4$ of the feature similarity functions, and the learning step size $\alpha$;
(2) learn the regression tree model $G_M(x)$ with GBRT;
(3) use $G_M(x)$ to predict the city-wide total order volume of time slice t;
(4) obtain all OD proportion matrices with the proportion matrix extraction method;
(5) use the sum of squares over corresponding elements of the matrices as the loss function;
(6) update the parameters by gradient descent to find the optimal solution;
(7) compute the feature similarity functions between reference time slice i and target time slice t for all features;
(8) obtain the proportion matrix of time slice t by the weighted average of proportion matrices;
(9) allocate the predicted city-wide total order volume according to the proportion matrix to obtain the OD order-volume matrix of time slice t;
(10) return the order volume of each OD.
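Steps (8) and (9) can be illustrated with a small sketch. It assumes the similarity weights of the reference time slices have already been computed (the similarity formulas are not reproduced in this text, so the weights are simply passed in), and the function names are hypothetical.

```python
def weighted_average_matrix(matrices, weights):
    """Weighted average of equally sized proportion matrices (step 8)."""
    total_w = sum(weights)
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(w * m[i][j] for m, w in zip(matrices, weights)) / total_w
             for j in range(n_cols)]
            for i in range(n_rows)]


def allocate(total_orders, proportions):
    """Distribute the predicted city-wide total over the ODs (step 9)."""
    return [[total_orders * p for p in row] for row in proportions]


# Two reference slices; the first is three times as similar to the target.
history = [
    [[0.5, 0.5], [0.0, 0.0]],
    [[0.0, 0.0], [0.5, 0.5]],
]
predicted_shares = weighted_average_matrix(history, weights=[3, 1])
od_orders = allocate(100, predicted_shares)
```

Because each reference matrix sums to 1 and the weights are normalised, the predicted matrix also sums to 1, so the allocated OD volumes add up to the predicted city-wide total.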
The beneficial effects of the invention are as follows. The Gradient Boosting Regression Tree (GBRT) algorithm predicts the city-wide total order volume of a future time slice, and the weighted average of proportion matrices predicts the proportion matrix of that time slice. The PMWA algorithm then fuses the multi-source data to obtain the order volume of each OD. The method makes full use of the regularity of the time series data and of the internal relationships between ODs, effectively solves the "multi-line prediction" problem, and achieves high prediction accuracy.
Description of the drawings
Fig. 1 is a flowchart of the algorithm of the invention;
Fig. 2 is a flowchart of the order volume extraction of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings.
In this embodiment the algorithm is implemented in the Python programming language.
The dataset of this embodiment comes from the public Di-Tech 2016 competition dataset. It contains Didi order data for Hangzhou from February 23, 2016 to March 17, 2016, 24 days in total, amounting to 7291683 order records. The data of the first 17 days are used as the training set and the data of the last 7 days as the test set. Because most ride-hailing demand is for ordinary rides, only the OD order volume of ordinary rides is predicted here, and carpooling is not considered. In addition, order volume is predicted only for ODs with a fixed origin and destination; only the order volume itself is of interest, not the specific driving route.
Embodiment
A method for predicting online ride-hailing order volume based on multi-source data fusion comprises the following steps:
S1. Collect and organize the historical order dataset, the weather dataset, and/or the traffic congestion dataset by date;
The historical order dataset is collected and organized as follows:
(1) read the raw data files by date;
(2) convert the hash values of the regions into corresponding integer IDs and obtain the ID range;
(3) extract the date and the time slice from the order-placement timestamp field of each record;
(4) sort the dataset in ascending order by date and time slice;
(5) group all data by date, time slice, origin, and destination;
(6) count the order records in each group; this count is the order volume of that OD in that time slice on that day;
(7) check whether all data files have been read; if not, repeat steps (1) to (6) until all data files have been read, which yields the order volume of each OD in each time slice.
S2. Predict the city-wide total order volume of the future time slice with the gradient boosting regression tree algorithm;
The gradient boosting regression tree algorithm in S2 proceeds as follows:
(1) initialize $G_0(x) = \arg\min_c \sum_{t=1}^{N} L(y_t, c)$, i.e. a tree with only a root node;
(2) build M boosted trees iteratively, taking the training dataset $\{(x_t, y_t)\}_{t=1}^{N}$, $x \in R^K$, $y \in R$, as input; for the m-th tree, compute the value of the negative gradient of the loss function at the current model, $r_{mt} = -\left[\frac{\partial L(y_t, G(x_t))}{\partial G(x_t)}\right]_{G = G_{m-1}}$, where $t = 1, 2, \ldots, N$, $\partial$ denotes differentiation, and N is the number of training samples, and use it as the estimate of the residual;
(3) fit a regression tree to $r_{mt}$ to obtain the leaf node regions $R_{mj}$ of the m-th tree, $j = 1, 2, \ldots, J$, where J is the number of leaf nodes; loop over each leaf node;
(4) recompute the output value of each region, $c_{mj} = \arg\min_c \sum_{x_t \in R_{mj}} L(y_t, G_{m-1}(x_t) + c)$, so that the loss function is minimized, and update $G_m(x) = G_{m-1}(x) + \sum_{j=1}^{J} c_{mj} I(x \in R_{mj})$, where $j = 1, 2, \ldots, J$ and $R_{mj}$ denotes the leaf node regions of the m-th tree;
(5) obtain $G_M(x)$, the final model.
S3. Predict the proportion matrix of the future time slice by the weighted average of proportion matrices;
The proportion matrices are extracted as follows:
(1) take the historical order records $T_H$ as input;
(2) from $T_H$, obtain the city-wide total order volume and the OD order volume as required;
(3) obtain the city-wide total order volume $E_t$ of time slice t and the order volume $O_{ij,t}$ of $OD_{ij}$ in time slice t, and compute the proportion $O_{ij,t} / E_t$, where $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, n$;
(4) obtain the proportion matrix of time slice t;
(5) return the set of proportion matrices.
S4. Using the OD order volume prediction algorithm PMWA, allocate the city-wide total order volume obtained in S2 according to the corresponding values of the proportion matrix obtained in S3 to obtain the order volume of each OD;
The OD order volume prediction algorithm PMWA proceeds as follows:
(1) take as input the historical order records, the parameters $\theta_1, \theta_2, \theta_3, \theta_4$ of the feature similarity functions, and the learning step size $\alpha$;
(2) learn the regression tree model $G_M(x)$ with GBRT;
(3) use $G_M(x)$ to predict the city-wide total order volume of time slice t;
(4) obtain all OD proportion matrices with the proportion matrix extraction method;
(5) use the sum of squares over corresponding elements of the matrices as the loss function;
(6) update the parameters by gradient descent to find the optimal solution;
(7) compute the feature similarity functions between reference time slice i and target time slice t for all features;
(8) obtain the proportion matrix of time slice t by the weighted average of proportion matrices;
(9) allocate the predicted city-wide total order volume according to the proportion matrix to obtain the OD order-volume matrix of time slice t;
(10) return the order volume of each OD.
Comparative example 1
The history average (HA) algorithm selects historical data for a given OD according to whether the day is a working day or a weekend, and uses the average order volume of the corresponding time slice as the prediction. For example, to predict the order volume of OD 8-9 for 8:30am-9:00am on a Friday, the order volumes of OD 8-9 for 8:30am-9:00am on all working days in the historical data are averaged and used as the prediction.
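The HA baseline is simple enough to sketch in a few lines. The record layout (dicts with `od`, `slice`, `workday`, and `orders` keys), the function name, and the fallback of 0.0 when no history matches are assumptions of this example.

```python
def history_average(history, od, time_slice, is_workday):
    """Average the historical order volume of one OD in one time slice,
    using only days of the same type (working day or weekend)."""
    values = [h["orders"] for h in history
              if h["od"] == od
              and h["slice"] == time_slice
              and h["workday"] == is_workday]
    # Assumed fallback when no matching history exists.
    return sum(values) / len(values) if values else 0.0


# OD 8-9 in slice 18 (8:30am-9:00am with 30-minute slices); made-up counts.
history = [
    {"od": (8, 9), "slice": 18, "workday": True, "orders": 12},
    {"od": (8, 9), "slice": 18, "workday": True, "orders": 16},
    {"od": (8, 9), "slice": 18, "workday": False, "orders": 4},
]
friday_forecast = history_average(history, (8, 9), 18, is_workday=True)
```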
Comparative example 2
The GBRT algorithm is used in the same way as for predicting the city-wide total order volume: features are constructed and the order volume of each OD is predicted directly with GBRT.
Comparative example 3
The STP-KNN algorithm is a non-parametric, data-driven short-term traffic flow forecasting algorithm. The model defines the trend of a flow sequence, takes the flow sequence of the n time slices before the target time slice, finds the K flow sequences most similar to it, assigns weights according to the similarity ranking, and then obtains the prediction as a weighted linear combination of the last values of these K sequences.
Comparative example 4
The NMF-AR algorithm is a recently proposed hybrid model for OD order volume prediction. The model uses Non-negative Matrix Factorization (NMF) to decompose the OD matrix formed from the historical OD order volumes into two matrices, called the basis matrix and the coefficient matrix of the OD matrix. An autoregressive model (AR) then predicts the coefficient matrix to obtain the coefficients of the target time slice. Finally, the OD order-volume matrix is obtained from the basis matrix and the predicted coefficients.
MAE (mean absolute error), RMSE (root-mean-square error), ME (mean error), RMSLE (root-mean-square logarithmic error), and R-Squared (coefficient of determination) are used to evaluate the performance of the models of the above embodiment and of comparative examples 1-4 on the test set. The results are given in Table 1, "Global prediction error".
Table 1. Global prediction error
Method    MAE     RMSE     ME   RMSLE   R-Squared
HA        3.8691  10.2694  538  0.4145  0.9174
GBRT      3.8179  9.9931   429  0.3368  0.9383
STP-KNN   3.4994  8.3019   273  0.389   0.9574
NMF-AR    3.4849  8.2596   328  0.3727  0.9579
PMWA      3.2488  7.6516   245  0.3348  0.9638
As the data in Table 1 show, the PMWA model of the embodiment has the smallest error on every measure and outperforms the four comparative models overall.
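For reference, four of the error measures in Table 1 have standard textbook definitions, sketched below with the standard library only. This is not the evaluation code used for the table; in particular, the RMSLE here uses log(1 + x), which is a common convention but an assumption as far as this document is concerned.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def rmsle(y_true, y_pred):
    """Root-mean-square logarithmic error, using log(1 + x)."""
    return math.sqrt(
        sum((math.log1p(p) - math.log1p(t)) ** 2
            for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination; y_true must not be constant."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Lower is better for MAE, RMSE, and RMSLE, while R-Squared is better the closer it is to 1, which matches the ordering of the methods in Table 1.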
Although the HA algorithm of comparative example 1 is simple, it neither makes full use of the patterns in the time series data nor considers any influencing factors, so it performs worst. The GBRT model of comparative example 2 builds regression trees from the different features; it may be dominated by certain features while other features have little influence. Although the model makes full use of the external influencing factors, it does not sufficiently mine the features of the time series itself, and it does not consider the interactions between different ODs, so predicting with GBRT alone does not give good results. The STP-KNN model of comparative example 3 makes full use of the patterns in the time series data, but it does not consider the interactions between different ODs either and uses no external influencing factors; it improves on GBRT but still falls short of the PMWA algorithm of the embodiment. The NMF-AR model of comparative example 4 uses the relationships between different ODs through an OD order-volume matrix and mines the patterns of the time series data with an autoregressive model, but it does not consider the influence of external factors on order volume. It achieves good results but is still inferior to the PMWA algorithm of the embodiment.
In summary, the PMWA model provided by the invention makes full use of the regularity of the time series data and of the internal relationships between ODs, and predicts OD order volume by fusing multi-source data. It is a solution to the "multi-line prediction" problem and also achieves high prediction accuracy.
The basic principles, main features, and advantages of the invention have been shown and described above. Those skilled in the art should understand that the invention is not limited to the above embodiments. The embodiments and the description only illustrate the principle of the invention; various changes and improvements may be made without departing from its spirit and scope, and all such changes and improvements fall within the claimed scope of protection. The scope of protection is defined by the appended claims and their equivalents.

Claims (8)

1. a kind of method based on the pre- survey grid of multisource data fusion about vehicle order volume, which comprises the steps of:
S1, History Order data set, weather data collection and/or traffic congestion data collection are arranged according to date collection;
S2, the city blanket order amount for returning tree algorithm prediction future time piece is promoted based on gradient;
S3, the scaling matrices that future time piece is predicted based on scaling matrices weighted average mode;
S4, it is based on OD order volume prediction algorithm PMWA, according to the obtained scaling matrices respective value of S3 to the obtained city S2 Blanket order amount is allocated to arrive the order volume of each OD;
Wherein, the time slice interval is 10-30 minutes;Preferably, the time slice interval is 30 minutes.
2. a kind of method based on the pre- survey grid of multisource data fusion about vehicle order volume according to claim 1, feature exist In detailed process is as follows for the arrangement of History Order dataset acquisition in the S1:
(1) raw data file is read according to the date form data set;
(2) cryptographic Hash of data set corresponding region described in step (1) is switched into corresponding Arabic numerals ID, obtains ID model It encloses;
(3) date and time sheet data is extracted from lower single timestamp field that each in data set records;
(4) data set is subjected to ascending sort according to date and time piece;
(5) all data are grouped according to date, timeslice, starting point, destination information;
(6) the order record quantity in each group, order volume of the as OD in this day timeslice are counted;
(7) whether data file has all read, if not provided, repeat step 1 to 6, until having read all data files, Obtain order volume of each OD in each timeslice;
Wherein, the raw data file includes drop drop net about vehicle order data collection, Divine Land net about vehicle order data collection and Cao behaviour The one or more of net about vehicle order data collection.
3. a kind of method based on the pre- survey grid of multisource data fusion about vehicle order volume according to claim 1, feature exist In it includes: the weather for taking mode to obtain the timeslice weather data field that weather data collection, which acquires method for sorting, in the S1 Data set.
4. a kind of method based on the pre- survey grid of multisource data fusion about vehicle order volume according to claim 1, wherein described In S1 traffic congestion data collection acquisition method for sorting include: to traffic congestion data be averaged obtain the timeslice traffic gather around Stifled data set.
5. a kind of method based on the pre- survey grid of multisource data fusion about vehicle order volume according to claim 1, feature exist In detailed process is as follows for gradient promotion recurrence tree algorithm in the S2:
(1) it initializesG at this time0It (x) is the tree for there was only root node;
(2) iteration establishes M boosted tree, with training datasetAs input;Meter Calculate value of the negative gradient in "current" model of loss functionWherein t=1,2 ..., N, For derivation symbol, N is the number of iterations, and using it as the estimated value of residual error;
(3) to rmtIt is fitted, obtains the leaf node region R of the m treemj, j=1,2 ..., J, wherein J is leaf node Number;Each leaf node is recycled;
(4) value in leaf node region is estimated using linear search, Make to damage Function Minimization is lost, is updated Wherein j=1,2 ..., J, RmjIt indicates The leaf node region of the m tree;
(5)GmIt (x) is final model.
6. The method for predicting online ride-hailing order volume based on multi-source data fusion according to claim 1, characterized in that the scaling matrix extraction in S3 proceeds as follows:
(1) Take the historical order records T_H as input;
(2) From T_H, obtain the citywide total order volume and the OD order volume as required;
(3) Obtain the citywide total order volume E_t of time slice t and the order volume O_ij,t of OD_ij in time slice t, and compute their ratio, where i = 1, 2, ..., n and j = 1, 2, ..., n;
(4) Obtain the scaling matrix of time slice t;
(5) Return the set of scaling matrices.
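A minimal sketch of the extraction steps above, assuming the ratio in step (3) is the OD order volume divided by the citywide total, O_ij,t / E_t (the direction of the ratio is an assumption, since the formula is not shown here). The dictionary-based inputs and the function name `extract_scaling_matrices` are hypothetical.

```python
def extract_scaling_matrices(city_totals, od_orders):
    """Compute one n x n scaling matrix per time slice.

    city_totals: {t: E_t}, the citywide total order volume per time slice.
    od_orders:   {t: n x n nested list of O_ij,t}, order volume per OD pair.
    """
    matrices = {}
    for t, od in od_orders.items():
        total = city_totals[t]
        if total == 0:
            # No orders in this slice: fall back to an all-zero matrix.
            matrices[t] = [[0.0 for _ in row] for row in od]
        else:
            matrices[t] = [[o / total for o in row] for row in od]
    return matrices


city_totals = {0: 20}
od_orders = {0: [[2, 6], [4, 8]]}
scaling = extract_scaling_matrices(city_totals, od_orders)
```

When the citywide total equals the sum of all OD order volumes in the slice, the entries of each matrix sum to 1, so a matrix can be read as the share of demand carried by each origin-destination pair.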
7. The method for predicting online ride-hailing order volume based on multi-source data fusion according to claim 1 or 4, wherein the weights used in the weighted averaging of scaling matrices in S3 are determined by similarity measurement functions of time, weather and/or traffic congestion.
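The weighting idea in claim 7 can be illustrated as follows. The similarity function below (an exponential of a weighted feature distance) is purely hypothetical, as are the feature names and the three-element `theta`; the claim only states that the weights come from similarity measures over time, weather and/or traffic congestion.

```python
import math


def similarity(ref, target, theta):
    """Hypothetical similarity between the features of two time slices."""
    distance = (
        theta[0] * abs(ref["hour"] - target["hour"])
        + theta[1] * (ref["weather"] != target["weather"])
        + theta[2] * abs(ref["congestion"] - target["congestion"])
    )
    return math.exp(-distance)


def weighted_average_matrix(matrices, weights):
    """Elementwise weighted average of equally sized matrices."""
    total = sum(weights)
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) / total
         for j in range(n_cols)]
        for i in range(n_rows)
    ]


target = {"hour": 8, "weather": "rain", "congestion": 2.0}
references = [
    {"hour": 8, "weather": "rain", "congestion": 2.0},
    {"hour": 20, "weather": "sunny", "congestion": 1.0},
]
theta = (0.1, 0.5, 0.2)
weights = [similarity(r, target, theta) for r in references]
matrices = [
    [[0.2, 0.8], [0.5, 0.5]],
    [[0.4, 0.6], [0.1, 0.9]],
]
estimate = weighted_average_matrix(matrices, weights)
```

Because the weights are normalized by their sum, a reference time slice that closely resembles the target dominates the estimate, and rows that sum to 1 in every reference matrix still sum to 1 in the average.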
8. The method for predicting online ride-hailing order volume based on multi-source data fusion according to claim 1, characterized in that the OD order volume prediction algorithm PMWA in S4 proceeds as follows:
(1) Take as input the historical order records, the parameters θ1, θ2, θ3, θ4 of the feature similarity measurement function, and the learning step size α;
(2) Learn the regression tree G_M(x) with the gradient boosting regression tree algorithm;
(3) Use G_M(x) to predict the citywide total order volume of time slice t;
(4) Obtain all OD scaling matrices with the scaling matrix extraction method;
(5) Use the sum of squares of corresponding matrix elements as the loss function;
(6) Update the parameters by gradient descent to find the optimal solution;
(7) Compute the feature similarity function between reference time slice i and target time slice t over all features;
(8) Obtain the scaling matrix of time slice t by weighted averaging of the scaling matrices;
(9) Obtain the OD order volume matrix of time slice t;
(10) Return the order volume of each OD.
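Steps (3), (8) and (9) can be read as scaling the predicted citywide total by the estimated scaling matrix. The formula for step (9) is not shown here, so the elementwise product below is an assumption, as is reading the loss in step (5) as the sum of squared differences between corresponding elements of the predicted and observed matrices. The function names and sample numbers are hypothetical.

```python
def predict_od_orders(predicted_total, scaling_matrix):
    """Assumed step (9): OD order volume = citywide total x OD share."""
    return [[predicted_total * p for p in row] for row in scaling_matrix]


def squared_error(predicted, actual):
    """Assumed step (5): sum of squared elementwise differences."""
    return sum(
        (p - a) ** 2
        for pred_row, actual_row in zip(predicted, actual)
        for p, a in zip(pred_row, actual_row)
    )


predicted_total = 200.0                      # e.g. the output of G_M(x)
scaling_matrix = [[0.1, 0.3], [0.2, 0.4]]    # e.g. the weighted average
od_forecast = predict_od_orders(predicted_total, scaling_matrix)

observed = [[22.0, 58.0], [40.0, 80.0]]
loss = squared_error(od_forecast, observed)
```

In a full pipeline this loss would be the quantity minimized by gradient descent over the similarity parameters in step (6); that optimization loop is omitted here.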

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910630258.2A CN110490365B (en) 2019-07-12 2019-07-12 Method for predicting network car booking order quantity based on multi-source data fusion

Publications (2)

Publication Number Publication Date
CN110490365A (en) 2019-11-22
CN110490365B (en) 2022-04-05

Family

ID=68546887

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910630258.2A Expired - Fee Related CN110490365B (en) 2019-07-12 2019-07-12 Method for predicting network car booking order quantity based on multi-source data fusion

Country Status (1)

Country Link
CN (1) CN110490365B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104599088A (en) * 2015-02-13 2015-05-06 北京嘀嘀无限科技发展有限公司 Dispatching method and dispatching system based on orders
US20180018572A1 (en) * 2016-07-12 2018-01-18 Alibaba Group Holding Limited Method, apparatus, device, and system for predicting future travel volumes of geographic regions based on historical transportation network data
CN107633680A (en) * 2016-07-12 2018-01-26 阿里巴巴集团控股有限公司 Method, apparatus, device and system for acquiring travel data
CN109117973A (en) * 2017-06-26 2019-01-01 北京嘀嘀无限科技发展有限公司 Online ride-hailing order volume prediction method and device
CN107103142A (en) * 2017-07-11 2017-08-29 交通运输部公路科学研究所 Simulation technique for deducing the operating situation of a comprehensive transport network of highways and railways

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
周鑫 (Zhou Xin): "Research on a GBRT-Based Traffic Flow Prediction Algorithm" (基于GBRT的交通流量预测算法研究), Modern Computer (《现代计算机》) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113379455A (en) * 2021-06-10 2021-09-10 中国铁道科学研究院集团有限公司电子计算技术研究所 Order quantity prediction method and apparatus
CN113379455B (en) * 2021-06-10 2024-02-09 中国铁道科学研究院集团有限公司电子计算技术研究所 Order quantity prediction method and equipment
CN113469739A (en) * 2021-06-25 2021-10-01 广州宸祺出行科技有限公司 Method and system for predicting taxi taking demand for network taxi appointment
CN113469739B (en) * 2021-06-25 2024-05-28 广州宸祺出行科技有限公司 Prediction method and system for taxi taking demand of network taxi taking
CN115394084A (en) * 2022-08-29 2022-11-25 郑州轻工业大学 NMF-BiLSTM-based urban road network short-term traffic flow prediction method
CN115394084B (en) * 2022-08-29 2023-07-25 郑州轻工业大学 Urban road network short-time traffic flow prediction method based on NMF-BiLSTM

Also Published As

Publication number Publication date
CN110490365B (en) 2022-04-05

Similar Documents

Publication Publication Date Title
CN110570651B (en) Road network traffic situation prediction method and system based on deep learning
CN111653088B (en) Vehicle driving quantity prediction model construction method, prediction method and system
CN107610469B (en) Day-dimension area traffic index prediction method considering multi-factor influence
CN107045788B (en) Traffic road condition prediction method and device
Zhao et al. Truck traffic speed prediction under non-recurrent congestion: Based on optimized deep learning algorithms and GPS data
CN113487066B (en) Long-time-sequence freight volume prediction method based on multi-attribute enhanced graph convolution-Informer model
CN104834977B (en) Traffic alert level prediction method based on distance metric learning
CN103177570B (en) Method for predicting traffic jam indexes for rush hours in morning and evening
CN110458336B (en) Online appointment vehicle supply and demand prediction method based on deep learning
CN110490365A (en) Method for predicting network car booking order quantity based on multi-source data fusion
CN109272157A (en) Freeway traffic flow parameter prediction method and system based on gated neural network
CN110503104B (en) Short-time remaining parking space quantity prediction method based on convolutional neural network
CN108648445B (en) Dynamic traffic situation prediction method based on traffic big data
CN111860989B (en) LSTM neural network short-time traffic flow prediction method based on ant colony optimization
CN110164129B (en) Single-intersection multi-lane traffic flow prediction method based on GERNN
CN115440032A (en) Long-term and short-term public traffic flow prediction method
CN113051811B (en) Multi-mode short-term traffic jam prediction method based on GRU network
CN112270355A (en) Active safety prediction method based on big data technology and SAE-GRU
CN112037539B (en) Method and system for recommending signal control scheme for saturated urban traffic network
Chen et al. A multiscale-grid-based stacked bidirectional GRU neural network model for predicting traffic speeds of urban expressways
CN113821547B (en) Rapid and efficient short-time prediction method, system and storage medium for occupancy of parking lot
CN116434538A (en) Urban traffic flow prediction model construction method based on heterogeneous data fusion
CN115269758A (en) Passenger-guidance-oriented road network passenger flow state deduction method and system
CN113537569B (en) Short-term bus passenger flow prediction method and system based on weight stacking decision tree
CN108053646A (en) Traffic feature acquisition method, prediction method and system based on time-sensitive features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220405