CN110083801B - OD travel time reliability estimation method and system based on robust statistics - Google Patents

OD travel time reliability estimation method and system based on robust statistics

Info

Publication number
CN110083801B
Authority
CN
China
Prior art keywords
ttr
module
time
travel time
journey
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910291447.1A
Other languages
Chinese (zh)
Other versions
CN110083801A (en
Inventor
吕伟韬
杨树
陈凝
潘阳阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Zhitong Transportation Technology Co ltd
Original Assignee
Jiangsu Zhitong Transportation Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Zhitong Transportation Technology Co ltd filed Critical Jiangsu Zhitong Transportation Technology Co ltd
Priority to CN201910291447.1A priority Critical patent/CN110083801B/en
Publication of CN110083801A publication Critical patent/CN110083801A/en
Priority to PCT/CN2019/115223 priority patent/WO2020206996A1/en
Application granted granted Critical
Publication of CN110083801B publication Critical patent/CN110083801B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Operations Research (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Algebra (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Remote Sensing (AREA)
  • Evolutionary Biology (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Traffic Control Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an origin-destination (OD) travel time reliability (TTR) estimation method and system based on robust statistics. The system comprises an interaction module, a data docking and preprocessing module, a map visualization module and a TTR estimation module. In the method, taxi operation data are first acquired and passed by the interaction module to the data docking and preprocessing module, which parses them into the corresponding trip records. The map visualization module then grids the space range covered by the operation data into units of congruent shape, so that the trip origins and destinations of the taxis are aggregated spatially. The TTR estimation module computes the TTR indexes by a robust statistical method; the map visualization module fills the units with these values and then performs spatial interpolation by the Kriging method, which completes the reliability estimation of OD travel time. The estimation remains robust when the samples contain outliers whose travel times differ greatly from the rest.

Description

OD travel time reliability estimation method and system based on robust statistics
Technical Field
The invention belongs to the technical field of traffic monitoring and is mainly used for estimating the reliability of the origin-destination (OD) travel time of taxis; in particular, it relates to an OD travel time reliability estimation method and system based on robust statistics.
Background
Current methods for estimating travel time reliability (TTR) fall into two main types: one applies statistical methods directly to the raw travel time data, and the other fits a distribution to a large number of travel time samples and estimates reliability from the fitted distribution. Both types are data-driven. Taxis operate differently from ordinary vehicles: a taxi driver does not always choose the most direct (time-saving) route between an OD pair, and detours and ride sharing occur, which can markedly increase the travel time between the OD pair. Such trips appear in the statistics as obvious outlier samples. When outliers are present, the TTR indexes of classical statistics, such as the standard deviation and the buffer time index, are strongly affected, so their robustness is poor.
To reduce the interference of outlier samples with the statistical results, the common approach is to reject them as abnormal values in the data preprocessing stage. This approach, however, is unsuitable for TTR estimation. First, because the sample size is very large, pre-screening the outliers is inefficient and costly in time. Second, rejection is appropriate for erroneous data, whereas in taxi OD travel time data an abnormally large value is usually not an error but a record of real conditions that reflects how the vehicle actually ran, so removing such samples is not justified.
Disclosure of Invention
To address the poor robustness of prior-art methods when processing taxi travel time samples that contain outliers, the invention provides an OD travel time reliability estimation method and system based on robust statistics. The specific technical scheme is as follows:
In one aspect, an OD travel time reliability estimation method based on robust statistics is provided for taxis. Taxi operation data are processed to obtain OD travel time samples, the TTR is estimated from these samples by a robust statistical method, and the TTR within the target region is then visualized with the help of spatial interpolation. The method comprises the following steps:
S1, preparing the original sample data: acquiring taxi operation data for the specified dates, where each record contains the license plate number, a timestamp, the vehicle's GPS longitude and latitude, and the passenger-carrying state status (status=0 means empty, status=1 means carrying passengers); and
determining the range of the original sample data: the range comprises a time range and a space range, where the time range is any time interval within 0:00-23:59 and the space range is any longitude-latitude coordinate range within the specified road network;
S2, processing the original sample data, identifying the start point and end point of every trip of every taxi, and generating the taxi trip records: extracting the original sample data of any one taxi on any one day, sorting them by time, and marking the records at which status changes; if status changes to 1 after several consecutive records with status 0, the record marks the start of a trip of the taxi; conversely, if status changes to 0 after several consecutive records with status 1, the trip of the taxi has ended;
S3, within the map range corresponding to the target area and according to the map scale, dividing the space range into C units of congruent shape, numbered $cell_j$, $j = 1, 2, \ldots, C$, so that the trip origins and destinations of the taxis are aggregated spatially;
S4, taking a designated unit as the target unit for OD travel time reliability estimation, and selecting from the trip records those whose end-point coordinates lie in the target unit; determining the OD travel time of each taxi trip from the start time and end time in the trip record, and dividing the OD travel times into sample data sets according to the unit in which the corresponding trip origin lies;
S5, taking both robustness and time complexity into account, estimating the TTR of each sample data set by a robust statistical method;
S6, associating the TTR estimates obtained in S5 with the units through visualization:
defining all units as filling objects and constructing a gradient color palette for them; defining the color mapping over the theoretical value interval of the chosen TTR index, and filling each object according to its estimate;
S7, performing spatial interpolation by the ordinary Kriging method based on the TTR estimates of the filled units, and calculating the variance of the interpolated TTR;
S8, setting a critical value λ: if the variance is smaller than λ, filling the corresponding unit with the interpolated value; otherwise, not interpolating.
In step S2, the times in the start-point and end-point records of a trip are the start time and end time of the trip, respectively; the corresponding taxi trip record comprises the trip's start point, its end point and all records between them, and the GPS coordinates in these records are the points the taxi passed during the trip.
Further, in step S4, each sample data set comprises a start unit, an end unit and the OD travel times $t_i$, and all samples in one data set share the same start unit and end unit.
Further, the TTR indexes comprise the central tendency CT, the dispersion D, the standard dispersion SD, the order statistic OS and the buffer index BI, wherein:
the central tendency CT is characterized by the median of the OD travel time samples, $\mathrm{median}(T)$, where $\mathrm{median}(\cdot)$ denotes the median, $T = \{t_1, t_2, \ldots, t_i, \ldots, t_N\}$ is the sample data set, $N$ is the number of OD travel time samples and $i$ is the index of an OD travel time sample;
the dispersion D is characterized by covariance [formula not reproduced], where $\bar{t} = \frac{1}{N}\sum_{i=1}^{N} t_i$ is the mean, i.e. the central tendency index of the classical statistical method, and $\sigma$ is the standard deviation [formula not reproduced];
the standard dispersion SD is defined by the quartile coefficient of dispersion:
$$\mathrm{QCD} = \frac{75^{th}(T) - 25^{th}(T)}{75^{th}(T) + 25^{th}(T)}$$
where $75^{th}(T)$ and $25^{th}(T)$ are the 75% and 25% quantiles, respectively;
the order statistic OS takes the 90% quantile or the 95% quantile;
the buffer index BI is defined by a formula in $90^{th}(T)$ [formula not reproduced], where $90^{th}(T)$ is the 90% quantile.
In another aspect, an OD travel time reliability estimation system based on robust statistics is provided, which applies the above OD travel time reliability estimation method and comprises an interaction module, a data docking and preprocessing module, a map visualization module and a TTR estimation module, wherein:
the interaction module passes the date, time range and space range specified by the user to the data docking and preprocessing module and the map visualization module, and passes the TTR estimation indexes and the target unit number specified by the user to the TTR estimation module;
the data docking and preprocessing module retrieves, as instructed through the interaction module, the taxi operation data within the specified date, time range and space range, identifies the start point and end point of every trip of every taxi from these data, and generates trip records; each trip record comprises the trip's start point, its end point and all records between them, and the GPS coordinates in these records are the points the vehicle passed during the trip;
the map visualization module grids the map area within the space range, as instructed by the interaction module, to obtain the units; defines all units as filling objects and constructs a gradient color palette; defines a color mapping for each TTR estimation index over its theoretical value interval; receives the TTR indexes from the TTR estimation module and fills the objects; and performs spatial interpolation by the ordinary Kriging method based on the TTR index values of the filled units;
the TTR estimation module receives the trip records from the data docking and preprocessing module, the map gridding data from the map visualization module and the target unit number from the interaction module; selects the trip records whose end-point coordinates fall within the target unit; calculates the OD travel time of each trip record, grouped by the unit to which its start-point coordinates belong; and receives the TTR indexes specified through the interaction module and estimates them by a robust statistical method.
Further, each taxi operation record contains the license plate number, a timestamp, the vehicle's GPS longitude and latitude, and the passenger-carrying state status; the TTR indexes comprise the central tendency CT, the dispersion D, the standard dispersion SD, the order statistic OS and the buffer index BI.
Further, the TTR indexes comprise the central tendency CT, the dispersion D, the standard dispersion SD, the order statistic OS and the buffer index BI, wherein:
the central tendency CT is characterized by the median of the OD travel time samples, $\mathrm{median}(T)$, where $\mathrm{median}(\cdot)$ denotes the median, $T = \{t_1, t_2, \ldots, t_i, \ldots, t_N\}$ is the sample data set, $N$ is the number of OD travel time samples and $i$ is the index of an OD travel time sample;
the dispersion D is characterized by covariance [formula not reproduced], where $\bar{t} = \frac{1}{N}\sum_{i=1}^{N} t_i$ is the mean, i.e. the central tendency index of the classical statistical method, and $\sigma$ is the standard deviation [formula not reproduced];
the standard dispersion SD is defined by the quartile coefficient of dispersion:
$$\mathrm{QCD} = \frac{75^{th}(T) - 25^{th}(T)}{75^{th}(T) + 25^{th}(T)}$$
where $75^{th}(T)$ and $25^{th}(T)$ are the 75% and 25% quantiles, respectively;
the order statistic OS takes the 90% quantile or the 95% quantile;
the buffer index BI is defined by a formula in $90^{th}(T)$ [formula not reproduced], where $90^{th}(T)$ is the 90% quantile.
The invention thus provides an OD travel time reliability estimation method and system based on robust statistics, the system comprising an interaction module, a data docking and preprocessing module, a map visualization module and a TTR estimation module. First, operation data such as the license plate number, timestamp, GPS longitude and latitude and passenger-carrying state of taxis on the specified dates are acquired and passed through the interaction module to the data docking and preprocessing module, which processes them into taxi trip records. Meanwhile, the interaction module sends a gridding instruction to the map visualization module, which divides the space range into units. The TTR estimation module computes the TTR indexes by a robust statistical method; the map visualization module fills the units with them and then performs spatial interpolation by the Kriging method, which completes the reliability estimation of OD travel time. Compared with the prior art, the method remains robust when estimating OD travel time reliability from samples containing outliers with very different travel times; its time complexity is the same as that of the classical statistical method, and it can reduce the overall computing time.
Drawings
FIG. 1 is a schematic block diagram of an implementation flow of an OD travel time reliability estimation method based on robust statistics according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the structural composition of an implementation system of the OD travel time reliability estimation method based on robust statistics according to an embodiment of the present invention;
FIG. 3 is a schematic of an MAD grid of OD travel time samples constructed by the method of the present invention in an example;
FIG. 4 is a schematic diagram of a MAD grid map corresponding to a refined OD travel time sample after spatial interpolation in the method of the present invention in an embodiment.
Detailed Description
In order to enable those skilled in the art to better understand the present invention, the following description will make clear and complete descriptions of the technical solutions according to the embodiments of the present invention with reference to the accompanying drawings.
The invention provides a robust statistics-based OD travel time reliability estimation method, which is applied to taxis, and the method carries out TTR estimation by a robust statistics method on the basis of processing taxi operation data to obtain an OD travel time sample; the visualization of the TTR within the target region is further achieved with spatial interpolation.
Referring to fig. 1, the method of the present invention specifically includes the following steps:
S1, preparing the original sample data: taxi operation data for the specified dates are acquired, where each record contains the license plate number, a timestamp, the vehicle's GPS longitude and latitude, and the passenger-carrying state status (status=0 means empty, status=1 means carrying passengers). The range of the original sample data is then determined; it comprises a time range, which is any time interval within 0:00-23:59, and a space range, which is any longitude-latitude coordinate range within the specified road network. In the embodiment, the taxi operation data can be obtained from the taxi on-board GPS monitoring system through an interface. The timestamp consists of a date and a time in the format yyyymmdd hh:mm:ss. Preferably, to ensure a sufficient sample size, the specified dates span more than three months, so that the data obtained are more representative.
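The record layout and range selection of step S1 can be sketched as follows. This is only an illustration: the names `GpsRecord`, `parse_record` and `in_range`, the placeholder plate string and the bounding-box values are assumptions and do not come from the patent.

```python
from dataclasses import dataclass
from datetime import datetime, time

@dataclass(frozen=True)
class GpsRecord:
    plate: str
    timestamp: datetime
    status: int      # 0 = empty, 1 = carrying passengers
    lon: float
    lat: float

def parse_record(plate, stamp, status, lon, lat):
    """Build a record from raw fields; the timestamp format is 'yyyymmdd hh:mm:ss'."""
    return GpsRecord(plate, datetime.strptime(stamp, "%Y%m%d %H:%M:%S"),
                     int(status), float(lon), float(lat))

def in_range(rec, t_start, t_end, lon_min, lon_max, lat_min, lat_max):
    """True if the record lies inside the chosen time window and bounding box."""
    return (t_start <= rec.timestamp.time() <= t_end
            and lon_min <= rec.lon <= lon_max
            and lat_min <= rec.lat <= lat_max)

# Hypothetical plate and bounding box, used only for illustration.
rec = parse_record("PLATE-1", "20171214 10:08:17", "1", "120.9648000", "31.4187010")
keep = in_range(rec, time(0, 0), time(23, 59, 59), 120.90, 121.00, 31.40, 31.45)
```

Filtering every record with `in_range` before trip identification keeps only the data inside the user-selected time and space range.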
S2, processing the original sample data, identifying the start point and end point of every trip of every taxi, and generating the taxi trip records. Specifically, the original sample data of any one taxi on any one day are extracted and sorted by time, and the records at which status changes are marked. In the embodiment there are two kinds of status change: if status changes to 1 after several consecutive records with status 0, the record marks the start of a trip of the taxi; conversely, if status changes to 0 after several consecutive records with status 1, the trip of the taxi has ended. The times in the start-point and end-point records are the start time and end time of the trip, respectively. The corresponding taxi trip record comprises the trip's start point, its end point and all records between them, and the GPS coordinates in these records are the points the taxi passed during the trip.
Table 1
License plate  Timestamp          Status  Longitude (lon)  Latitude (lat)
Threx exxx     20171214 10:07:20  0       120.9649050      31.4194000
Threx exxx     20171214 10:07:40  0       120.9649050      31.4188000
Threx exxx     20171214 10:08:00  0       120.9649050      31.4188000
Threx exxx     20171214 10:08:17  1       120.9648000      31.4187010
Threx exxx     20171214 10:09:00  1       120.9626000      31.4190030
Threx exxx     20171214 10:10:00  1       120.9650000      31.4196990
Threx exxx     20171214 10:11:00  1       120.9630000      31.4152000
Threx exxx     20171214 10:12:00  1       120.9541000      31.4144000
Threx exxx     20171214 10:13:00  1       120.9505000      31.4145980
Threx exxx     20171214 10:13:20  0       120.9495000      31.4145980
With reference to Table 1, taxi operation data from December 1, 2017 to May 31, 2018 are taken as the sample; the time range is the whole day, and the space range is the entire area within the middle ring, with an area of 26 km². Taking the data in Table 1 as the original sample data, the result is shown in Table 2: the six records of this taxi with status=1 form one complete trip, whose start time is 20171214 10:08:17 and end time is 20171214 10:13:00, with start-point coordinates (120.9648000, 31.4187010) and end-point coordinates (120.9505000, 31.4145980).
Table 2
License plate  Timestamp          Status  Longitude (lon)  Latitude (lat)
Threx exxx     20171214 10:08:17  1       120.9648000      31.4187010
Threx exxx     20171214 10:09:00  1       120.9626000      31.4190030
Threx exxx     20171214 10:10:00  1       120.9650000      31.4196990
Threx exxx     20171214 10:11:00  1       120.9630000      31.4152000
Threx exxx     20171214 10:12:00  1       120.9541000      31.4144000
Threx exxx     20171214 10:13:00  1       120.9505000      31.4145980
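The status-transition rule of step S2 can be sketched in a few lines, using the ten records of Table 1 as input. The function name `extract_trips` and the tuple layout are assumptions made for this illustration; following Table 2, the trip is taken to end at the last record with status=1.

```python
from datetime import datetime

FMT = "%Y%m%d %H:%M:%S"

def extract_trips(records):
    """records: (timestamp, status, lon, lat) tuples for one taxi on one day."""
    records = sorted(records, key=lambda r: r[0])
    trips, current, prev_status = [], None, None
    for rec in records:
        status = rec[1]
        if status == 1 and prev_status == 0:
            current = [rec]                      # 0 -> 1: a trip starts here
        elif status == 1 and current is not None:
            current.append(rec)                  # still carrying passengers
        elif status == 0 and prev_status == 1 and current is not None:
            trips.append(current)                # 1 -> 0: the trip has ended
            current = None
        prev_status = status
    return trips

rows = [  # the records of Table 1
    ("20171214 10:07:20", 0, 120.9649050, 31.4194000),
    ("20171214 10:07:40", 0, 120.9649050, 31.4188000),
    ("20171214 10:08:00", 0, 120.9649050, 31.4188000),
    ("20171214 10:08:17", 1, 120.9648000, 31.4187010),
    ("20171214 10:09:00", 1, 120.9626000, 31.4190030),
    ("20171214 10:10:00", 1, 120.9650000, 31.4196990),
    ("20171214 10:11:00", 1, 120.9630000, 31.4152000),
    ("20171214 10:12:00", 1, 120.9541000, 31.4144000),
    ("20171214 10:13:00", 1, 120.9505000, 31.4145980),
    ("20171214 10:13:20", 0, 120.9495000, 31.4145980),
]
records = [(datetime.strptime(ts, FMT), s, lon, lat) for ts, s, lon, lat in rows]
trips = extract_trips(records)
trip = trips[0]
travel_time_s = (trip[-1][0] - trip[0][0]).total_seconds()
```

For this input the single trip contains the six records of Table 2, and its OD travel time is the difference between 10:13:00 and 10:08:17. A trip already in progress at the first record of the day is dropped in this sketch, which is a design choice rather than something the patent specifies.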
S3, within the map range corresponding to the target area and according to the map scale, the space range is divided into C units, numbered $cell_j$, $j = 1, 2, \ldots, C$. This aggregates the trip origins and destinations of the taxis spatially and yields the OD travel times of taxis between units. Specifically, the spatial aggregation of trip origins and destinations supports the processing and analysis of taxi OD travel times within the space range, where an OD travel time is the travel time between a pair of units.
Preferably, the units of congruent shape are squares, regular hexagons or the like; when the sample data processed in step S2 are sufficiently numerous, there are multiple OD travel time samples between any two units.
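As a rough illustration of step S3, a point can be assigned to a cell of a regular rectangular grid as shown below. The function `cell_index`, the row-major numbering and the use of cells of equal size in degrees are assumptions of this sketch; the patent only requires units of congruent shape.

```python
def cell_index(lon, lat, lon_min, lat_min, lon_max, lat_max, n_cols, n_rows):
    """Return the 1-based cell number j of a point, or None if it lies outside the range."""
    if not (lon_min <= lon <= lon_max and lat_min <= lat <= lat_max):
        return None
    dx = (lon_max - lon_min) / n_cols
    dy = (lat_max - lat_min) / n_rows
    # Points exactly on the upper edges are clamped into the last column or row.
    col = min(int((lon - lon_min) / dx), n_cols - 1)
    row = min(int((lat - lat_min) / dy), n_rows - 1)
    return row * n_cols + col + 1   # row-major numbering, j = 1..C with C = n_cols * n_rows

# Illustrative 10 x 10 grid over an abstract 0..10 by 0..10 range.
first = cell_index(0.5, 0.5, 0.0, 0.0, 10.0, 10.0, 10, 10)
last = cell_index(10.0, 10.0, 0.0, 0.0, 10.0, 10.0, 10, 10)
outside = cell_index(11.0, 0.5, 0.0, 0.0, 10.0, 10.0, 10, 10)
```

Applying the same function to the start point and end point of each trip gives the origin unit and destination unit used in the following steps.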
S4, a designated unit is taken as the target unit for OD travel time reliability estimation, and the trip records whose end-point coordinates lie in the target unit are selected from the trip records. The OD travel time of each taxi trip is determined from the start time and end time in the trip record, and the OD travel times are divided into sample data sets, one for each unit in which trip origins lie. Each sample data set comprises a start unit, an end unit and the OD travel times $t_i$, and all samples in one data set share the same start unit and end unit.
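The grouping in step S4 amounts to a filter on the destination unit followed by a group-by on the origin unit. A minimal sketch, assuming each trip has already been reduced to a hypothetical tuple of origin cell, destination cell, start time and end time in seconds:

```python
from collections import defaultdict

def od_samples(trips, target_cell):
    """Group OD travel times by origin cell for trips ending in target_cell."""
    samples = defaultdict(list)
    for origin, dest, t_start, t_end in trips:
        if dest == target_cell:
            samples[origin].append(t_end - t_start)
    return dict(samples)

# Made-up trips, for illustration only.
trips = [
    (3, 7, 0, 300),
    (3, 7, 100, 520),
    (5, 7, 0, 600),
    (3, 8, 0, 100),   # ends in another cell, so it is ignored
]
datasets = od_samples(trips, target_cell=7)
```

Each value in `datasets` is one sample data set T for a fixed origin unit and the target unit.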
S5, taking both robustness and time complexity into account, the TTR of each sample data set is estimated by a robust statistical method. The TTR indexes comprise the central tendency CT, the dispersion D, the standard dispersion SD, the order statistic OS and the buffer index BI. The central tendency CT is characterized by the median of the OD travel time samples, $\mathrm{median}(T)$, where $\mathrm{median}(\cdot)$ denotes the median, $T = \{t_1, t_2, \ldots, t_i, \ldots, t_N\}$ is the sample data set, $N$ is the number of OD travel time samples and $i$ is the index of an OD travel time sample. The dispersion D is characterized by covariance [formula not reproduced], where $\bar{t} = \frac{1}{N}\sum_{i=1}^{N} t_i$ is the mean, i.e. the central tendency index of the classical statistical method, and $\sigma$ is the standard deviation [formula not reproduced]. The standard dispersion SD is defined by the quartile coefficient of dispersion:
$$\mathrm{QCD} = \frac{75^{th}(T) - 25^{th}(T)}{75^{th}(T) + 25^{th}(T)}$$
where $75^{th}(T)$ and $25^{th}(T)$ are the 75% and 25% quantiles, respectively. The order statistic OS takes the 90% quantile or the 95% quantile. The buffer index BI is defined by a formula in $90^{th}(T)$ [formula not reproduced], where $90^{th}(T)$ is the 90% quantile.
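A possible way to compute quantile-based indexes of this kind for one sample data set is sketched below. Several choices here are assumptions rather than the patent's formulas, which are not recoverable from this text: the median absolute deviation (MAD) is used as the dispersion measure because the embodiment plots MAD grids, the buffer index is taken relative to the median, and quantiles use linear interpolation between order statistics.

```python
import statistics

def percentile(sorted_vals, p):
    """p-th percentile (0-100) by linear interpolation between order statistics."""
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)

def robust_ttr(times):
    """Quantile-based TTR indexes for one OD sample data set (times in seconds)."""
    t = sorted(times)
    med = statistics.median(t)
    q25, q75, p90 = percentile(t, 25), percentile(t, 75), percentile(t, 90)
    return {
        "CT": med,                                            # median
        "MAD": statistics.median(abs(x - med) for x in t),    # assumed dispersion measure
        "QCD": (q75 - q25) / (q75 + q25),                     # quartile coefficient of dispersion
        "OS": p90,                                            # 90% quantile
        "BI": (p90 - med) / med,                              # assumed median-based buffer index
    }

# Synthetic sample with one abnormally long trip.
idx = robust_ttr([300, 320, 340, 360, 380, 400, 420, 440, 460, 480, 2400])
```

Sorting dominates the cost, so each data set is processed in O(N log N) time without any iterative fitting.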
In a specific embodiment, to confirm that reliability can be estimated for sample groups containing outliers with abnormally large travel times, several data sets are drawn from the processed travel time sample data for a robustness check before the TTR indexes are calculated. With reference to Table 3, the procedure is as follows:
The extracted data are divided into four groups. In group 1 the OD travel times are small and there are no abnormally large values; in group 2 most OD travel times are small and there are a few abnormally large values; in group 3 the OD travel times are large and there are no obvious abnormally large values; in group 4 the OD travel times are large and there are several obvious outliers. The OD travel time reliability of the four groups is estimated by both the classical statistical method and the robust statistical method; specifically, the TTR is measured by calculating the central tendency, dispersion, standard dispersion, order statistic and buffer index.
In the classical statistical method, the central tendency is characterized by the mean $\bar{t} = \frac{1}{N}\sum_{i=1}^{N} t_i$, where $t_i$ is a sample, $N$ is the sample size and $i$ is the sample index; the dispersion is characterized by the standard deviation $\sigma$ [formula not reproduced]; the standard dispersion is characterized by the median absolute deviation MAD, i.e. $\mathrm{MAD} = \mathrm{median}(|t_i - \mathrm{median}(T)|)$, where $T = \{t_1, t_2, \ldots, t_i, \ldots, t_N\}$ is the sample data set and $\mathrm{median}(\cdot)$ denotes the median; the order statistic typically takes the 90% quantile or the 95% quantile; the buffer index BI is defined by a formula in $90^{th}(T)$ [formula not reproduced], where $90^{th}(T)$ is the 90% quantile of the sample.
Table 3
[Table image not reproduced: TTR indexes of the four sample groups under classical and robust statistics]
The results for groups 1 and 3 show that, when there are no abnormally large travel time samples, classical statistics and the robust statistics of the method give similar results; when individual outlier samples with abnormally large values are present, however, the robust statistics are more robust and are therefore better suited to estimating taxi OD travel time reliability.
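The effect described here can be reproduced on a small synthetic sample. The numbers below are made up for illustration and are not the data of Table 3; the helper name `summary` is likewise an assumption.

```python
import statistics

base = [300, 310, 320, 330, 340, 350, 360, 370, 380, 390]  # seconds, synthetic
with_outliers = base + [1800, 2400]                         # two detour-like trips

def summary(t):
    """Classical (mean, std) and robust (median, MAD) summaries of one sample."""
    med = statistics.median(t)
    return {
        "mean": statistics.mean(t),
        "std": statistics.pstdev(t),
        "median": med,
        "mad": statistics.median(abs(x - med) for x in t),
    }

a, b = summary(base), summary(with_outliers)
mean_shift = b["mean"] - a["mean"]        # classical central tendency moves a lot
median_shift = b["median"] - a["median"]  # robust central tendency barely moves
```

Adding two long trips shifts the mean by several minutes and inflates the standard deviation, while the median and MAD change only slightly.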
Further, since a lognormal mixture model (LMM) can be used to fit the OD travel time distribution, in the embodiment the TTR indexes of the four sample groups are also estimated after LMM fitting, using the reliability indexes of classical statistics. The mean is calculated as $\bar{t} = \sum_{k=1}^{K} w_k \mu_k$, where $K$ is the number of distributions in the mixture model, $w_k$ is the weight of the k-th distribution and $\mu_k$ is the mean of the k-th distribution; the standard deviation is calculated as $\sigma = \sqrt{\sum_{k=1}^{K} w_k \left[\sigma_k^2 + (\mu_k - \bar{t})^2\right]}$, where $\sigma_k$ is the standard deviation of the k-th distribution. MAD, QCD and BI can then be calculated from these.
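For a mixture that has already been fitted, the mean and standard deviation follow from the component moments by the standard mixture identities. The sketch below assumes each lognormal component is described by the mean `m` and standard deviation `s` of the log travel time; the EM fitting itself is not shown, and the function names and parameter values are hypothetical.

```python
import math

def lognormal_moments(m, s):
    """Mean and variance of a lognormal with log-mean m and log-standard-deviation s."""
    mean = math.exp(m + s * s / 2.0)
    var = (math.exp(s * s) - 1.0) * math.exp(2.0 * m + s * s)
    return mean, var

def mixture_mean_std(weights, params):
    """Mean and standard deviation of a mixture with the given weights and (m, s) pairs."""
    moments = [lognormal_moments(m, s) for m, s in params]
    mean = sum(w * mu for w, (mu, _) in zip(weights, moments))
    second = sum(w * (var + mu * mu) for w, (mu, var) in zip(weights, moments))
    return mean, math.sqrt(second - mean * mean)

# Hypothetical two-component fit: typical trips plus a small share of long detours.
mix_mean, mix_std = mixture_mean_std([0.9, 0.1], [(5.8, 0.2), (7.2, 0.3)])
```

The second moment form used in the code, the weighted sum of variance plus squared mean minus the squared mixture mean, is algebraically the same as summing the component variances and the squared offsets of the component means from the mixture mean.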
Table 4: Comparison of LMM results with robust statistics results
[Table image not reproduced]
As the results in Table 4 show, both LMM and robust statistics are considerably robust to abnormally large samples. The time complexity of the two methods is then analyzed, using big-O notation. The LMM method relies on the EM algorithm, whose time complexity is O(N × K × I), where I is the number of iterations, taken as 100 (the value depends on the sample size: the larger the sample, the larger the value). The LMM method therefore incurs an additional O(N × K × I) on top of the index calculation required by classical and robust statistics, so robust statistics has a lower computational cost than the LMM method.
S6, the TTR estimates are associated with the gridded space range through visualization. Specifically, all units are defined as filling objects and a gradient color palette is constructed for them; the color mapping is defined over the theoretical value interval of the chosen TTR index, and each object is filled according to its estimate.
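The color mapping of step S6 can be illustrated with a linear two-color gradient. The palette end points, the function names and the use of `None` for units without an estimate are assumptions of this sketch.

```python
def lerp_color(c0, c1, frac):
    """Linear interpolation between two RGB triples."""
    return tuple(round(a + (b - a) * frac) for a, b in zip(c0, c1))

def ttr_color(value, vmin, vmax, low=(255, 255, 178), high=(189, 0, 38)):
    """Map a TTR estimate onto a two-colour gradient; None marks an unfilled unit."""
    if value is None:
        return None
    # Values outside the theoretical interval [vmin, vmax] are clamped to its ends.
    frac = (min(max(value, vmin), vmax) - vmin) / (vmax - vmin)
    return "#{:02x}{:02x}{:02x}".format(*lerp_color(low, high, frac))

low_hex = ttr_color(0.0, 0.0, 1.0)
high_hex = ttr_color(1.0, 0.0, 1.0)
unfilled = ttr_color(None, 0.0, 1.0)
```

Units that return `None` are the unfilled units that the later interpolation steps try to complete.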
S7, spatial interpolation is performed by the ordinary Kriging method based on the TTR estimates of the filled units, and the variance of the interpolated TTR is calculated.
S8, a critical value λ is set: if the variance is smaller than λ, the corresponding unit is filled with the interpolated value; otherwise, no interpolation is performed.
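Steps S7 and S8 can be sketched with a small pure-Python ordinary Kriging solver. The exponential semivariogram and its sill and range values are assumptions (in practice they would be fitted to the data), and the function names are illustrative only.

```python
import math

def variogram(h, sill=1.0, rng=3.0):
    """Exponential semivariogram without nugget (an assumed model)."""
    return sill * (1.0 - math.exp(-h / rng))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def ordinary_kriging(points, values, target):
    """Return (estimate, kriging variance) at target from known cell centres."""
    n = len(points)
    a = [[variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])          # weights must sum to one
    g0 = [variogram(math.dist(p, target)) for p in points]
    sol = solve(a, g0 + [1.0])
    w, mu = sol[:n], sol[n]              # weights and Lagrange multiplier
    estimate = sum(wi * zi for wi, zi in zip(w, values))
    variance = sum(wi * gi for wi, gi in zip(w, g0)) + mu
    return estimate, variance

def fill_cell(points, values, target, lam):
    """Step S8: accept the interpolated value only if its variance is below lam."""
    est, var = ordinary_kriging(points, values, target)
    return est if var < lam else None

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]   # centres of filled units
vals = [10.0, 20.0, 30.0, 40.0]                          # their TTR estimates
est, var = ordinary_kriging(pts, vals, (0.5, 0.5))
```

The kriging variance grows with distance from the filled units, so the threshold λ keeps interpolated values only where enough nearby estimates support them.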
Through gridding, the region in step S2 is divided into 10000 rectangular grid units; the OD travel time reliability of the original sample data is estimated by the robust statistical method, and the final TTR index grid map is obtained through map visualization and the ordinary Kriging method. Step S6 produces the MAD grid map shown in FIG. 3, in which there are clearly too many unfilled units. After spatial interpolation by the ordinary Kriging method in step S7, the MAD grid map shown in FIG. 4 is obtained; comparison with FIG. 3 shows that the interpolation yields continuously filled units, so that a continuously varying TTR heat map can be drawn.
Referring to Fig. 2, based on the above OD travel time reliability estimation method, the embodiment of the present invention further provides a system applying the robust-statistics-based OD travel time reliability estimation method. Specifically, the system includes an interaction module, a data docking and preprocessing module, a map visualization module and a TTR estimation module.
The interaction module passes the date, time range and spatial range specified by the user to the data docking and preprocessing module and the map visualization module, and passes the TTR estimation index specified by the user and the number of the target unit to the TTR estimation module. Each taxi operation record comprises a license plate number, a timestamp, vehicle GPS longitude and latitude coordinates and a passenger-carrying state status; the TTR indexes comprise central tendency CT, dispersion D, standard dispersion SD, order statistic OS and buffer index BI.
The data docking and preprocessing module retrieves, as instructed by the interaction module, the taxi operation data within the specified date, time range and spatial range, identifies the start point and end point of every trip of every taxi from the taxi operation data, and generates trip records; each trip record comprises the start point and end point of the trip and all records in between, and the GPS coordinates in each record are the points the vehicle passes through during the trip.
The map visualization module grids the map area within the spatial range, as instructed by the interaction module, to obtain the units; defines all units as fill objects and constructs a gradient color palette; defines a color mapping for each TTR estimation index according to its theoretical value interval; receives the TTR indexes from the TTR estimation module and fills the objects; and performs spatial interpolation by ordinary kriging based on the TTR index values of the filled units.
The TTR estimation module receives the trip records from the data docking and preprocessing module, the map gridding data from the map visualization module and the target unit number from the interaction module; screens out the trip records whose end point coordinates fall within the target unit; calculates the OD travel time of each trip record, grouped by the unit to which its start point coordinates belong; and, using the TTR index selected through the interaction module, estimates the TTR indexes by the robust statistical method. The TTR indexes specifically comprise central tendency CT, dispersion D, standard dispersion SD, order statistic OS and buffer index BI, wherein:
the central tendency CT is characterized by the median of the OD travel time samples, median(T), where median(·) denotes the median, T = {t_1, t_2, …, t_i, …, t_N} is the sample data set, N is the number of OD travel time samples, and i is the index of an OD travel time sample;
the dispersion D is characterized by covariance, i.e.
(formula image GDA0004040277490000151)
where (symbol image GDA0004040277490000152) is the mean, the central tendency index in the classical statistical method, (formula image GDA0004040277490000153); σ is the standard deviation, (formula image GDA0004040277490000154);
the standard dispersion SD is defined by the quantile deviation coefficient:
(formula image GDA0004040277490000161)
where 75th(T) and 25th(T) are the 75% and 25% quantiles, respectively;
the order statistic OS takes the 90% quantile or the 95% quantile;
the buffer index BI is defined as:
(formula image GDA0004040277490000162)
where 90th(T) is the 90% quantile.
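The quantities that the text states explicitly (the median for CT, the 90% or 95% quantile for OS, and the 25% and 75% quantiles entering SD) can be computed as in the sketch below. The formulas for D, SD and BI are given only as images and are deliberately not reproduced here. The linear-interpolation quantile rule and the names `quantile` and `robust_ttr_inputs` are assumptions; the text does not state which quantile convention is used.

```python
import statistics

def quantile(samples, p):
    """p-quantile (0 <= p <= 1), linear interpolation between order statistics.

    Assumed convention; other quantile definitions give slightly different values.
    """
    s = sorted(samples)
    pos = p * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def robust_ttr_inputs(samples):
    """Median and quantiles used by the robust TTR indexes."""
    return {
        "CT": statistics.median(samples),  # central tendency
        "Q25": quantile(samples, 0.25),    # 25th(T)
        "Q75": quantile(samples, 0.75),    # 75th(T)
        "OS90": quantile(samples, 0.90),   # order statistic, 90% quantile
        "OS95": quantile(samples, 0.95),   # order statistic, 95% quantile
    }

# Hypothetical OD travel time samples in seconds for one OD pair.
T = list(range(100, 201, 10))
print(robust_ttr_inputs(T))
```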
Preferably, the interaction module implements interactive operation of the electronic map by calling functional interfaces provided by the map engine, and obtains information such as the spatial range and the target unit from the front end through the user's interaction with the electronic map. The map visualization module renders the electronic map by calling functional interfaces provided by the map engine.
The invention relates to an OD travel time reliability estimation method and system based on robust statistics. The system comprises an interaction module, a data docking and preprocessing module, a map visualization module and a TTR estimation module. First, taxi operation data for the specified dates, such as license plate number, timestamp, vehicle GPS longitude and latitude coordinates and passenger-carrying status, are acquired and passed through the interaction module to the data docking and preprocessing module, which processes them into taxi trip records. Meanwhile, the interaction module sends a gridding instruction to the map visualization module, which divides the spatial range into units of congruent shape. The TTR indexes obtained by the TTR estimation module through the robust statistical method are used by the map visualization module to fill the units, after which spatial interpolation is performed by kriging, finally achieving reliability estimation of the OD travel time. Compared with the prior art, the method is robust to outlier samples whose travel time values differ greatly, has the same time complexity as the classical statistical method, and reduces the overall computation time.
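The preprocessing and sampling stages summarized above can be sketched as follows, assuming each record is a tuple `(timestamp, lon, lat, status)` with status 0 for empty and 1 for carrying passengers. The tuple layout, the function names `extract_trips` and `od_samples`, and the `cell_of` callable are hypothetical. The record at which status changes is taken as the trip start or end, and a trip already in progress at the first record is ignored because its start is not observed.

```python
from collections import defaultdict

def extract_trips(records):
    """Split one taxi's records for one day into trips via status transitions."""
    trips, current, prev_status = [], None, None
    for rec in sorted(records, key=lambda r: r[0]):  # chronological order
        status = rec[3]
        if prev_status == 0 and status == 1:
            current = [rec]                 # 0 -> 1: trip starts
        elif prev_status == 1 and status == 0 and current is not None:
            current.append(rec)             # 1 -> 0: trip ends
            trips.append(current)
            current = None
        elif status == 1 and current is not None:
            current.append(rec)             # passing point within the trip
        prev_status = status
    return trips

def od_samples(trips, cell_of, target_cell):
    """Group OD travel times by origin unit for trips ending in target_cell."""
    samples = defaultdict(list)
    for trip in trips:
        start, end = trip[0], trip[-1]
        if cell_of(end[1], end[2]) == target_cell:
            origin = cell_of(start[1], start[2])
            samples[origin].append(end[0] - start[0])  # travel time
    return dict(samples)

# Hypothetical records: (timestamp in s, lon, lat, status).
records = [
    (0, 0.1, 0.1, 0), (10, 0.1, 0.1, 1), (20, 0.5, 0.5, 1),
    (30, 1.5, 1.5, 0), (40, 1.5, 1.5, 0),
]
unit_of = lambda lon, lat: (int(lon), int(lat))  # toy 1 x 1 units
trips = extract_trips(records)
print(od_samples(trips, unit_of, target_cell=(1, 1)))
```

Each list in the result is one sample data set T for a fixed origin unit and the target unit, ready for TTR estimation.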
Although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that the foregoing embodiments may be modified, or some of their features replaced by equivalents. All equivalent structures made using the content of the specification and drawings of the invention, whether applied directly or indirectly in other related technical fields, likewise fall within the scope of protection of the invention.

Claims (7)

1. An OD travel time reliability estimation method based on robust statistics, characterized in that the method is applied to taxis: taxi operation data are processed to obtain OD travel time samples, from which the TTR is estimated by a robust statistical method; the TTR is then visualized within the target area with spatial interpolation; the method comprises the following steps:
S1, preparing the original sample data: acquiring taxi operation data for specified dates, wherein each taxi operation record comprises a license plate number, a timestamp, vehicle GPS longitude and latitude coordinates and a passenger-carrying state status, where status=0 represents empty and status=1 represents carrying passengers; and
determining the range of the original sample data: the range comprises a time range and a spatial range, wherein the time range is any time interval within 0:00-23:59, and the spatial range is any longitude and latitude coordinate range within a specified road network;
S2, processing the original sample data, identifying the start point and end point of every trip of every taxi, and generating taxi trip records: extracting the original sample data of any taxi on any day, sorting them chronologically, and marking the records at which status changes; if status changes to 1 after several consecutive records with status 0, the record is marked as the start of a trip of the taxi; conversely, if status changes to 0 after several consecutive records with status 1, the record is marked as the end of the trip;
S3, within the map extent corresponding to the target area, dividing the spatial range into C units of congruent shape according to the map scale, the units being numbered cell_j, j = 1, 2, …, C, so as to spatially aggregate the trip start points and trip end points of the taxis;
S4, taking a specified unit as the target unit for OD travel time reliability estimation, and screening from the trip records those whose end point coordinates lie within the target unit; determining the OD travel time of the taxi from the start time and end time in each trip record, and partitioning the OD travel times into sample data sets according to the unit in which the start point of each trip record lies;
S5, estimating the TTR for each sample data set by a robust statistical method, taking robustness and time complexity into consideration;
S6, associating the estimated TTR values obtained in S5 with the units through visualization:
defining all the units as fill objects and constructing a gradient color palette for them; defining a color mapping according to the theoretical value interval of each TTR index, and filling the objects according to the estimated values;
S7, performing spatial interpolation by ordinary kriging based on the estimated TTR values of the filled units, and calculating the variance of the TTR after spatial interpolation;
S8, setting a critical value λ; if the variance is smaller than λ, the corresponding unit is filled with the spatially interpolated value; otherwise, no interpolation is performed.
2. The robust statistics-based OD travel time reliability estimation method according to claim 1, wherein in step S2, the times in the trip start point record and end point record are the start time and end time of the trip, respectively; the corresponding taxi trip record comprises the trip start point, the trip end point and all records in between, and the GPS coordinates in each record are the points the taxi passes through during the trip.
3. The robust statistics-based OD travel time reliability estimation method according to claim 1, wherein in step S4, the sample data set includes a start unit, an end unit and OD travel times t_i, and the start point and the end point are the same for all samples within each sample data set.
4. The robust statistics-based OD travel time reliability estimation method according to claim 1, wherein the TTR indexes include central tendency CT, dispersion D, standard dispersion SD, order statistic OS and buffer index BI; wherein:
the central tendency CT is characterized by the median of the OD travel time samples, median(T), where median(·) denotes the median, T = {t_1, t_2, …, t_i, …, t_N} is the sample data set, N is the number of OD travel time samples, and i is the index of an OD travel time sample;
the dispersion D is characterized by covariance, i.e.
(formula image FDA0004040277480000021)
where (symbol image FDA0004040277480000022) is the mean, the central tendency index in the classical statistical method, (formula image FDA0004040277480000023); σ is the standard deviation, (formula image FDA0004040277480000024);
the standard dispersion SD is defined by the quantile deviation coefficient:
(formula image FDA0004040277480000025)
where 75th(T) and 25th(T) are the 75% and 25% quantiles, respectively;
the order statistic OS takes the 90% quantile or the 95% quantile;
the buffer index BI is defined as:
(formula image FDA0004040277480000026)
where 90th(T) is the 90% quantile.
5. An OD travel time reliability estimation system based on robust statistics, applying the robust statistics-based OD travel time reliability estimation method according to any one of claims 1 to 4, characterized in that the system comprises an interaction module, a data docking and preprocessing module, a map visualization module and a TTR estimation module, wherein,
the interaction module is used for passing the date, time range and spatial range specified by the user to the data docking and preprocessing module and the map visualization module, and passing the TTR estimation index specified by the user and the number of the target unit to the TTR estimation module;
the data docking and preprocessing module is used for retrieving, as instructed by the interaction module, the taxi operation data within the specified date, time range and spatial range, identifying the start point and end point of every trip of every taxi from the taxi operation data, and generating trip records, wherein each trip record comprises the start point and end point of the trip and all records in between, and the GPS coordinates in each record are the points the vehicle passes through during the trip;
the map visualization module is used for gridding the map area within the spatial range, as instructed by the interaction module, to obtain the units; defining all the units as fill objects and constructing a gradient color palette; defining a color mapping for each TTR estimation index according to its theoretical value interval; receiving the TTR indexes from the TTR estimation module and filling the objects; and performing spatial interpolation by ordinary kriging based on the TTR index values of the filled units;
the TTR estimation module receives the trip records from the data docking and preprocessing module, the map gridding data from the map visualization module and the target unit number from the interaction module, and screens out the trip records whose end point coordinates fall within the target unit; calculates the OD travel time of each trip record, grouped by the unit to which its start point coordinates belong; and, using the TTR index selected through the interaction module, estimates the TTR indexes by the robust statistical method.
6. The robust statistics-based OD travel time reliability estimation system according to claim 5, wherein each taxi operation record comprises a license plate number, a timestamp, vehicle GPS longitude and latitude coordinates and a passenger-carrying state status; the TTR indexes comprise central tendency CT, dispersion D, standard dispersion SD, order statistic OS and buffer index BI.
7. The robust statistics-based OD travel time reliability estimation system according to claim 5, wherein the TTR indexes include central tendency CT, dispersion D, standard dispersion SD, order statistic OS and buffer index BI; wherein:
the central tendency CT is characterized by the median of the OD travel time samples, median(T), where median(·) denotes the median, T = {t_1, t_2, …, t_i, …, t_N} is the sample data set, N is the number of OD travel time samples, and i is the index of an OD travel time sample;
the dispersion D is characterized by covariance, i.e.
(formula image FDA0004040277480000041)
where (symbol image FDA0004040277480000042) is the mean, the central tendency index in the classical statistical method, (formula image FDA0004040277480000043); σ is the standard deviation, (formula image FDA0004040277480000044);
the standard dispersion SD is defined by the quantile deviation coefficient:
(formula image FDA0004040277480000045)
where 75th(T) and 25th(T) are the 75% and 25% quantiles, respectively;
the order statistic OS takes the 90% quantile or the 95% quantile;
the buffer index BI is defined as:
(formula image FDA0004040277480000046)
where 90th(T) is the 90% quantile.
CN201910291447.1A 2019-04-12 2019-04-12 OD travel time reliability estimation method and system based on robust statistics Active CN110083801B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910291447.1A CN110083801B (en) 2019-04-12 2019-04-12 OD travel time reliability estimation method and system based on robust statistics
PCT/CN2019/115223 WO2020206996A1 (en) 2019-04-12 2019-11-04 Method and system for estimating od travel time reliability based on robust statistics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910291447.1A CN110083801B (en) 2019-04-12 2019-04-12 OD travel time reliability estimation method and system based on robust statistics

Publications (2)

Publication Number Publication Date
CN110083801A CN110083801A (en) 2019-08-02
CN110083801B true CN110083801B (en) 2023-05-12

Family

ID=67414799

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910291447.1A Active CN110083801B (en) 2019-04-12 2019-04-12 OD travel time reliability estimation method and system based on robust statistics

Country Status (2)

Country Link
CN (1) CN110083801B (en)
WO (1) WO2020206996A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083801B (en) * 2019-04-12 2023-05-12 江苏智通交通科技有限公司 OD travel time reliability estimation method and system based on robust statistics
CN112734870B (en) * 2020-12-28 2023-12-05 威创集团股份有限公司 Thermodynamic diagram continuous dynamic evolution visualization method and system
CN113407906B (en) * 2021-05-13 2024-02-02 东南大学 Method for determining traffic travel distribution impedance function based on mobile phone signaling data
CN113379233B (en) * 2021-06-08 2023-02-28 重庆大学 Travel time reliability estimation method and device based on high-order moment
CN113673571A (en) * 2021-07-22 2021-11-19 华设设计集团股份有限公司 Taxi abnormal order identification method based on density clustering method
CN116343486B (en) * 2023-05-19 2023-07-25 中南大学 Expressway network group mobility identification method based on abnormal field

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09319992A (en) * 1996-05-27 1997-12-12 Matsushita Electric Ind Co Ltd Route information calculation device
CN106448165A (en) * 2016-11-02 2017-02-22 浙江大学 Road network travel time reliability evaluation method based on online booked car data
CN106529754A (en) * 2016-06-27 2017-03-22 江苏智通交通科技有限公司 Taxi operation condition assessment method based on big data analysis
CN106875314A (en) * 2017-01-31 2017-06-20 东南大学 A kind of Urban Rail Transit passenger flow OD method for dynamic estimation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102930718A (en) * 2012-09-20 2013-02-13 同济大学 Intermittent flow path section travel time estimation method based on floating car data and coil flow fusion
CN110083801B (en) * 2019-04-12 2023-05-12 江苏智通交通科技有限公司 OD travel time reliability estimation method and system based on robust statistics

Also Published As

Publication number Publication date
CN110083801A (en) 2019-08-02
WO2020206996A1 (en) 2020-10-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Address after: 211106 19 Su Yuan Avenue, Jiangning economic and Technological Development Zone, Nanjing, Jiangsu
Applicant after: JIANGSU ZHITONG TRANSPORTATION TECHNOLOGY Co.,Ltd.
Address before: 210006 3rd floor, Building E10, Chenguang 1865 Technology Creative Industry Park, 388 Yingtian Street, Qinhuai District, Nanjing, Jiangsu
Applicant before: JIANGSU ZHITONG TRANSPORTATION TECHNOLOGY Co.,Ltd.
GR01 Patent grant