CN116091115A - Network taxi sharing rate prediction method based on built environment and travel characteristics - Google Patents

Network taxi sharing rate prediction method based on built environment and travel characteristics

Publication number
CN116091115A
Authority
CN
China
Prior art keywords
travel
network
census
characteristic parameters
population
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310086926.6A
Other languages
Chinese (zh)
Inventor
杨鸿泰
罗鹏
黎朝敬
翟国聪
韩科
杨林川
刘昱岗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Guolan Zhongtian Environmental Technology Group Co ltd
Original Assignee
Sichuan Guolan Zhongtian Environmental Technology Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Guolan Zhongtian Environmental Technology Group Co ltd
Priority to CN202310086926.6A
Publication of CN116091115A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G06Q30/0283 Price estimation or determination
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Marketing (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Primary Health Care (AREA)
  • Tourism & Hospitality (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for predicting the carpooling rate of online ride-hailing services (network taxis) based on the built environment and travel characteristics. The method comprises: acquiring and preprocessing raw ride-hailing data; extracting travel characteristic parameters from the preprocessed data; acquiring socioeconomic data and built environment data, and extracting socioeconomic characteristic parameters and built environment characteristic parameters from them, respectively; performing a collinearity analysis on the extracted travel, socioeconomic and built environment characteristic parameters; constructing a ride-hailing carpooling rate prediction model and calibrating its parameters; and predicting the ride-hailing carpooling rate with the calibrated model. By building the prediction model on travel characteristics and built environment information, the method can effectively improve the accuracy of carpooling rate prediction, which in turn helps to raise the carpooling rate in the predicted area and to alleviate environmental pollution and traffic congestion.

Description

Network taxi sharing rate prediction method based on built environment and travel characteristics
Technical Field
The invention relates to the technical field of carpooling rate prediction for online ride-hailing services (network taxis), and in particular to a ride-hailing carpooling rate prediction method based on the built environment and travel characteristics.
Background
Online ride-hailing has been widely accepted by the public as a new travel mode because of its flexible service and high service quality. However, the growing number of ride-hailing trips also brings adverse effects such as air pollution and traffic congestion. Since 2014, ride-hailing companies have successively introduced a new form of service, carpooling. As a new form of shared travel, carpooling allows passengers to share the same trip (vehicle) with others at a lower cost, which can reduce pollution and relieve traffic congestion. The carpooling rate is the ratio of the number of carpool orders to the total number of ride-hailing orders in an area, and it reflects the degree to which ride-hailing trips in that area are shared. Predicting the carpooling rate and identifying the factors that strongly influence it can give government departments, transportation planning departments and ride-hailing operators a basis for formulating policies to raise the carpooling rate.
However, most existing schemes focus on carpool route selection and carpool order matching, and few address the prediction of the carpooling rate or the factors that affect it. As a result, government agencies and operators cannot determine which areas have low carpooling rates or how to raise them by improving the built environment and adjusting fares.
Disclosure of Invention
To address the above deficiencies in the prior art, the invention provides a ride-hailing carpooling rate prediction method based on the built environment and travel characteristics.
To achieve this object, the invention adopts the following technical solution:
A ride-hailing carpooling rate prediction method based on the built environment and travel characteristics comprises the following steps:
S1, acquiring raw ride-hailing data and preprocessing it;
S2, extracting travel characteristic parameters from the preprocessed raw ride-hailing data;
S3, acquiring socioeconomic data and built environment data, and extracting socioeconomic characteristic parameters and built environment characteristic parameters, respectively;
S4, performing a collinearity analysis on the extracted travel characteristic parameters, socioeconomic characteristic parameters and built environment characteristic parameters, and determining the optimized travel, socioeconomic and built environment characteristic parameters;
S5, constructing a ride-hailing carpooling rate prediction model from the optimized travel characteristic parameters, socioeconomic characteristic parameters and built environment characteristic parameters, and calibrating its parameters;
S6, predicting the ride-hailing carpooling rate with the calibrated prediction model.
Optionally, the raw ride-hailing data acquired in step S1 specifically include:
the trip ID, trip start time, trip end time, trip distance, trip fare, number of the census tract containing the trip origin, number of the census tract containing the trip destination, whether the passenger is willing to share the ride with other passengers, the number of passengers participating in carpooling in the trip, and the latitude and longitude of the center of the census tract containing the pickup point.
Optionally, preprocessing the raw ride-hailing data in step S1 specifically includes:
filling the missing values in the raw ride-hailing data;
and cleaning noise data from the filled raw ride-hailing data.
Optionally, filling the missing values in the raw ride-hailing data specifically includes:
plotting the census tract centroids of the trip origins and destinations as points, according to the latitude and longitude of the center of the census tract containing the pickup point;
intersecting the plotted points with the census tract map to determine the number of the census tract containing each trip origin and the number of the census tract containing each trip destination.
Optionally, cleaning noise data from the filled raw ride-hailing data specifically includes:
traversing the trip fare data in the raw ride-hailing data and deleting the records whose trip fare is zero;
traversing the trip distance data in the raw ride-hailing data and deleting the records whose trip distance is less than a trip distance threshold;
traversing the trip time data in the raw ride-hailing data and deleting the records whose trip time is less than a trip time threshold;
and traversing the data on the number of passengers participating in carpooling in each trip and selecting the records in which that number is greater than 1.
Optionally, step S2 specifically includes:
calculating, for each origin-destination (OD) pair in the raw ride-hailing data, the ratio of the standard deviation of the trip fares to the median trip fare;
filtering out all OD pairs for which this ratio is greater than a set threshold and, for the remaining OD pairs, calculating the ratio of the median fare of carpool trips to the median fare of non-carpool trips;
subtracting this ratio from 1 to obtain the trip fare discount, and calculating the median trip fare discount;
and calculating the median trip distance of each OD pair from the trip distances in the raw ride-hailing data.
Optionally, the socioeconomic characteristic parameters extracted in step S3 specifically include:
the proportion of females, the proportion of young people, the proportion of households without a vehicle, the proportion of White population, the proportion of Asian population, the proportion of Black population, the proportion of the population with a bachelor's degree or above, the proportion of low-income population, and the median annual household income, each measured within the census tract.
Optionally, the built environment characteristic parameters extracted in step S3 specifically include:
the population density, employment density, road density, airport presence variable, subway station density, distance to the city center, percentage of residential land, percentage of commercial land, land use mixed entropy, and violent crime density, each measured for the census tract.
Optionally, the ride-hailing carpooling rate prediction model constructed in step S5 is specifically:
Figure BDA0004069111070000041
where $y_i$ represents the average carpooling rate of all ride-hailing trips whose origin or destination is the i-th census tract, $\beta_0$ represents the overall intercept, $\beta_k$ represents the k-th fixed-effect coefficient, $K$ represents the total number of fixed-effect coefficients, $x_{ik}$ represents the optimized socioeconomic characteristic parameters, $f(\cdot)$ represents the TPS (thin plate spline) smoothing function, $x_{il}$ represents the built environment characteristic parameters of the i-th census tract as trip origin or destination, $x_{im}$ represents the optimized travel characteristic parameters, and $\varepsilon_i$ represents the error term.
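The formula itself survives only as an image reference in this text. One additive form consistent with the symbol definitions above is sketched here purely as an assumption: the sums over separate smooth terms, the per-term subscripts on $f$, and the upper limits $L$ and $M$ are not stated in the source.

```latex
y_i = \beta_0 + \sum_{k=1}^{K} \beta_k x_{ik}
      + \sum_{l=1}^{L} f_l\left(x_{il}\right)
      + \sum_{m=1}^{M} f_m\left(x_{im}\right)
      + \varepsilon_i
```

Under this reading, the socioeconomic parameters enter linearly as fixed effects, while the built environment and travel characteristic parameters enter through smooth functions.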
The invention has the following beneficial effects:
(1) The invention extracts travel, socioeconomic and built environment characteristic parameters from the raw ride-hailing data, the socioeconomic data and the built environment data, respectively, and optimizes them; by constructing a carpooling rate prediction model based on travel characteristics and built environment information and identifying the relevant influencing factors, it can effectively improve the accuracy of ride-hailing carpooling rate prediction.
(2) By constructing a carpooling rate prediction model based on travel characteristics and built environment information, the invention can determine the relevant influencing factors and provide a data basis for decisions aimed at raising the carpooling rate, thereby helping to raise the carpooling rate of the predicted area and to alleviate environmental pollution and traffic congestion.
Drawings
Fig. 1 is a flow chart of the ride-hailing carpooling rate prediction method based on the built environment and travel characteristics according to an embodiment of the invention.
Detailed Description
The following description of specific embodiments is provided to help those skilled in the art understand the invention. It should be understood, however, that the invention is not limited to the scope of these embodiments: to those of ordinary skill in the art, all variations that fall within the spirit and scope of the invention as defined in the appended claims, and all inventions that make use of the inventive concept, are protected.
As shown in Fig. 1, an embodiment of the invention provides a ride-hailing carpooling rate prediction method based on the built environment and travel characteristics, comprising the following steps S1 to S6:
S1, acquiring raw ride-hailing data and preprocessing it;
In an optional embodiment of the invention, the spatiotemporal range of the prediction area is first determined, and the raw ride-hailing data within that range are acquired, specifically including:
the trip ID, trip start time, trip end time, trip distance, trip fare, number of the census tract containing the trip origin, number of the census tract containing the trip destination, whether the passenger is willing to share the ride with other passengers, the number of passengers participating in carpooling in the trip, and the latitude and longitude of the center of the census tract containing the pickup point.
The raw ride-hailing data are then preprocessed, which specifically includes:
filling the missing values in the raw ride-hailing data;
and cleaning noise data from the filled raw ride-hailing data.
Filling the missing values in the raw ride-hailing data specifically includes:
plotting the census tract centroids of the trip origins and destinations as points, according to the latitude and longitude of the center of the census tract containing the pickup point;
intersecting the plotted points with the census tract map to determine the number of the census tract containing each trip origin and the number of the census tract containing each trip destination.
Cleaning noise data from the filled raw ride-hailing data specifically includes:
traversing the trip fare data in the raw ride-hailing data and deleting the records whose trip fare is zero;
traversing the trip distance data in the raw ride-hailing data and deleting the records whose trip distance is less than a trip distance threshold;
traversing the trip time data in the raw ride-hailing data and deleting the records whose trip time is less than a trip time threshold;
and traversing the data on the number of passengers participating in carpooling in each trip and selecting the records in which that number is greater than 1.
Specifically, this embodiment takes Chicago in the United States as the target city of the study. Chicago has 801 census tracts, and each census tract may be the origin (O) of one trip or the destination (D) of another. The time range is January 1 to May 31, 2019, and the data come from the Chicago data portal. The center coordinates of Chicago are 41°39′ N, 87°34′ W, and the city extends 40.23 kilometers from north to south and 40.14 kilometers from east to west.
The raw ride-hailing data are shown in Table 1.
Table 1 Raw ride-hailing dataset
Figure BDA0004069111070000071
Figure BDA0004069111070000081
The raw ride-hailing dataset is susceptible to noise and missing values and therefore needs to be preprocessed, mainly through missing value filling and noise data cleaning.
Missing value filling supplements the census tract numbers that are missing for trip origins and destinations in the trip data, so that a large amount of valid data is retained. This step can be completed in ArcGIS: first, the "Display XY Data" function is used to plot the census tract centroids of trip origins and destinations as points, using the latitude and longitude fields of the center of the census tract containing the pickup point; then the "Spatial Join" function is used to intersect these points with the map file of Chicago's census tracts, restoring the census tract numbers of the trip origins and destinations in the initial data.
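The same point-to-tract lookup can be illustrated outside ArcGIS. The sketch below is not the ArcGIS Spatial Join; it swaps in a plain ray-casting point-in-polygon test, treats coordinates as planar, and uses hypothetical tract IDs and square polygons chosen only for illustration.

```python
# Minimal sketch: assign a census tract ID to a point by ray casting.
# Tract IDs and polygons below are hypothetical; coordinates are treated as planar.

def point_in_polygon(lon, lat, polygon):
    """Return True if (lon, lat) lies inside polygon, a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def fill_tract_id(lon, lat, tracts):
    """Return the ID of the first tract polygon containing the point, else None."""
    for tract_id, polygon in tracts.items():
        if point_in_polygon(lon, lat, polygon):
            return tract_id
    return None


tracts = {
    "T1": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "T2": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
print(fill_tract_id(0.5, 0.5, tracts))
print(fill_tract_id(1.5, 0.25, tracts))
print(fill_tract_id(3.0, 3.0, tracts))
```

For real tract boundaries, a GIS library with proper projection handling would be the more robust choice; the sketch only shows the logic of the lookup.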
Noise data cleaning can be performed in Python. First, records with a fare of 0, a trip distance of less than 0.1 miles, or a trip time of less than 120 seconds are deleted. The initial data contain 40,130,321 records, and 40,124,669 records remain after cleaning. Then, according to the Trips Pooled variable, trips whose value is greater than 1 are defined as carpool trips and the remaining trips as solo trips; the carpool trips are separated out, giving a sample of 9,671,045 records.
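The cleaning rules above can be sketched in plain Python as follows. The thresholds are the ones stated in this embodiment; the field names (`fare`, `trip_miles`, `trip_seconds`, `trips_pooled`) and the sample records are hypothetical.

```python
# Minimal sketch of the noise-cleaning rules; field names are assumptions.
MIN_MILES = 0.1      # trip distance threshold stated in the embodiment
MIN_SECONDS = 120    # trip time threshold stated in the embodiment


def clean_trips(trips):
    """Drop records with zero fare, too-short distance, or too-short duration."""
    return [
        t for t in trips
        if t["fare"] != 0
        and t["trip_miles"] >= MIN_MILES
        and t["trip_seconds"] >= MIN_SECONDS
    ]


def split_by_pooling(trips):
    """Separate carpool trips (trips_pooled > 1) from the remaining trips."""
    pooled = [t for t in trips if t["trips_pooled"] > 1]
    solo = [t for t in trips if t["trips_pooled"] <= 1]
    return pooled, solo


sample = [
    {"fare": 10.0, "trip_miles": 3.2, "trip_seconds": 900, "trips_pooled": 2},
    {"fare": 0.0, "trip_miles": 1.0, "trip_seconds": 600, "trips_pooled": 1},
    {"fare": 7.5, "trip_miles": 0.05, "trip_seconds": 300, "trips_pooled": 1},
    {"fare": 5.0, "trip_miles": 1.1, "trip_seconds": 60, "trips_pooled": 1},
    {"fare": 12.5, "trip_miles": 4.0, "trip_seconds": 1200, "trips_pooled": 1},
]
cleaned = clean_trips(sample)
pooled, solo = split_by_pooling(cleaned)
print(len(cleaned), len(pooled), len(solo))
```

At the scale reported here (tens of millions of records), a dataframe library would likely be used instead of Python lists; the filtering conditions would stay the same.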
S2, extracting travel characteristic parameters according to the preprocessed network vehicle original data;
in an alternative embodiment of the invention, travel characteristic variables based on microscopic individual travel or macroscopic space aggregation analysis are processed by an initial network vehicle-about data set, and comprise individual travel distance, travel cost and the median of OD discount on travel distance and travel cost.
The step S2 specifically comprises the following steps:
calculating the ratio of the travel cost standard deviation to the travel cost median according to the travel cost of each travel start point and travel end point pair in the network about vehicle original data;
selecting all travel starting point and travel end point pairs with the ratio of the travel expense standard deviation to the travel expense median larger than a set threshold value, and calculating the ratio of the travel expense median when the car is in a carpool travel to the travel expense median when the car is not in a carpool travel;
calculating the difference value of the ratio of the travel expense median of the 1 and the travel expense median of the carpool and the travel expense median of the non-carpool to obtain a travel expense discount, and calculating the travel expense discount median;
and calculating the travel distance median of each travel starting point and travel end point pair according to the travel distance in the network about vehicle original data.
Specifically, the travel characteristic parameters in this embodiment include a travel distance median and a travel fee discount median, as shown in table 2.
Table 2 trip characteristic parameters
Figure BDA0004069111070000091
The median trip fare discount is calculated as follows. First, the median of the trip fares within each OD pair is calculated, followed by the standard deviation of those fares. The standard deviation is then divided by the median fare to give a ratio similar to the coefficient of variation, which reflects how far the observations deviate from the median. OD pairs with a ratio greater than 0.5 are regarded as highly dispersed and are excluded from the data used in this scheme. Next, for each remaining OD pair, the median carpool fare and the median non-carpool fare are taken, and their ratio is subtracted from 1 to obtain the fare discount. Finally, the median of the trip fare discounts is calculated.
Because the trip distance in the raw data is the true value without privacy processing, the median trip distance of each OD pair can be calculated directly with the groupby() function in Python.
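The two travel characteristic parameters described above can be sketched with the standard library alone. The 0.5 threshold is the one stated in this embodiment; the field names, the sample records, the use of the population standard deviation, and skipping OD pairs that lack either carpool or non-carpool trips are all assumptions made for illustration.

```python
# Minimal sketch: per-OD fare discount and median distance (assumed field names).
from collections import defaultdict
from statistics import median, pstdev

CV_THRESHOLD = 0.5  # dispersion threshold stated in the embodiment


def od_travel_features(trips):
    """Return {(origin, dest): (fare_discount, median_miles)} for retained OD pairs."""
    by_od = defaultdict(list)
    for t in trips:
        by_od[(t["origin"], t["dest"])].append(t)

    features = {}
    for od, rows in by_od.items():
        fares = [r["fare"] for r in rows]
        med = median(fares)
        # Ratio of standard deviation to median; population std is an assumption.
        if med == 0 or pstdev(fares) / med > CV_THRESHOLD:
            continue  # highly dispersed OD pair: excluded
        pooled = [r["fare"] for r in rows if r["pooled"]]
        solo = [r["fare"] for r in rows if not r["pooled"]]
        if not pooled or not solo:
            continue  # discount undefined without both trip types (assumption)
        discount = 1 - median(pooled) / median(solo)
        features[od] = (discount, median([r["miles"] for r in rows]))
    return features


trips = [
    {"origin": "A", "dest": "B", "fare": 10.0, "miles": 2.0, "pooled": False},
    {"origin": "A", "dest": "B", "fare": 10.0, "miles": 2.5, "pooled": False},
    {"origin": "A", "dest": "B", "fare": 7.5, "miles": 3.0, "pooled": True},
    {"origin": "A", "dest": "B", "fare": 7.5, "miles": 3.5, "pooled": True},
    {"origin": "B", "dest": "A", "fare": 20.0, "miles": 5.0, "pooled": False},
    {"origin": "B", "dest": "A", "fare": 20.0, "miles": 5.0, "pooled": False},
    {"origin": "B", "dest": "A", "fare": 16.0, "miles": 6.0, "pooled": True},
    {"origin": "B", "dest": "A", "fare": 16.0, "miles": 6.0, "pooled": True},
    {"origin": "A", "dest": "C", "fare": 2.5, "miles": 1.0, "pooled": True},
    {"origin": "A", "dest": "C", "fare": 30.0, "miles": 1.0, "pooled": False},
]
features = od_travel_features(trips)
median_discount = median([d for d, _ in features.values()])
print(features)
print(median_discount)
```

In this toy data the A to C pair is dropped because its fare dispersion ratio exceeds 0.5, so only two OD pairs contribute to the median discount.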
S3, obtaining social and economic data and built environment data, and respectively extracting social and economic characteristic parameters and built environment characteristic parameters;
in an optional embodiment of the present invention, the socioeconomic characteristic parameters extracted in this embodiment specifically include:
female population ratio in the census region, young population ratio in the census region, no-vehicle family ratio in the census region, white population ratio in the census region, asian population ratio in the census region, black population ratio in the census region, scholars' level in the census region and above population ratio, low income population ratio in the census region and median of family annual income in the census region.
The built environment data extracted in this embodiment specifically includes:
population density in a census area, employment density in a census area, road density in a census area, airport presence variables in a census area, subway station density in a census area, distance from a city center in a census area, residential area land percentage in a census area, commercial area land percentage in a census area, land use mixed entropy in a census area, and violent crime density in a census area.
Specifically, the present embodiment acquires socioeconomic data from the u.s.census office census data, and extracts therefrom socioeconomic characteristic parameters of female population ratio in census area, young population ratio in census area, non-vehicle family ratio in census area, white population ratio in census area, asian population ratio in census area, black population ratio in census area, having a scholars degree and above in census area, low-income population ratio in census area, and median annual income of family in census area, as shown in table 3.
TABLE 3 socioeconomic profile
Figure BDA0004069111070000111
Percentage-type socioeconomic characteristic parameters, such as the proportion of females and the proportion of young people, can be calculated with a division formula in Excel; the specific formulas are given in the parameter definitions and descriptions in Table 3.
Among the built environment data acquired in this embodiment, population data are obtained from the Census Bureau. Employment data are obtained from the U.S. Longitudinal Employer-Household Dynamics (LEHD) program, which combines federal, state and Census Bureau data to provide local employment dynamics. This embodiment uses workplace-based job data aggregated at the census block level, and then aggregates the employment data to census tract units by their unique GEOID numbers. Road data, subway station data, bus stop data and violent crime data are all obtained from the Chicago data portal website. Land use map files are downloaded from the Illinois State Government website.
Among the density-type parameters, population density and employment density are obtained by dividing the population and the number of jobs in each census tract by the area of that tract. The tract area is the geometric area calculated under the WGS 1984 World Mercator projected coordinate system. The other density parameters, namely road density, station density and violent crime density, require spatial statistics. The shapefiles of roads, bus stops, subway stations and violent crimes, all in the projected coordinate system, are imported into the ArcGIS platform together with the shapefile of the Chicago census tracts; the total road length, the number of stations and the number of violent crimes in each census tract are obtained through spatial join and summary statistics, and each is divided by the tract area to obtain the corresponding density parameter.
For the percentage-type parameters, the share of each land use type is obtained by dividing the area of that land use type by the total area of the census tract. In practice, the land use shapefile and the census tract shapefile are imported into the ArcGIS platform, and the land use layer is split along the tract boundaries with the Intersect function to obtain the area of each land use type in each census tract. Once the percentages of the land use types are known, the diversity of land use can be calculated; it is characterized by an entropy index, and the land use mixed entropy of each tract is calculated with the following formula.
Figure BDA0004069111070000121
where $P_j$ represents the percentage of area occupied by the j-th land use type in each tract and $J$ represents the number of land use types in the tract, of which there are 5 in this embodiment. The land use mixed entropy ranges from 0 to 1: a value closer to 0 indicates that land use in the tract is more homogeneous, and a value closer to 1 indicates a higher degree of mixing, with the shares of the various land use types close to one another.
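The entropy formula itself is only an image reference in this text. The sketch below assumes the commonly used normalized Shannon entropy, $H = -\sum_{j=1}^{J} P_j \ln P_j / \ln J$, which matches the stated 0 to 1 range and the symbols $P_j$ and $J$; whether this is the exact form in the original figure is an assumption.

```python
# Minimal sketch: normalized land-use entropy (assumed form, see lead-in).
import math


def land_use_entropy(areas):
    """Normalized entropy of land-use areas: 0 = single use, 1 = perfectly even mix."""
    n_types = len(areas)
    total = sum(areas)
    if n_types < 2 or total <= 0:
        return 0.0
    h = 0.0
    for a in areas:
        p = a / total
        if p > 0:  # the term 0 * ln(0) is taken as 0
            h -= p * math.log(p)
    return h / math.log(n_types)


single_use = land_use_entropy([1.0, 0.0, 0.0, 0.0, 0.0])   # one land-use type only
even_mix = land_use_entropy([0.2, 0.2, 0.2, 0.2, 0.2])     # five equal shares
print(single_use, even_mix)
```

A tract with a single land use type scores 0, and a tract whose five types have equal shares scores approximately 1.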
The distance from a tract to the city center, a distance-type parameter, characterizes destination accessibility. The central business district is commonly regarded as the city center of Chicago. The geometric centroids of the central business district and of each census tract are calculated with the Feature To Point geoprocessing tool in ArcGIS, and the distance from each census tract to the city center is then measured with the Near analysis tool.
Through the above processing, the following built environment characteristic parameters are obtained for each census tract: population density, employment density, road density, airport presence variable, subway station density, distance to the city center, percentage of residential land, percentage of commercial land, land use mixed entropy, and violent crime density, as shown in Table 4.
Table 4 Built environment characteristic parameters
Figure BDA0004069111070000131
Figure BDA0004069111070000141
S4, performing collinearity analysis on the extracted travel characteristic parameters, socioeconomic characteristic parameters and built environment characteristic parameters, and determining the optimized travel characteristic parameters, socioeconomic characteristic parameters and built environment characteristic parameters;
In an optional embodiment of the present invention, when two or more characteristic parameters in the model are highly correlated, a serious multicollinearity problem arises among them. The parameter estimates then become biased, which affects the reliability of the model results. It is therefore necessary to check the correlation among the characteristic parameters before estimating the model parameters. In this embodiment, the correlation among the characteristic parameters is checked using the variance inflation factor (VIF), which can be expressed as:
$$\mathrm{VIF}_i = \frac{1}{\mathrm{Tolerance}_i} = \frac{1}{1 - R_i^2}$$
where $\mathrm{Tolerance}_i$ is the tolerance of explanatory variable $i$, and $R_i^2$ is the coefficient of determination between the i-th explanatory variable and the other explanatory variables; the more strongly the variable is correlated with the other explanatory variables, the higher this value. In general, a VIF value greater than 10 indicates a serious multicollinearity problem among the explanatory variables, and explanatory variables with VIF values greater than 10 must be removed step by step until the VIF values of all explanatory variables are less than 10.
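The stepwise screening described above can be sketched with NumPy. This is a minimal illustration under stated assumptions: each VIF is taken as $1/(1 - R_i^2)$, where $R_i^2$ comes from an ordinary least-squares regression of one explanatory variable on the others with an intercept; the variable with the largest VIF is dropped first; and no column is constant. The function names and the removal order are assumptions and are not specified in the embodiment.

```python
import numpy as np

def vif(X):
    """Return the variance inflation factor of each column of X (samples x features)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for i in range(p):
        y = X[:, i]
        others = np.delete(X, i, axis=1)
        A = np.column_stack([np.ones(n), others])      # intercept + other variables
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # ordinary least squares
        resid = y - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())    # assumes a non-constant column
        r2 = 1.0 - ss_res / ss_tot
        out[i] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out

def drop_collinear(X, names, threshold=10.0):
    """Remove the highest-VIF variable repeatedly until no VIF exceeds the threshold."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        values = vif(X)
        worst = int(np.argmax(values))
        if values[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        del names[worst]
    return X, names
```

Dropping one variable at a time and recomputing matters because removing a single variable can lower the VIF values of the variables that remain.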
S5, constructing a network taxi sharing rate prediction model according to the optimized travel characteristic parameters, socioeconomic characteristic parameters and built environment characteristic parameters, and performing parameter calibration;
In an optional embodiment of the present invention, the network taxi sharing rate prediction model constructed from the optimized travel characteristic parameters, socioeconomic characteristic parameters and built environment characteristic parameters is specifically:
Figure BDA0004069111070000151
where $y_i$ represents the average sharing rate of all network taxi trips whose travel starting point or travel end point is the i-th census region, $\beta_0$ represents the overall intercept, $\beta_k$ represents the k-th fixed effect coefficient, $K$ represents the total number of fixed effect coefficients, $x_{ik}$ represents the optimized socioeconomic characteristic parameters, $f(\cdot)$ represents the TPS smoothing function, $x_{il}$ represents the built environment characteristic parameters of the i-th census region taken as the travel starting point or the travel end point, $x_{im}$ represents the optimized travel characteristic parameters, and $\varepsilon_i$ represents the error term.
The travel characteristic parameters, socioeconomic characteristic parameters and built environment characteristic parameters determined in steps S2 to S4 are then entered into the model, and the parameters of the model are determined. The parameter calibration results for the dependent and independent variables of the final model are shown in Table 5.
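One form consistent with these symbol definitions is the additive structure written out below. It is only a hedged reading: the sums over $l$ and $m$, the per-variable smooth terms $f_l$ and $f_m$, and the absence of a link function are assumptions, since the definitions alone do not fix them.

```latex
y_i = \beta_0
    + \sum_{k=1}^{K} \beta_k x_{ik}   % linear (fixed effect) terms: socioeconomic parameters
    + \sum_{l} f_l(x_{il})            % smooth terms: built environment parameters (assumed)
    + \sum_{m} f_m(x_{im})            % smooth terms: travel parameters (assumed)
    + \varepsilon_i
```

Under this reading the socioeconomic parameters enter linearly through the coefficients $\beta_k$, while the built environment and travel parameters enter through smoothing functions, which allows nonlinear relationships with the sharing rate.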
TABLE 5 model parameter calibration results
Figure BDA0004069111070000152
Figure BDA0004069111070000161
Note: s() represents a smoothing function; significance levels: p < 0.05, p < 0.01, p < 0.0001.
S6, predicting the network taxi sharing rate by using the parameter-calibrated network taxi sharing rate prediction model.
In an optional embodiment of the present invention, the network taxi raw data within the spatio-temporal range to be predicted are processed through steps S2 to S4 to obtain the current travel characteristic parameters, socioeconomic characteristic parameters and built environment characteristic parameters, so that the parameter-calibrated network taxi sharing rate prediction model can predict the current network taxi sharing rate from these parameters.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principles and embodiments of the present invention have been described in detail with reference to specific examples, which are provided only to facilitate understanding of the method and core ideas of the present invention. Meanwhile, those skilled in the art may, in accordance with the ideas of the present invention, make changes to the specific embodiments and the scope of application. In view of the above, this description should not be construed as limiting the present invention.
Those of ordinary skill in the art will recognize that the embodiments described herein are intended to help the reader understand the principles of the present invention, and it should be understood that the scope of the invention is not limited to such specific statements and embodiments. Those of ordinary skill in the art can make various other specific modifications and combinations from the teachings of the present disclosure without departing from the spirit thereof, and such modifications and combinations remain within the scope of the present disclosure.

Claims (9)

1. A method for predicting the network taxi sharing rate based on the built environment and travel characteristics, characterized by comprising the following steps:
S1, acquiring and preprocessing network taxi raw data;
S2, extracting travel characteristic parameters from the preprocessed network taxi raw data;
S3, obtaining socioeconomic data and built environment data, and extracting socioeconomic characteristic parameters and built environment characteristic parameters respectively;
S4, performing collinearity analysis on the extracted travel characteristic parameters, socioeconomic characteristic parameters and built environment characteristic parameters, and determining the optimized travel characteristic parameters, socioeconomic characteristic parameters and built environment characteristic parameters;
S5, constructing a network taxi sharing rate prediction model according to the optimized travel characteristic parameters, socioeconomic characteristic parameters and built environment characteristic parameters, and performing parameter calibration;
S6, predicting the network taxi sharing rate by using the parameter-calibrated network taxi sharing rate prediction model.
2. The method for predicting the network taxi sharing rate based on the built environment and travel characteristics according to claim 1, wherein the network taxi raw data acquired in step S1 specifically comprise:
the travel number, travel start time, travel end time, travel distance, travel cost, number of the census region where the travel starting point is located, number of the census region where the travel end point is located, whether the passenger is willing to share the ride with other passengers, the number of passengers participating in the carpool during the trip, the latitude of the center of the census region where the pickup location is located, and the longitude of the center of the census region where the pickup location is located.
3. The method for predicting the network taxi sharing rate based on the built environment and travel characteristics according to claim 1 or 2, wherein the preprocessing of the network taxi raw data in step S1 specifically comprises:
filling the missing values of the network taxi raw data;
and cleaning noise data from the filled network taxi raw data.
4. The method for predicting the network taxi sharing rate based on the built environment and travel characteristics according to claim 3, wherein filling the missing values of the network taxi raw data specifically comprises:
plotting the census region centers at the travel starting point and the travel end point as points, according to the latitude and the longitude of the center of the census region where the pickup location is located;
intersecting the plotted points with the census region map, and determining the number of the census region where the travel starting point is located and the number of the census region where the travel end point is located.
5. The method for predicting the network taxi sharing rate based on the built environment and travel characteristics according to claim 3, wherein cleaning noise data from the filled network taxi raw data specifically comprises:
traversing the travel cost data in the network taxi raw data, and deleting the records whose travel cost is zero;
traversing the travel distance data in the network taxi raw data, and deleting the records whose travel distance is smaller than a travel distance threshold;
traversing the travel time data in the network taxi raw data, and deleting the records whose travel time is smaller than a travel time threshold;
and traversing the data on the number of passengers participating in the carpool during the trip in the network taxi raw data, and selecting the records in which the number of passengers participating in the carpool is greater than 1.
6. The method for predicting the network taxi sharing rate based on the built environment and travel characteristics according to claim 1, wherein step S2 specifically comprises:
calculating the ratio of the travel cost standard deviation to the travel cost median according to the travel cost of each travel starting point and travel end point pair in the network taxi raw data;
selecting all travel starting point and travel end point pairs whose ratio of travel cost standard deviation to travel cost median is larger than a set threshold, and calculating the ratio of the median travel cost of carpool trips to the median travel cost of non-carpool trips;
calculating the difference between 1 and the ratio of the median travel cost of carpool trips to the median travel cost of non-carpool trips to obtain the travel cost discount, and calculating the median travel cost discount;
and calculating the median travel distance of each travel starting point and travel end point pair according to the travel distance in the network taxi raw data.
7. The method for predicting the network taxi sharing rate based on the built environment and travel characteristics according to claim 1, wherein the socioeconomic characteristic parameters extracted in step S3 specifically comprise:
the proportion of females in the census region, the proportion of young people in the census region, the proportion of households without a vehicle in the census region, the proportion of White population in the census region, the proportion of Asian population in the census region, the proportion of Black population in the census region, the proportion of the population with a bachelor's degree or above in the census region, the proportion of low-income population in the census region, and the median annual household income in the census region.
8. The method for predicting the network taxi sharing rate based on the built environment and the travel characteristics according to claim 1, wherein the built environment characteristic parameters extracted in the step S3 specifically comprise:
population density in a census area, employment density in a census area, road density in a census area, airport presence variables in a census area, subway station density in a census area, distance from a city center in a census area, residential area land percentage in a census area, commercial area land percentage in a census area, land use mixed entropy in a census area, and violent crime density in a census area.
9. The method for predicting the network taxi sharing rate based on the built environment and travel characteristics according to claim 1, wherein the network taxi sharing rate prediction model in step S5 is specifically:
Figure FDA0004069111050000041
where $y_i$ represents the average sharing rate of all network taxi trips whose travel starting point or travel end point is the i-th census region, $\beta_0$ represents the overall intercept, $\beta_k$ represents the k-th fixed effect coefficient, $K$ represents the total number of fixed effect coefficients, $x_{ik}$ represents the optimized socioeconomic characteristic parameters, $f(\cdot)$ represents the TPS smoothing function, $x_{il}$ represents the built environment characteristic parameters of the i-th census region taken as the travel starting point or the travel end point, $x_{im}$ represents the optimized travel characteristic parameters, and $\varepsilon_i$ represents the error term.
CN202310086926.6A 2023-01-29 2023-01-29 Network taxi sharing rate prediction method based on built environment and travel characteristics Pending CN116091115A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310086926.6A CN116091115A (en) 2023-01-29 2023-01-29 Network taxi sharing rate prediction method based on built environment and travel characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310086926.6A CN116091115A (en) 2023-01-29 2023-01-29 Network taxi sharing rate prediction method based on built environment and travel characteristics

Publications (1)

Publication Number Publication Date
CN116091115A true CN116091115A (en) 2023-05-09

Family

ID=86208099

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310086926.6A Pending CN116091115A (en) 2023-01-29 2023-01-29 Network taxi sharing rate prediction method based on built environment and travel characteristics

Country Status (1)

Country Link
CN (1) CN116091115A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104766146A (en) * 2015-04-24 2015-07-08 陆化普 Traffic demand forecasting method and system
CN108734361A (en) * 2017-04-18 2018-11-02 北京嘀嘀无限科技发展有限公司 Share-car order processing method and apparatus
CN111860929A (en) * 2020-03-18 2020-10-30 北京嘀嘀无限科技发展有限公司 Car-sharing order-form-piecing-rate estimation method and system
CN112150207A (en) * 2020-09-30 2020-12-29 武汉大学 Online taxi appointment order demand prediction method based on space-time context attention network
CN114254250A (en) * 2021-12-14 2022-03-29 北京航空航天大学 Network taxi appointment travel demand prediction method considering space-time non-stationarity

Similar Documents

Publication Publication Date Title
Dean et al. Spatial variation in shared ride-hail trip demand and factors contributing to sharing: Lessons from Chicago
Chiu Chuen et al. Mode choice between private and public transport in Klang Valley, Malaysia
CN109299438B (en) Public transport facility supply level evaluation method based on network appointment data
CN106844624A (en) A kind of visual public transport big data analysis system
CN104025075A (en) Method and system for fleet navigation, dispatching and multi-vehicle, multi-destination routing
Llorca et al. The usage of location based big data and trip planning services for the estimation of a long-distance travel demand model. Predicting the impacts of a new high speed rail corridor
CN112579718B (en) Urban land function identification method and device and terminal equipment
CN110472999B (en) Passenger flow mode analysis method and device based on subway and shared bicycle data
CN115062873B (en) Traffic travel mode prediction method and device, storage medium and electronic device
Gaduh et al. Life in the slow lane: Unintended consequences of public transit in Jakarta
CN104024801A (en) Method and system for navigation using bounded geograhic regions
Seya et al. Decisions on truck parking place and time on expressways: An analysis using digital tachograph data
Deng et al. Heterogenous Trip Distance‐Based Route Choice Behavior Analysis Using Real‐World Large‐Scale Taxi Trajectory Data
Liu et al. Exploring the spatially heterogeneous effect of the built environment on ride-hailing travel demand: A geographically weighted quantile regression model
CN112488419A (en) Passenger flow distribution prediction method, device, equipment and storage medium based on OD analysis
Wang et al. The role of urban form in the performance of shared automated vehicles
CN111914940A (en) Shared vehicle station clustering method, system, device and storage medium
CN110222884B (en) Station reachability evaluation method based on POI data and passenger flow volume
CN117196197A (en) Public transportation site layout optimization method
CN108711286B (en) Traffic distribution method and system based on multi-source Internet of vehicles and mobile phone signaling
Tian et al. Identifying residential and workplace locations from transit smart card data
Roxas Jr et al. Estimating the environmental effects of the car shifting behavior along EDSA
CN116091115A (en) Network taxi sharing rate prediction method based on built environment and travel characteristics
Crawford et al. Analysing spatial intrapersonal variability of road users using point-to-point sensor data
Link et al. Combining GPS tracking and surveys for a mode choice model: Processing data from a quasi-natural experiment in Germany

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination