CN113344268B - Urban traffic trip data analysis method - Google Patents


Info

Publication number
CN113344268B
Authority
CN
China
Prior art keywords
data
station
time
travel
traffic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110616013.1A
Other languages
Chinese (zh)
Other versions
CN113344268A (en
Inventor
刘俊伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Tairui Shuchuang Technology Co ltd
Original Assignee
Hefei Tairui Shuchuang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Tairui Shuchuang Technology Co ltd filed Critical Hefei Tairui Shuchuang Technology Co ltd
Priority to CN202110616013.1A priority Critical patent/CN113344268B/en
Publication of CN113344268A publication Critical patent/CN113344268A/en
Application granted granted Critical
Publication of CN113344268B publication Critical patent/CN113344268B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00: Administration; Management
    • G06Q 10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/29: Geographical information databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00: Administration; Management
    • G06Q 10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06Q 10/047: Optimisation of routes or paths, e.g. travelling salesman problem
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00: Commerce
    • G06Q 30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q 30/0201: Market modelling; Market analysis; Collecting market data

Abstract

The invention relates to the technical field of traffic travel data analysis, and in particular to an urban traffic travel data analysis method comprising the following steps: S1, acquiring multi-source travel data; S2, replacing defective raw data with high-quality data through data cleaning and preprocessing; S3, mining total traffic demand and traffic operating conditions from the processed data; S4, obtaining travel trajectories by data fusion and mining the underlying patterns in the data in depth; S5, establishing a service-coverage-oriented space optimization system and obtaining a suggested scheme for optimizing the traffic route layout. The method can effectively complete missing data, improves the accuracy and usability of the data, and lays a foundation for subsequent data fusion; the service-coverage-oriented space optimization system also helps users obtain a suggested scheme for optimizing the traffic route layout.

Description

Urban traffic trip data analysis method
Technical Field
The invention relates to the technical field of traffic travel data analysis, in particular to an urban traffic travel data analysis method.
Background
As China's level of economic development continues to rise, the number of motor vehicles has grown rapidly and the way people travel has changed profoundly. Residents' increasingly diverse travel needs place new demands on urban traffic systems; however, because of unreasonable urban road design and route network planning, inadequate public transport facilities and congested lines have become the primary problems troubling residents' travel in large, medium and small cities in China. Giving priority to the development of urban public transport, improving the service level of public transport travel, and guiding more residents to choose public transport is the fundamental way to cure the chronic problem of traffic congestion.
With the popularization of public transport IC cards and the continuous development of urban intelligent public transport system platforms, massive amounts of IC card swiping data and vehicle operation GPS data are generated, and abundant passenger travel information is hidden in these data. In addition, the ongoing development of technologies such as cloud computing, big data and data mining makes it possible to mine passenger travel characteristics and passenger flow patterns in multiple dimensions and in depth from real, massive, multi-source public transport data; compared with traditional traffic survey questionnaires, such analysis is more comprehensive, realistic, accurate and in-depth.
Patent No. 201711120069.8 discloses a method, device, terminal and computer-readable medium for processing travel survey data, wherein the method comprises: acquiring multi-source traffic travel survey data; analysing the multi-source traffic travel survey data to obtain travel survey information for a travel group in a specified area; generating an origin-destination (OD) matrix based on the travel information; and analysing the travel characteristics of the travel groups of the traffic analysis cells according to the origin-destination matrix, and generating and outputting travel characteristic data for the travel groups in the traffic analysis cells. Patent No. 202011119990.2 discloses an urban traffic travel data analysis method based on federated learning, which comprises: each party holding urban traffic travel data provides passenger data and completes its data processing according to a pre-agreed data specification; each party trains and builds its own local model through machine learning or deep learning and obtains sample data based on that model; and the sample data are input into a federated learning platform for joint modelling. However, these prior-art methods still have problems in practical use. For example, urban traffic positioning can fail because surrounding structures block the signal, and data transmission also has a certain packet loss rate; when the GPS data for one station are lost, the many passengers who boarded at that station cannot be matched to a station, and such positioning-failure data currently cannot be repaired. Moreover, passenger travel characteristics and passenger flow patterns are not mined in depth, and the service level of public transport travel urgently needs improvement.
Disclosure of Invention
Aiming at the problems raised in the background, the invention provides an urban traffic travel data analysis method.
The invention is realized by the following technical scheme:
An urban traffic travel data analysis method comprises the following steps:
S1, acquiring multi-source traffic travel data, wherein the data comprise public transport IC card data, subway card swiping data, taxi operation data, vehicle-mounted GPS data, mobile phone signaling data, road monitoring data, vehicle identification data and traffic payment data;
S2, because data acquisition and transmission equipment is unstable and the surrounding environment is complex, various errors can occur during data acquisition and transmission, so the defective raw data need to be replaced with high-quality data through data cleaning and preprocessing;
S3, mining total traffic demand and traffic operating conditions from the processed data: extracting the boarding station, the route taken and the alighting station from each travel record, determining the actual travel route, transfer stations and final station, and recovering the travel chain record of each individual passenger;
S4, obtaining travel trajectories by data fusion, mining the underlying patterns in the data in depth, and calculating and predicting latent information from the intrinsic correlations between the data;
S5, for areas with a marked gap between traffic service supply and demand, establishing a service-coverage-oriented space optimization system from the perspective of station and line expansion, and obtaining a suggested scheme for optimizing the traffic line layout.
As a further improvement to the above scheme, the types of data error in step S2 include abnormal data, data loss and data overlap. Abnormal data are records in which an attribute value has an incorrect format or lies well outside its normal range, for example a timestamp outside the normal time range, so that the record clearly does not match reality; data loss means that one or more attribute values are missing from the card data, operation data, GPS data, mobile phone signaling data, monitoring data, vehicle identification data or traffic payment data; data overlap means that an attribute value appears several times within a short period, which interferes with subsequent data fusion.
As a further improvement to the above scheme, abnormal data are handled by comparing the format of each record with that of normal data, setting the normal value range, deleting any record identified as abnormal, and recording the error content and error type; data loss is handled by locating the attribute and position of the missing value and completing it by correlation with other data for that attribute; data overlap is handled by setting a minimum time interval threshold: if the same attribute value recurs within less than the threshold, the occurrences are treated as duplicates, one is retained, and the redundant repeats are deleted.
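The abnormal-data and data-overlap rules above can be illustrated with a minimal sketch. The record keys (`card_id`, `time`, `fare`), the 30-second gap and the fare range are hypothetical values chosen for illustration; the patent does not specify them.

```python
from datetime import datetime, timedelta


def clean_records(records, min_gap=timedelta(seconds=30), fare_range=(0.0, 50.0)):
    """Drop out-of-range records and collapse near-duplicate swipes.

    records: list of dicts with the hypothetical keys 'card_id', 'time', 'fare'.
    Returns (kept_records, error_log).
    """
    kept, error_log = [], []
    last_kept = {}  # card_id -> time of the last swipe that was kept
    low, high = fare_range
    for rec in sorted(records, key=lambda r: r["time"]):
        # Abnormal data: value outside the configured normal range.
        if not (low <= rec["fare"] <= high):
            error_log.append({"record": rec, "error_type": "abnormal"})
            continue
        # Data overlap: same card seen again within the minimum time interval.
        prev = last_kept.get(rec["card_id"])
        if prev is not None and rec["time"] - prev < min_gap:
            error_log.append({"record": rec, "error_type": "overlap"})
            continue
        last_kept[rec["card_id"]] = rec["time"]
        kept.append(rec)
    return kept, error_log
```

Deleted records are written to `error_log` with an error type, mirroring the requirement to record the error content and error type rather than discard it silently.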
As a further improvement of the above scheme, the data loss includes subway vehicle-mounted GPS data loss, and the concrete completion steps for the subway vehicle-mounted GPS data loss are as follows:
a. first, for each shift, sorting the station-reporting data recorded by each subway train's vehicle-mounted GPS according to the station order of the line on which it operates, to obtain the sequence {A1, A2, A3, A4, …, Ak};
b. comparing the sequence with the subway scheduling information; if the sequence is complete, no repair is needed; if it is incomplete, finding the region where station-reporting information is missing and recording it as {Am, Am+1, Am+2, …, An};
c. finding all card swiping data for the current train, setting a threshold on the interval between consecutive card swipes, determining the start time and end time of each group of swipes, and fusing these with data on road segment length, traffic volume and vehicle acceleration;
d. converting the subway vehicle-mounted GPS data loss problem into a problem of predicting station-to-station travel time, and using the data to calculate the travel time probability density function for each missing station as follows:
$$f(t) = \frac{1}{t\,\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(\ln t - \mu)^2}{2\sigma^2}\right), \quad t > 0$$
where t is the road segment travel time, μ is the location parameter and represents the expected value of ln t, the logarithm of the road segment travel time, and σ is the scale parameter and represents the standard deviation of ln t; when μ is much larger than σ, the travel time probability distribution can be regarded as a normal distribution, and the expectation and standard deviation of the road segment travel time are given by the following formulas:
$$E(t) = \exp\!\left(\mu + \frac{\sigma^2}{2}\right)$$
$$SD(t) = \exp\!\left(\mu + \frac{\sigma^2}{2}\right)\sqrt{\exp(\sigma^2) - 1}$$
E(t) and SD(t) denote the expectation and standard deviation of the road segment travel time, respectively, and the corresponding cumulative distribution function of the road segment travel time is:
$$F(t) = \frac{1}{2} + \frac{1}{2}\,\operatorname{erf}\!\left(\frac{\ln t - \mu}{\sigma\sqrt{2}}\right)$$
e. calculating the actual travel time from station A1 to station Am from the card swiping data, calculating the actual travel time from station An to station Ak from the card swiping data, and substituting these travel times into the probability density function of step d; the stations at which the two resulting values are largest are inferred to be Am and An, respectively.
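One possible reading of steps d and e is sketched below: each candidate station has its own (μ, σ) for the logarithm of the travel time from a reference station, and the candidate whose log-normal density is largest at the observed travel time is selected. The log-normal form is inferred from the definitions of μ and σ above; the station names and parameter values are hypothetical.

```python
import math


def lognormal_pdf(t, mu, sigma):
    """Density of a log-normal travel time t with parameters mu and sigma of ln t."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))


def most_likely_station(observed_time, candidates):
    """Pick the candidate station with the highest density at observed_time.

    candidates: dict mapping station name -> (mu, sigma), where mu and sigma
    describe ln(travel time) from the reference station (assumed inputs).
    """
    return max(candidates, key=lambda s: lognormal_pdf(observed_time, *candidates[s]))
```

For example, with candidates whose median travel times are 300 s, 480 s and 660 s, an observed travel time of 470 s selects the 480 s candidate.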
As a further improvement to the above scheme, the data fusion in step S4 includes matching subway stations, matching bus stations, and creating individual travel chain records.
As a further improvement to the above scheme, the subway station matching proceeds as follows: first, all subway card swiping data are partitioned by day in the system database, where one day's card swiping time is defined as 00:00 to 24:00; the subway card swiping data comprise the swipe time, station and card number of each travel record; the smart card data with the same logical card number are sorted by time from earliest to latest; for passengers travelling by subway, each complete subway trip comprises two swipe records, one on entering and one on leaving a station, so each passenger's swipe records are paired two by two in time order; within each pair, the record with a settlement amount is the inbound record and the record without a settlement amount is the outbound record, which yields each passenger's daily subway station matching data.
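The pairing step can be sketched as follows. This is a simplified illustration that pairs records purely by time order within one day and does not use the settlement amount; the tuple layout `(card_id, time, station)` and the `HH:MM` time strings are assumptions.

```python
from collections import defaultdict


def pair_subway_swipes(swipes):
    """Pair each card's swipes two by two in time order.

    swipes: iterable of (card_id, time, station) for a single day, where
    time is a zero-padded 'HH:MM' string so that string order is time order.
    Returns a list of trips; a trailing unpaired swipe is ignored.
    """
    by_card = defaultdict(list)
    for card_id, time, station in swipes:
        by_card[card_id].append((time, station))

    trips = []
    for card_id, recs in by_card.items():
        recs.sort()  # earliest swipe first
        for i in range(0, len(recs) - 1, 2):
            (t_in, s_in), (t_out, s_out) = recs[i], recs[i + 1]
            trips.append({
                "card_id": card_id,
                "entry_station": s_in, "entry_time": t_in,
                "exit_station": s_out, "exit_time": t_out,
            })
    return trips
```

An odd number of swipes for a card indicates a missing entry or exit record, which would normally be handled by the data-loss completion step rather than dropped.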
As a further improvement to the above scheme, the bus stop matching proceeds as follows: first, the IC card data and GPS data records are acquired, and a first match is made on the vehicle number attribute recorded in both, to find the bus card swiping data and GPS data belonging to the same vehicle on the same shift; since a bus IC card swipe falls between the GPS arrival time and departure time at a stop, a second match on time is made according to this principle, giving the boarding stop and the up or down direction for each IC card record; for the first stop of a line, where the swipe time may precede the recorded arrival time, the driving direction of the line is identified automatically and the information for the line's first stop is used.
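The two-stage match (vehicle number first, then time window) might look like the sketch below. The field names and the use of seconds since midnight are assumptions made for illustration.

```python
def match_boarding_stop(swipe, stop_visits):
    """Match one IC card swipe to a boarding stop.

    swipe: dict with hypothetical keys 'vehicle' and 'time' (seconds since midnight).
    stop_visits: list of dicts with keys 'vehicle', 'stop', 'arrive', 'depart',
    'direction', derived from the GPS records of one shift.
    Returns (stop, direction), or None when no stop can be matched.
    """
    # First match: same vehicle number.
    visits = sorted(
        (v for v in stop_visits if v["vehicle"] == swipe["vehicle"]),
        key=lambda v: v["arrive"],
    )
    # Second match: swipe time between arrival and departure at a stop.
    for v in visits:
        if v["arrive"] <= swipe["time"] <= v["depart"]:
            return v["stop"], v["direction"]
    # First stop of the line: the swipe may precede the recorded arrival.
    if visits and swipe["time"] < visits[0]["arrive"]:
        return visits[0]["stop"], visits[0]["direction"]
    return None
```

Swipes that fall between two stop windows return `None` here; a production system would likely assign them to the nearest stop or flag them for review.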
As a further improvement to the above scheme, the individual travel chain record is established as follows: once the single-trip information of each passenger has been acquired, where the travel modes include walking, taxi, subway and bus, the trajectory of each single trip can be obtained; a trip may involve several changes of mode, so the legs of the trip are linked into a chain structure in time order; the chain describes the passenger's whole journey from the starting point to the final destination and contains time, location and travel mode information.
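A travel chain of this kind can be represented as an ordered list of legs. The sketch below is illustrative only; the `Leg` fields and the summary keys are hypothetical names, and times are given as plain integers.

```python
from dataclasses import dataclass


@dataclass
class Leg:
    mode: str          # e.g. "walk", "taxi", "subway", "bus"
    start_time: int    # any comparable time value, e.g. minutes since midnight
    end_time: int
    origin: str
    destination: str


def build_trip_chain(legs):
    """Link single-mode legs into one chain ordered by start time."""
    if not legs:
        raise ValueError("a trip chain needs at least one leg")
    ordered = sorted(legs, key=lambda leg: leg.start_time)
    return {
        "origin": ordered[0].origin,
        "destination": ordered[-1].destination,
        "depart": ordered[0].start_time,
        "arrive": ordered[-1].end_time,
        "modes": [leg.mode for leg in ordered],
        "transfers": len(ordered) - 1,
        "legs": ordered,
    }
```

The chain keeps every leg, so time, location and mode information remain available for later analysis.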
As a further improvement to the above solution, the establishment process of the space optimization system in step S5 is as follows:
a. first, preprocessing the starting point and end point entered by the passenger; when the route between them crosses urban areas, suggesting that the passenger travel by high-speed rail, and automatically planning a route from the starting point to the high-speed rail station and a route from the destination high-speed rail station to the end point entered by the passenger;
b. rasterizing the GPS electronic map and marking the geographic positions of the entered starting point and end point on the map in real time, the geometric center of the corresponding grid cell being taken as the marked position;
c. searching for subway stations and bus stops within a circular 1 km area around each marked position; if any exist, planning the optimal walking route from the marked position to the station; if none exist, recommending that the passenger take a taxi;
d. after the walking route is planned, acquiring the multi-source traffic travel data and using big data computation to obtain several planned routes, recorded as L1, L2, L3, …, Ln, each of which comprises one or more of the travel modes walking, driving, subway and bus;
e. finally, selecting the shortest route, the fastest route and the least congested route.
As a further improvement of the above scheme, in step e the total travel cost, including taxi, subway and bus fares, can be calculated for the shortest route, the fastest route and the least congested route.
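Step e and the cost calculation can be sketched as a simple selection over candidate routes. The route fields (`distance_km`, `minutes`, `congestion`, `fares`) and the congestion index are hypothetical; the patent does not define how congestion is scored.

```python
def screen_routes(routes):
    """Select the shortest, fastest and least congested of the planned routes."""
    return {
        "shortest_distance": min(routes, key=lambda r: r["distance_km"]),
        "shortest_time": min(routes, key=lambda r: r["minutes"]),
        "least_congestion": min(routes, key=lambda r: r["congestion"]),
    }


def total_cost(route):
    """Sum taxi, subway and bus fares for one route (missing modes cost 0)."""
    return sum(route["fares"].get(mode, 0.0) for mode in ("taxi", "subway", "bus"))
```

The same route may win more than one criterion, in which case the passenger is effectively offered fewer than three distinct options.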
Compared with the prior art, the invention has the beneficial effects that:
1. In the invention, the problem of missing subway vehicle-mounted GPS data is converted into a problem of predicting station-to-station travel time: the travel time probability density function for each missing station is calculated, the actual travel times from station A1 to station Am and from station An to station Ak are calculated from the card swiping data and substituted into the probability density function of step d, and the stations at which the two resulting values are largest are inferred to be Am and An; the station information between Am and An can then be obtained by combining the vehicle's data. Data completion is thus carried out effectively, which improves the accuracy and usability of the data and lays a foundation for subsequent data fusion.
2. The service-coverage-oriented space optimization system established in the invention helps users obtain a suggested scheme for optimizing the traffic route layout; it helps public transport companies and government transport departments grasp, in a timely manner, the patterns and spatio-temporal characteristics of urban bus commuter flows, identify problems in current traffic operations, and obtain decision support for network adjustment, road network planning, infrastructure construction, customized bus route planning and the like; it is of great significance for raising the service level of urban traffic travel, increasing public transport's share of passenger flow, and thereby relieving peak-hour congestion.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of an urban transportation trip data analysis method according to the present invention;
FIG. 2 is a flow chart of a concrete completion method when subway vehicle-mounted GPS data is lost in the invention;
FIG. 3 is a flow chart of subway station matching in the present invention;
FIG. 4 is a flow chart of the spatial optimization system of the present invention.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The technical scheme of the invention is further explained by combining the attached drawings.
An urban traffic travel data analysis method, as shown in FIG. 1, comprises the following steps:
S1, acquiring multi-source traffic travel data, wherein the data comprise public transport IC card data, subway card swiping data, taxi operation data, vehicle-mounted GPS data, mobile phone signaling data, road monitoring data, vehicle identification data and traffic payment data;
S2, because data acquisition and transmission equipment is unstable and the surrounding environment is complex, various errors can occur during data acquisition and transmission, so the defective raw data need to be replaced with high-quality data through data cleaning and preprocessing; the types of data error include abnormal data, data loss and data overlap, wherein abnormal data are records in which an attribute value has an incorrect format or lies well outside its normal range, for example a timestamp outside the normal time range, so that the record clearly does not match reality; data loss means that one or more attribute values are missing from the card data, operation data, GPS data, mobile phone signaling data, monitoring data, vehicle identification data or traffic payment data; data overlap means that an attribute value appears several times within a short period, which interferes with subsequent data fusion;
S3, mining total traffic demand and traffic operating conditions from the processed data: extracting the boarding station, the route taken and the alighting station from each travel record, determining the actual travel route, transfer stations and final station, and recovering the travel chain record of each individual passenger;
S4, obtaining travel trajectories by data fusion, mining the underlying patterns in the data in depth, and calculating and predicting latent information from the intrinsic correlations between the data;
S5, for areas with a marked gap between traffic service supply and demand, establishing a service-coverage-oriented space optimization system from the perspective of station and line expansion, and obtaining a suggested scheme for optimizing the traffic line layout.
Abnormal data are handled by comparing the format of each record with that of normal data, setting the normal value range, deleting any record identified as abnormal, and recording the error content and error type; data loss is handled by locating the attribute and position of the missing value and completing it by correlation with other data for that attribute; data overlap is handled by setting a minimum time interval threshold: if the same attribute value recurs within less than the threshold, the occurrences are treated as duplicates, one is retained, and the redundant repeats are deleted.
As shown in fig. 2, the data loss includes subway vehicle-mounted GPS data loss, and the concrete completion steps for the subway vehicle-mounted GPS data loss are as follows:
a. first, for each shift, sorting the station-reporting data recorded by each subway train's vehicle-mounted GPS according to the station order of the line on which it operates, to obtain the sequence {A1, A2, A3, A4, …, Ak};
b. comparing the sequence with the subway scheduling information; if the sequence is complete, no repair is needed; if it is incomplete, finding the region where station-reporting information is missing and recording it as {Am, Am+1, Am+2, …, An};
c. finding all card swiping data for the current train, setting a threshold on the interval between consecutive card swipes, determining the start time and end time of each group of swipes, and fusing these with data on road segment length, traffic volume and vehicle acceleration;
d. converting the subway vehicle-mounted GPS data loss problem into a problem of predicting station-to-station travel time, and using the data to calculate the travel time probability density function for each missing station as follows:
$$f(t) = \frac{1}{t\,\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(\ln t - \mu)^2}{2\sigma^2}\right), \quad t > 0$$
where t is the road segment travel time, μ is the location parameter and represents the expected value of ln t, the logarithm of the road segment travel time, and σ is the scale parameter and represents the standard deviation of ln t; when μ is much larger than σ, the travel time probability distribution can be regarded as a normal distribution, and the expectation and standard deviation of the road segment travel time are given by the following formulas:
$$E(t) = \exp\!\left(\mu + \frac{\sigma^2}{2}\right)$$
$$SD(t) = \exp\!\left(\mu + \frac{\sigma^2}{2}\right)\sqrt{\exp(\sigma^2) - 1}$$
E(t) and SD(t) denote the expectation and standard deviation of the road segment travel time, respectively, and the corresponding cumulative distribution function of the road segment travel time is:
$$F(t) = \frac{1}{2} + \frac{1}{2}\,\operatorname{erf}\!\left(\frac{\ln t - \mu}{\sigma\sqrt{2}}\right)$$
e. calculating the actual travel time from station A1 to station Am from the card swiping data, calculating the actual travel time from station An to station Ak from the card swiping data, and substituting these travel times into the probability density function of step d; the stations at which the two resulting values are largest are inferred to be Am and An, respectively, and the station information between Am and An is then obtained by combining the vehicle's data.
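Step b, locating the missing region {Am, …, An}, amounts to comparing the reported station sequence with the scheduled one. A minimal sketch follows, assuming station names are unique within a line and that both inputs are plain lists of names.

```python
def find_missing_runs(scheduled, reported):
    """Return each run of consecutive scheduled stations absent from the reports.

    scheduled: station names in line order, from the scheduling information.
    reported: station names actually recorded by the vehicle-mounted GPS.
    """
    seen = set(reported)
    runs, current = [], []
    for station in scheduled:
        if station in seen:
            if current:          # a gap has just ended
                runs.append(current)
                current = []
        else:
            current.append(station)
    if current:                  # gap reaching the end of the line
        runs.append(current)
    return runs
```

An empty result means the station-reporting data are complete and need no repair; each returned run corresponds to one region {Am, …, An} to be completed.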
As shown in FIG. 1 and FIG. 3, the data fusion in step S4 includes subway station matching, bus stop matching and the establishment of individual travel chain records.
The subway station matching proceeds as follows: first, all subway card swiping data are partitioned by day in the system database, where one day's card swiping time is defined as 00:00 to 24:00; the subway card swiping data comprise the swipe time, station and card number of each travel record; the smart card data with the same logical card number are sorted by time from earliest to latest; for passengers travelling by subway, each complete subway trip comprises two swipe records, one on entering and one on leaving a station, so each passenger's swipe records are paired two by two in time order; within each pair, the record with a settlement amount is the inbound record and the record without a settlement amount is the outbound record, which yields each passenger's daily subway station matching data.
The bus stop matching proceeds as follows: first, the IC card data and GPS data records are acquired, and a first match is made on the vehicle number attribute recorded in both, to find the bus card swiping data and GPS data belonging to the same vehicle on the same shift; since a bus IC card swipe falls between the GPS arrival time and departure time at a stop, a second match on time is made according to this principle, giving the boarding stop and the up or down direction for each IC card record; for the first stop of a line, where the swipe time may precede the recorded arrival time, the driving direction of the line is identified automatically and the information for the line's first stop is used.
The individual travel chain record is established as follows: once the single-trip information of each passenger has been acquired, where the travel modes include walking, taxi, subway and bus, the trajectory of each single trip can be obtained; a trip may involve several changes of mode, so the legs of the trip are linked into a chain structure in time order; the chain describes the passenger's whole journey from the starting point to the final destination and contains time, location and travel mode information.
As shown in fig. 1 and 4, the process of establishing the space optimization system in step S5 is as follows:
a. first, the origin and destination information entered by the passenger is preprocessed; when the route from the origin to the destination crosses urban areas, the passenger is advised to travel by high-speed rail, and a route from the origin to the departure high-speed rail station and a route from the arrival high-speed rail station to the destination entered by the passenger are planned automatically;
b. the GPS electronic map is rasterized, and the geographic positions of the origin and destination entered by the passenger are marked on the electronic map in real time, the geometric center point of the corresponding grid cell being the marked position;
c. the circular 1 km area around each of the marked origin and destination positions is then searched for a subway station or a bus station; if one exists, an optimal walking route from the marked position to the station is planned, and if not, the passenger is advised to travel by taxi;
d. after the walking route is planned, the multi-source transportation travel data are obtained, and a plurality of planned routes are computed by means of a big data calculation method and recorded as L1, L2, L3, …, Ln, each planned route comprising one or more of the travel modes walking, taxi, subway and bus;
e. finally, the shortest-distance route, the shortest-time route and the least-congested route are screened out.
In step e, the total travel cost is calculated for the shortest-distance route, the shortest-time route and the least-congested route, wherein the total travel cost comprises taxi cost, subway cost and bus cost.
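The screening in step e and the cost calculation can be sketched as below. The `Route` fields, the function name `screen_routes`, and the use of a congestion delay in minutes as the congestion measure are all assumptions made for this illustration; the text does not specify how congestion is quantified.

```python
from dataclasses import dataclass


@dataclass
class Route:
    """One planned route L1..Ln (all field names are hypothetical)."""
    name: str
    distance_km: float
    duration_min: float
    congestion_delay_min: float  # assumed congestion measure
    taxi_fare: float = 0.0
    subway_fare: float = 0.0
    bus_fare: float = 0.0

    @property
    def total_cost(self):
        # Total travel cost = taxi cost + subway cost + bus cost.
        return self.taxi_fare + self.subway_fare + self.bus_fare


def screen_routes(routes):
    """Return the shortest, fastest and least-congested routes with their cost."""
    if not routes:
        raise ValueError("at least one planned route is required")
    picks = {
        "shortest_distance": min(routes, key=lambda r: r.distance_km),
        "shortest_time": min(routes, key=lambda r: r.duration_min),
        "least_congestion": min(routes, key=lambda r: r.congestion_delay_min),
    }
    return {label: (r.name, r.total_cost) for label, r in picks.items()}
```

The same route may win more than one criterion, in which case it simply appears under several labels.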
When positioning fails because urban traffic is affected by surrounding signal shielding, or when a certain packet loss rate exists during data transmission, the method completes the missing data effectively, which improves the accuracy and usability of the data and lays a foundation for the subsequent data fusion. Meanwhile, a service-coverage-oriented space optimization system is established. It helps users obtain a suggested scheme for optimizing the layout of traffic routes, and helps public transport companies and government transportation departments to grasp in a timely manner the patterns and spatio-temporal characteristics of urban bus commuter passenger flow and to find problems in the current traffic operation. It also provides decision support for traffic line adjustment, road network planning, infrastructure construction, customized bus route planning and the like, and is of significance for improving the urban travel service level, increasing the share of passenger flow carried by public transport, and relieving peak congestion.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (7)

1. A method for analyzing urban traffic travel data is characterized by comprising the following steps:
S1, acquiring multi-source transportation travel data, wherein the data comprise public transportation IC card data, subway card swiping data, taxi operation data, vehicle-mounted GPS data, mobile phone signaling data, road monitoring data, vehicle identification data and traffic payment data;
S2, because of the instability of the data acquisition and transmission equipment and the complexity of the surrounding environment, various errors occur during data acquisition and transmission, and the defective original data are replaced by high-quality data through data cleaning and preprocessing;
S3, mining the total traffic demand and the various traffic operating conditions from the processed data, extracting the boarding station, the route taken and the alighting station from each travel record, determining the actual travel route, the transfer stations and the final station, and recovering the trip chain records of individual passengers;
S4, obtaining the travel track by using a data fusion technology, mining the internal mechanism of the data in depth, and calculating and predicting potential information from the internal correlations among the data;
S5, for areas with a marked gap between the supply of and demand for traffic services, establishing a service-coverage-oriented space optimization system from the perspective of station and line expansion, and obtaining a suggested scheme for optimizing the layout of traffic lines;
the types of error in step S2 include loss of subway vehicle-mounted GPS data, and the specific steps for completing lost subway vehicle-mounted GPS data are as follows:
a. first, for each shift, sorting the station reporting data recorded by the vehicle-mounted GPS of each subway train according to the station order of the specific line in operation, to obtain a sequence: {A1, A2, A3, A4, …, Ak};
b. comparing the sequence with the subway scheduling information; if the information is complete, no repair is needed, and if it is incomplete, finding the region in which station reporting information is missing and recording it as {Am, Am+1, Am+2, …, An};
c. finding all card swiping data of the current shift, setting a threshold on the time interval between consecutive card swipes, counting the start time and end time of each group of card swipes, and fusing data on road segment length, traffic volume and vehicle acceleration;
d. then converting the subway vehicle-mounted GPS data problem into a station-to-station travel time prediction problem, and using these data to calculate the probability density function of the travel time of the subway train to each missing station as follows:
$$f(t) = \frac{1}{t\,\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(\ln t - \mu)^2}{2\sigma^2}\right), \quad t > 0$$
wherein t is the road section travel time, μ is a position parameter representing the expected value of the logarithm ln t of the road section travel time, and σ is a scale parameter representing the standard deviation of ln t; when μ is far larger than σ, the travel time probability distribution can be regarded as a normal distribution, and the expectation and standard deviation of the road section travel time are obtained by the following formulas:
$$E(t) = \exp\!\left(\mu + \frac{\sigma^2}{2}\right)$$
$$SD(t) = \sqrt{\left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2}}$$
E(t) and SD(t) respectively represent the expectation and standard deviation of the road section travel time, and the cumulative distribution function corresponding to the road section travel time is as follows:
$$F(t) = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{\ln t - \mu}{\sigma\sqrt{2}}\right)\right]$$
e. using the card swiping data to calculate the actual travel time from station A1 to station Am and the actual travel time from station An to station Ak, and substituting these travel times into the probability density function of step d; the two stations at which the function takes its maximum values are inferred to be the stations corresponding to Am and An respectively;
the process of establishing the space optimization system in step S5 is as follows:
a. first, the origin and destination information entered by the passenger is preprocessed; when the route from the origin to the destination crosses urban areas, the passenger is advised to travel by high-speed rail, and a route from the origin to the departure high-speed rail station and a route from the arrival high-speed rail station to the destination entered by the passenger are planned automatically;
b. the GPS electronic map is rasterized, and the geographic positions of the origin and destination entered by the passenger are marked on the electronic map in real time, the geometric center point of the corresponding grid cell being the marked position;
c. the circular 1 km area around each of the marked origin and destination positions is then searched for a subway station or a bus station; if one exists, an optimal walking route from the marked position to the station is planned, and if not, the passenger is advised to travel by taxi;
d. after the walking route is planned, the multi-source transportation travel data are obtained, and a plurality of planned routes are computed by means of a big data calculation method and recorded as L1, L2, L3, …, Ln, each planned route comprising one or more of the travel modes walking, taxi, subway and bus;
e. finally, the shortest-distance route, the shortest-time route and the least-congested route are screened out.
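As an illustration of completion steps d and e in claim 1, the sketch below assumes that the travel time density is the standard log-normal density implied by the definitions of μ and σ (the mean and standard deviation of ln t). The function names, the candidate-station dictionary and the parameter values in the usage note are hypothetical; in practice μ and σ would have to be estimated for each candidate station from the fused data.

```python
import math


def lognormal_pdf(t, mu, sigma):
    """Log-normal density of the travel time t (zero for t <= 0)."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))


def lognormal_cdf(t, mu, sigma):
    """Cumulative probability that the travel time does not exceed t."""
    if t <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(t) - mu) / (sigma * math.sqrt(2.0))))


def lognormal_mean_sd(mu, sigma):
    """Expectation E(t) and standard deviation SD(t) of the travel time."""
    mean = math.exp(mu + sigma ** 2 / 2.0)
    sd = math.sqrt((math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2))
    return mean, sd


def most_likely_station(observed_time, candidates):
    """Pick the candidate station whose density is largest at observed_time.

    `candidates` maps a station name to its (mu, sigma) pair; this mapping
    is an assumption of the sketch, not part of the source text.
    """
    return max(candidates, key=lambda s: lognormal_pdf(observed_time, *candidates[s]))
```

For example, with hypothetical candidates `{"A3": (math.log(300), 0.2), "A4": (math.log(480), 0.2)}` and an observed travel time of 470 seconds, `most_likely_station` selects `"A4"`, because 470 lies much closer to the median of that station's distribution.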
2. The urban transportation travel data analysis method according to claim 1, characterized in that: the types of data error in step S2 include abnormal data, data loss and data overlap, where abnormal data refers to data in which some attribute values have an incorrect format or significantly exceed the normal value range and which clearly do not conform to the actual situation, such as values outside the normal time range; data loss refers to one or more attribute values being missing in the card data, operation data, GPS data, mobile phone signaling data, monitoring data, vehicle identification data or traffic payment data; data overlap means that some attribute values appear multiple times within a short time and interfere with the subsequent data fusion.
3. The urban transportation travel data analysis method according to claim 2, characterized in that: the processing method for abnormal data comprises: comparing the format of the abnormal data with that of normal data, setting the range of normal data, deleting abnormal data when they appear, and recording the error content and the error type; the processing method for data loss comprises: locating the attribute and the specific position of the lost data, and completing it by correlation with other data of the same attribute; the processing method for data overlap comprises: setting a minimum time interval threshold, and, if the time interval between occurrences of the same attribute value is less than the threshold, determining the values to be repeats, retaining one attribute value and deleting the other redundant repeats.
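The abnormal-data and data-overlap handling of claim 3 can be sketched as follows. The record keys, the function names, the 30-second default threshold and the choice to compare each record against the last kept record are assumptions made for this illustration only.

```python
from datetime import datetime, timedelta


def remove_abnormal(records, field, low, high):
    """Split records into clean ones and a log of (record, error_type)."""
    clean, errors = [], []
    for rec in records:
        value = rec.get(field)
        if not isinstance(value, (int, float)):
            errors.append((rec, "bad_format"))    # wrong format or missing
        elif not low <= value <= high:
            errors.append((rec, "out_of_range"))  # outside the normal range
        else:
            clean.append(rec)
    return clean, errors


def drop_overlaps(records, min_gap=timedelta(seconds=30)):
    """Keep one record when the same (card_id, station) repeats within min_gap."""
    kept = []
    last_kept_time = {}
    for rec in sorted(records, key=lambda r: r["time"]):
        key = (rec["card_id"], rec["station"])
        previous = last_kept_time.get(key)
        if previous is not None and rec["time"] - previous < min_gap:
            continue  # redundant repeat of the same attribute value
        last_kept_time[key] = rec["time"]
        kept.append(rec)
    return kept
```

Returning the error log alongside the cleaned records mirrors the requirement to record the error content and error type rather than silently discarding data.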
4. The urban transportation travel data analysis method according to claim 1, characterized in that: the data fusion in step S4 comprises subway station matching, bus station matching and establishing individual trip chain records.
5. The urban transportation travel data analysis method according to claim 4, wherein: the specific process of bus station matching is as follows: first, IC card data and GPS data records are acquired, and a primary match is performed on the vehicle number attribute recorded in both, to find the bus card swiping data and GPS data of the same vehicle in the same shift; because the card swiping time of a bus IC card lies between the arrival time and the departure time recorded by GPS for a station, a secondary match on time is performed according to this principle to obtain the boarding station and the up or down direction for each IC card record; for the first station of a line, the card swiping time may be earlier than the arrival time, in which case the driving direction of the line is identified automatically and the information of the first station of the line is retrieved.
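The two-stage bus station matching of claim 5 can be sketched as below. The record keys (`vehicle_id`, `shift_id`, `arrive`, `depart`, `direction`) and the function name `match_boarding_stops` are hypothetical, and times are any mutually comparable values such as seconds since midnight. Taps that fall between two stops are left unmatched in this sketch.

```python
from collections import defaultdict


def match_boarding_stops(taps, gps_stops):
    """Attach a boarding stop and direction to each IC card tap."""
    # Primary match: group GPS stop records by vehicle number and shift.
    runs = defaultdict(list)
    for stop in gps_stops:
        runs[(stop["vehicle_id"], stop["shift_id"])].append(stop)
    for stops in runs.values():
        stops.sort(key=lambda s: s["arrive"])

    matched = []
    for tap in taps:
        stops = runs.get((tap["vehicle_id"], tap["shift_id"]), [])
        if stops and tap["time"] < stops[0]["arrive"]:
            # Taps before the first recorded arrival belong to the first stop.
            hit = stops[0]
        else:
            # Secondary match: tap time lies between arrival and departure.
            hit = next(
                (s for s in stops if s["arrive"] <= tap["time"] <= s["depart"]),
                None,
            )
        matched.append({
            **tap,
            "stop": hit["stop"] if hit else None,
            "direction": hit["direction"] if hit else None,
        })
    return matched
```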
6. The urban transportation travel data analysis method according to claim 4, wherein: the specific process of establishing the individual trip chain record is as follows: after the single-trip information of each passenger is acquired, where the trips comprise walking, taxi, subway and bus, the track of each single trip of the passenger is obtained, and one trip may require multiple transfers; the travel modes of the trip are then connected in a chain structure in time order, and the chain structure describes the process by which the passenger starts from the origin and finally arrives at the destination, and includes time, space and travel-mode information.
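A trip chain of the kind described in claim 6 can be represented as a time-ordered list of legs. The sketch below is one possible representation under assumed names (`Leg`, `build_trip_chain`) with times given as minutes since midnight; it is not taken from the source.

```python
from dataclasses import dataclass


@dataclass
class Leg:
    """One single-mode segment of a trip (all field names are hypothetical)."""
    mode: str          # "walk", "taxi", "subway" or "bus"
    origin: str
    destination: str
    start: int         # minutes since midnight
    end: int


def build_trip_chain(legs):
    """Link a passenger's legs in time order into one chain record."""
    if not legs:
        raise ValueError("a trip chain needs at least one leg")
    ordered = sorted(legs, key=lambda leg: leg.start)
    return {
        "origin": ordered[0].origin,             # where the trip starts
        "destination": ordered[-1].destination,  # where it finally ends
        "start": ordered[0].start,
        "end": ordered[-1].end,
        "modes": [leg.mode for leg in ordered],  # travel mode of each leg
        "transfers": len(ordered) - 1,           # changes between legs
        "legs": ordered,
    }
```

Keeping the ordered legs inside the chain record preserves the time, space and travel-mode information of every segment, while the summary fields describe the trip as a whole.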
7. The urban transportation travel data analysis method according to claim 1, characterized in that: in step e of the process of establishing the space optimization system, the total travel cost is calculated for the shortest-distance route, the shortest-time route and the least-congested route, wherein the total travel cost comprises taxi cost, subway cost and bus cost.
CN202110616013.1A 2021-06-02 2021-06-02 Urban traffic trip data analysis method Active CN113344268B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110616013.1A CN113344268B (en) 2021-06-02 2021-06-02 Urban traffic trip data analysis method

Publications (2)

Publication Number Publication Date
CN113344268A CN113344268A (en) 2021-09-03
CN113344268B true CN113344268B (en) 2022-04-19

Family

ID=77475091

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110616013.1A Active CN113344268B (en) 2021-06-02 2021-06-02 Urban traffic trip data analysis method

Country Status (1)

Country Link
CN (1) CN113344268B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116206452B (en) * 2023-05-04 2023-08-15 北京城建交通设计研究院有限公司 Sparse data characteristic analysis method and system for urban traffic travel
CN116611984B (en) * 2023-07-11 2024-02-02 鹏城实验室 Travel data processing method, system, equipment and medium under multiple modes

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108230724A (en) * 2018-01-31 2018-06-29 华南理工大学 A kind of urban mass-transit system Vehicle station name announcing missing data method for repairing and mending based on maximum probability estimation
CN109859495A (en) * 2019-03-31 2019-06-07 东南大学 A method of overall travel speed is obtained based on RFID data
CN110009046A (en) * 2019-04-09 2019-07-12 中通服公众信息产业股份有限公司 A kind of community in urban areas safety predicting method based on big data

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11138619B2 (en) * 2017-09-06 2021-10-05 T-Mobile Usa, Inc. Using digital traffic data for analysis
CN107886723B (en) * 2017-11-13 2021-07-20 深圳大学 Traffic travel survey data processing method
US11164216B2 (en) * 2017-11-17 2021-11-02 Mastercard International Incorporated Electronic system and method for advertisement pricing
CN112363999B (en) * 2020-10-13 2023-01-03 厦门市国土空间和交通研究中心(厦门规划展览馆) Public traffic passenger flow analysis method, device, equipment and storage medium
CN112163979A (en) * 2020-10-19 2021-01-01 科技谷(厦门)信息技术有限公司 Urban traffic trip data analysis method based on federal learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
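Road speed limit information identification method based on traffic trajectory data mining; Liao Lüchao et al.; Journal of Traffic and Transportation Engineering (《交通运输工程学报》); 2015-10-15 (No. 05); full text *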

Also Published As

Publication number Publication date
CN113344268A (en) 2021-09-03

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant