CN110716935A - Track data analysis and visualization method and system based on online taxi appointment travel - Google Patents

Publication number
CN110716935A
Authority
CN
China
Prior art keywords
data
track
point
analysis
road
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910953655.3A
Other languages
Chinese (zh)
Inventor
李静
刘贤
何小波
罗跃
金贤锋
张海鹏
何志明
曾攀
何宗
谭攀
陈雪洋
钱文进
彭婧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Geographic Information And Remote Sensing Application Center (chongqing Surveying And Mapping Product Quality Inspection And Testing Center)
Original Assignee
Chongqing Geographic Information And Remote Sensing Application Center (chongqing Surveying And Mapping Product Quality Inspection And Testing Center)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G06F16/26 Visual data mining; Browsing structured data
    • G06F16/29 Geographical information databases
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G06Q50/40

Abstract

The invention discloses a trajectory data analysis and visualization method and system based on online car-hailing trips. The method comprises the following steps: preprocessing the original online car-hailing trajectory data; storing the preprocessed trajectory data in an object-relational database in daily sub-tables and constructing a spatio-temporal index; mining the stored trajectory data with data analysis methods to realize OD (origin-destination) analysis, vehicle speed analysis and traffic flow analysis; and visually displaying the data mining results. The notable effect is that, by jointly considering data volume, processing speed, and the usability and applicability of the mining technology, a simple, operable and easily mastered trajectory data mining method integrating multiple technologies is constructed for mining small and medium-scale trajectory data.

Description

Track data analysis and visualization method and system based on online taxi appointment travel
Technical Field
The invention relates to the technical field of vehicle trajectory data visualization, and in particular to a trajectory data analysis and visualization method and system based on online car-hailing trips.
Background
With the rapid development of the mobile internet, location-based services (LBS) and cloud computing, mobile devices and applications with positioning functions are widely used and generate a large amount of trajectory data. Trajectory data can be divided into human, animal, vehicle and natural-phenomenon trajectory data. It exhibits the 3V characteristics of large volume, real-time generation and variety, together with spatio-temporal sequentiality, heterogeneous sampling frequencies and poor data quality. These characteristics place high technical demands on the software, hardware and methods used for data mining and make the data harder to apply. Trajectory data records the movement and behavior history of individuals or groups and offers a new way to mine the patterns of human activity and migration. It is widely applied in fields such as traffic congestion analysis, road condition prediction and popular route recommendation, and is of great value for summarizing residents' travel patterns and optimizing resource allocation.
This enormous application value has driven a great deal of theoretical and technical research on trajectory data storage and mining. The storage, management and rapid retrieval of large amounts of trajectory data are the key to trajectory data mining and analysis. However, trajectory data contains personal travel tracks and therefore carries a risk of privacy disclosure. Policy-based, distortion-based and encryption-based methods are currently the main strategies for protecting trajectory data privacy. The policy-based method is simple to operate and easy to implement, but offers a low degree of privacy protection; the distortion-based method is easy to implement and offers a high degree of protection, but distorts the data and still leaves a risk of privacy disclosure; the encryption-based method offers a high degree of protection, but places high demands on communication and computation and is complex to deploy.
Visualization makes the presentation of data mining results more intuitive and makes the results easier to summarize. Visualization methods comprise direct visualization, aggregation visualization and feature visualization. Direct visualization draws every track directly and is the most basic method; aggregation visualization clusters the data and then visualizes the important data that is retained; feature visualization first requires computing the features of the segmented trajectories.
With the development of big data technology, distributed parallel processing architectures based on the MapReduce model, represented by Hadoop and Spark, offer a new approach to mining massive trajectory data; compared with traditional mining technology they run faster and process data more efficiently. However, MapReduce is a low-level technology that is difficult to implement and to use, and it does little to speed up the processing of small-scale data. For massive data, traditional data mining cannot match big data processing technology in efficiency; for small and medium-scale data, however, traditional analysis technology has advantages in implementation difficulty and technical maturity, and suitable technical measures allow it to meet the required processing speed.
For these reasons, a simple, operable and easily mastered trajectory data visualization method integrating multiple technologies needs to be constructed, providing a solution for mining small and medium-scale trajectory data while jointly considering data volume, processing speed, and the usability and applicability of the mining technology.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a trajectory data analysis and visualization method and system based on online car-hailing trips. The method first achieves rapid retrieval of medium-scale data through trajectory compression, spatio-temporal index construction, sub-table storage and similar measures; it then mines and analyzes the trajectory data using techniques such as advanced spatial analysis and internet data acquisition; finally, it uses visualization technology to display the data mining results intuitively, diversely and efficiently.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
A trajectory data analysis and visualization method based on online car-hailing trips, characterized by comprising the following steps:
Step 1: performing data preprocessing on the original online car-hailing trajectory data;
the data preprocessing comprises a data desensitization process, a data deviation correction process and a data compression process;
Step 2: storing the preprocessed online car-hailing trajectory data in an object-relational database in daily sub-tables, and constructing a spatio-temporal index;
Step 3: mining the stored online car-hailing trajectory data with data analysis methods to realize OD analysis, vehicle speed analysis and traffic flow analysis of the trajectory data;
Step 4: visually displaying the data mining results obtained in step 3.
Further, the online car-hailing trajectory data comprises order data and track point data, wherein the order data comprises a user ID, the position from which the passenger sent the ride request, the passenger boarding position, the passenger alighting position, the position coordinates and address information of the order starting point, the position coordinates and address information of the order arrival point, the order start time and the order end time; the track point data comprises a user ID, latitude, longitude, time of the position point, heading, horizontal dilution of precision and user data source.
Further, the data deviation correction process comprises the following specific steps:
Step A1: data processing, namely breaking the current road network data at fixed intervals and at mutually connected road intersections;
Step A2: identifying the to-be-determined offset track points and the determinable offset track points, namely taking the maximum error distance plus the road width as the radius of the track point buffer; if several roads exist within the buffer and the road section on which the point lies cannot be determined through the topological relation, the point is a to-be-determined offset track point, denoted E_i; if only one road exists within the buffer, or several roads exist but the road to which the point belongs can be determined through the topological relation, the point is a determinable offset track point, denoted P_i;
Step A3: correcting the determinable offset track points P_i, namely performing buffer analysis with the determinable offset track point P_i as the center point; if only one road exists in the buffer, that road is the actual road corresponding to the track point; if several roads exist, the road to which the track point belongs is determined through the topological adjacency relation;
Step A4: correcting the to-be-determined offset points E_i, namely cyclically obtaining the set R_i of road sections within the buffer of the to-be-determined offset point E_i and of road sections corresponding to track point E_{i-1}, and, in combination with the road section corresponding to a non-offset track point or to a track point P_{i+n} whose road was determined in step A3, deducing backwards through the topological relation the road corresponding to track point P_i, thereby correcting the offset track points.
Further, the data compression process uses the DP algorithm to compress the online car-hailing trajectory data, with the following specific steps:
Step B1: connecting the starting point and the end point of a track with a straight line, and taking this line as the approximate track;
Step B2: calculating the perpendicular Euclidean distance D_i from each track point to the approximate track;
Step B3: selecting from D_i the track point i with the largest distance; if its distance is greater than a preset threshold T, taking track point i as a dividing point that splits the whole track into two sub-tracks;
Step B4: repeating steps B2 and B3 on the two sub-tracks until the maximum perpendicular Euclidean distance in every sub-track is smaller than the distance threshold T or the sub-track contains only two track points;
Step B5: constructing the compressed online car-hailing track from the track starting point, the track end point and the dividing points.
Further, when the preprocessed online car-hailing trajectory data is stored in step 2, classified storage, sub-table storage and grid-based storage are adopted, wherein:
classified storage: order data tables, trajectory data tables and business-related data tables are established separately according to the data type and the functional requirements of the application;
sub-table storage: with one day as the time interval, the data is distributed across daily sub-tables;
grid-based storage: order starting points and destination points are partitioned in advance by grids of a given size, and the data is segmented and stored by day.
Further, the indexes in step 2 comprise an ordinary index and a spatial index, wherein the ordinary index adopts a B-Tree index structure and the spatial index adopts a GiST index structure.
Further, the data analysis methods in step 3 comprise kernel density analysis, cluster analysis and correlation analysis, wherein:
the mathematical model of kernel density analysis is as follows:
[Formula: kernel density function f(x), provided as an image in the original publication]
wherein f(x) denotes the kernel density estimate at x; R is the search radius; n is the number of sample points within the neighborhood; d_ix is the distance between element i and x; and k is the spatial weight function.
Further, the visual display in step 4 uses the MapV open-source visualization library to display the spatial data on a Mapbox base map.
Further, based on the above method, the application also provides a trajectory data analysis and visualization system based on online car-hailing trips. The system comprises a data layer and a system layer, wherein the data layer provides the various data required to support the operation of the system, and the system layer provides the operation interface of the system, handles user requests and system responses, operates on the various data in the system, performs basic spatio-temporal statistical calculations on the data based on various data models, and visually displays the analysis results. Specifically:
the data layer comprises a raw data preprocessing module, a database storage module and a retrieval module, wherein the raw data preprocessing module performs data preprocessing on the online car-hailing trajectory data, comprising a data desensitization process, a data deviation correction process and a data compression process; the database storage module stores the preprocessed online car-hailing trajectory data in an object-relational database in daily sub-tables; and the retrieval module constructs the data indexes.
Further, the system layer comprises a data mining module and a visualization module, wherein the data mining module mines the stored online car-hailing trajectory data with data analysis methods to realize OD analysis, vehicle speed analysis and traffic flow analysis of the trajectory data; and the visualization module visually displays the data mining results.
The invention has the following remarkable effects:
1. the method has a simple, reliable and easy-to-operate workflow. It combines several technologies, including a spatio-temporal database, web front-end visualization, advanced spatial analysis and network data acquisition, and makes full use of data compression, spatio-temporal indexing, sub-table storage and data cleaning. It thereby addresses the data processing efficiency, interactive response time and complex data preprocessing problems of small and medium-scale data mining, and provides a full-workflow solution, from data preprocessing to visualization of the final mining results, for small and medium-scale trajectory data;
2. compared with newer big data technologies, which suffer from difficult architecture implementation, high development difficulty and high hardware cost, the method has clear advantages in implementation difficulty and technical maturity, and its hardware cost is low;
3. compared with the conventional spatio-temporal data mining approach that relies on tools such as a database and ArcGIS, the method effectively avoids the rapid performance degradation caused by too many records in a single table, as well as the decline in ArcGIS processing performance and visualization efficiency on million-record spatio-temporal data;
4. the method uses the open-source database PostgreSQL with a sub-table storage mechanism and builds an efficient spatio-temporal index on the database; the back-end processing functions are developed with the SSM (Spring + Spring MVC + MyBatis) framework, which supports parallel processing and effectively improves data processing efficiency; the visualization is based on the MapV open-source spatio-temporal data visualization library, which supports efficient and diverse display of large amounts of point, line and polygon data. Development difficulty is thus greatly reduced while storage, analysis and visualization performance requirements are met, which facilitates data mining work.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a block diagram of the system architecture of the present invention;
FIG. 3 is a heat map of online car-hailing orders according to the present invention;
FIG. 4 is a schematic view of the OD connections of online car-hailing orders according to the present invention;
FIG. 5 is a schematic view of the instantaneous vehicle speed analysis of online car-hailing vehicles according to the present invention;
FIG. 6 is a schematic diagram of the average vehicle speed analysis of online car-hailing vehicles according to the present invention;
FIG. 7 is a schematic diagram of the cumulative congestion duration of online car-hailing vehicles according to the present invention;
FIG. 8 is a schematic view of the traffic flow analysis of online car-hailing vehicles according to the present invention.
Detailed Description
The following provides a more detailed description of the embodiments and the operation of the present invention with reference to the accompanying drawings.
As shown in FIG. 1, a trajectory data analysis and visualization method based on online car-hailing trips comprises the following specific steps:
Step 1: performing data preprocessing on the original online car-hailing trajectory data;
In this embodiment, online car-hailing trajectory data from March to May 2018 is taken as an example. The online car-hailing trajectory data comprises order data and track point data; details are shown in Table 1.
Table 1 network car booking track data detailed table
[Table 1 is provided as an image in the original publication]
As can be seen from Table 1, online car-hailing trajectory data is characterized by large volume, diversity, spatio-temporal sequentiality, poor data quality and correlation with the road network. These characteristics mean that trajectory data places high demands on software performance, data preprocessing and the choice of mining method.
The data preprocessing comprises a data desensitization process, a data deviation correction process and a data compression process;
For the data desensitization process:
Because the online car-hailing trajectory data contains sensitive information such as user names and mobile phone numbers, the data must be desensitized using privacy protection techniques in order to protect user privacy. After desensitization the records are mutually independent: all orders of the same user cannot be located, and track point data cannot be associated with order data.
For the data deviation correction process:
Owing to hardware and software factors such as GPS sensor noise, and to the physical positioning environment such as occlusion by objects, the departure points, destinations and track points in the raw data deviate from their actual positions. In this example, a track point that falls outside the road range because of this deviation is called an offset track point; deviations within the same road are ignored, since they are small and the deviated point stays within the road width. When an online car-hailing vehicle travels on a road, a buffer analysis is performed with the track point generated at a given moment as the center and the sum of the maximum error range and the road width as the radius; one or more roads are found within the buffer, and one of them is the road on which the vehicle is travelling. In addition, the road sections of two adjacent track points of the same vehicle necessarily have a topological relation.
Based on this, the present embodiment uses the current road network data and the track point data, and applies buffer and topology analysis to construct an offset track point correction algorithm.
The specific steps are as follows:
Step A1: data processing, namely breaking the current road network data every 50 m and at mutually connected road intersections (some grade-separated interchanges cross without being connected), which facilitates the subsequent identification of the road section corresponding to each track point;
Step A2: identifying the to-be-determined offset track points and the determinable offset track points, namely taking the maximum error distance plus the road width as the radius of the track point buffer; if several roads exist within the buffer and the road section on which the point lies cannot be determined through the topological relation, the point is a to-be-determined offset track point, and the set of such points is denoted E_i; a determinable offset track point is one for which only one road exists within the buffer, or several roads exist but the road to which the point belongs can be determined through the topological relation, and the set of such points is denoted P_i;
Step A3: correcting the determinable offset track points, namely performing buffer analysis with the determinable offset track point P_i as the center point; if only one road exists in the buffer, that road is the actual road corresponding to the track point; if several roads exist, the road to which the track point belongs is determined through the topological adjacency relation;
Step A4: correcting the to-be-determined offset points, namely cyclically obtaining the set R_i of road sections within the buffer of the to-be-determined offset point E_i and of road sections corresponding to track point E_{i-1}, and, in combination with the road section corresponding to a non-offset track point or to a track point P_{i+n} whose road was determined in step A3, deducing backwards through the topological relation the road corresponding to track point P_i, thereby correcting the offset track points.
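Steps A2 and A3 can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes planar coordinates in meters, straight road sections given by two end points, and a hypothetical `adjacency` mapping that lists which road sections are topologically connected. All function and parameter names are illustrative. Points that remain ambiguous are only collected; the backward deduction of step A4 is not implemented here.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (planar coordinates, meters)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def candidate_segments(point, segments, max_error, road_width):
    """Road sections inside the buffer of radius max_error + road_width."""
    radius = max_error + road_width
    return [sid for sid, (a, b) in segments.items()
            if point_segment_distance(point, a, b) <= radius]

def classify_points(points, segments, adjacency, max_error, road_width):
    """Return ({index: road section} for determinable points, [indexes still pending])."""
    determinable, pending = {}, []
    prev_seg = None
    for idx, pt in enumerate(points):
        cands = candidate_segments(pt, segments, max_error, road_width)
        if len(cands) == 1:
            chosen = cands[0]
        else:
            # Topology check: keep candidates equal or adjacent to the last matched section.
            linked = [s for s in cands
                      if prev_seg is not None
                      and (s == prev_seg or s in adjacency.get(prev_seg, ()))]
            chosen = linked[0] if len(linked) == 1 else None
        if chosen is None:
            pending.append(idx)          # to-be-determined point, left for step A4
        else:
            determinable[idx] = chosen
            prev_seg = chosen
    return determinable, pending

# Made-up example: "bridge" crosses r1 without being connected to it.
segments = {
    "r1": ((0.0, 0.0), (100.0, 0.0)),
    "bridge": ((50.0, -50.0), (50.0, 50.0)),
    "r2": ((100.0, 0.0), (100.0, 100.0)),
    "r3": ((100.0, 0.0), (200.0, 0.0)),
}
adjacency = {"r1": {"r2", "r3"}, "r2": {"r1", "r3"}, "r3": {"r1", "r2"}, "bridge": set()}
points = [(20.0, 1.0), (50.0, 2.0), (98.0, 4.0), (150.0, 2.0)]
matched, pending = classify_points(points, segments, adjacency, max_error=8.0, road_width=2.0)
```

In this example the second point lies under the unconnected bridge but is still assigned to r1 through the topology check, while the third point, next to a three-way junction, stays pending.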
For the data compression process:
The trajectory of a single online car-hailing trip consists of a large number of track points collected at fixed time intervals. Because the raw data is recorded at second-level intervals, the volume of source data is huge. Given the technical limits of the software, hardware and mining methods, such massive trajectory data poses challenges for data storage, analysis and visualization. Compressing it in a scientific and reasonable way, so that the compressed data still retains the spatio-temporal characteristics, geometric shape and motion characteristics of the source data, is therefore essential.
In this embodiment, after jointly considering the compression ratio, global behavior, degree of trajectory feature retention and method difficulty of the DP (Douglas-Peucker) algorithm, the sliding window algorithm, the opening window algorithm and the semantic compression algorithm, the DP algorithm is selected to compress the online car-hailing trajectory data. The specific steps are as follows:
Step B1: connecting the starting point and the end point of a track with a straight line, and taking this line as the approximate track;
Step B2: calculating the perpendicular Euclidean distance from each track point to the approximate track, denoted D_i;
Step B3: selecting from D_i the track point i with the largest distance; if its distance is greater than a preset threshold T, taking track point i as a dividing point that splits the whole track into two sub-tracks;
Step B4: repeating steps B2 and B3 on the two sub-tracks until the maximum perpendicular Euclidean distance in every sub-track is smaller than the distance threshold T or the sub-track contains only two track points;
Step B5: constructing the compressed online car-hailing track from the track starting point, the track end point and the dividing points.
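Steps B1 to B5 can be sketched in Python as follows. This is a minimal recursive version that assumes planar (projected) coordinates and ignores timestamps; the function names and the sample track are illustrative, not taken from the patent.

```python
import math

def perpendicular_distance(p, start, end):
    """Perpendicular distance from p to the line through start and end (step B2)."""
    (px, py), (sx, sy), (ex, ey) = p, start, end
    dx, dy = ex - sx, ey - sy
    length = math.hypot(dx, dy)
    if length == 0.0:
        # Degenerate approximate track: fall back to point-to-point distance.
        return math.hypot(px - sx, py - sy)
    return abs(dy * (px - sx) - dx * (py - sy)) / length

def douglas_peucker(points, threshold):
    """Compress a track, keeping the start point, end point and dividing points."""
    if len(points) <= 2:
        return list(points)
    start, end = points[0], points[-1]            # step B1: approximate track
    max_dist, max_idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], start, end)
        if d > max_dist:
            max_dist, max_idx = d, i
    if max_dist > threshold:                      # step B3: split at the farthest point
        left = douglas_peucker(points[:max_idx + 1], threshold)
        right = douglas_peucker(points[max_idx:], threshold)
        return left[:-1] + right                  # step B4: the dividing point is shared
    return [start, end]                           # step B5: only the end points remain

track = [(0.0, 0.0), (1.0, 0.05), (2.0, 0.0), (3.0, 3.0), (4.0, 0.0)]
compressed = douglas_peucker(track, threshold=0.5)
```

For very long tracks an iterative version with an explicit stack would avoid Python's recursion limit; the recursive form is kept here because it mirrors the steps directly.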
Step 2: storing the preprocessed online car-hailing trajectory data in an object-relational database in daily sub-tables, and constructing a spatio-temporal index;
For data storage:
The object-relational database PostgreSQL is used for storage, mainly for the following reasons. First, PostgreSQL supports massive data processing, and its table structures and contents are easy to extend. Second, it supports the storage and management of spatial data through PostGIS, which provides good support for extended spatial data applications. Finally, it supports object-oriented inheritance between table structures, which facilitates the unified management of large amounts of data.
In addition, because the volume of online car-hailing order and trajectory data is huge, the design must take into account the storage structure, data access efficiency and the convenience of taking data offline; data storage is therefore implemented with a layered dimension-reduction approach. That is, classified storage, sub-table storage and grid-based storage are adopted, wherein:
classified storage: order data tables, trajectory data tables and business-related data tables are established separately according to the data type and the functional requirements of the application;
sub-table storage: with one day as the time interval, the data is distributed across daily sub-tables;
grid-based storage: order starting points and destination points are partitioned in advance by grids of a given size, and the data is segmented and stored by day.
For constructing the spatio-temporal index:
The volume of online car-hailing trajectory data used in the study is large, which inevitably reduces the storage and retrieval efficiency of the database. Sub-table retrieval and grid-based retrieval improve retrieval speed to some extent, but constructing a spatio-temporal index improves database processing capacity and retrieval efficiency to a greater degree.
Among the various index structures, the B-Tree index offers efficient lookup, high utilization and self-balancing; it suits high-cardinality fields and is very efficient at locating single records or small ranges of data. The GiST index is a Generalized Search Tree, is the recommended index for geographic data in PostGIS, and suits the retrieval of spatial data. This embodiment therefore uses the B-Tree index structure for ordinary data fields and the GiST index structure for spatial data fields, so that data objects in the database can be accessed quickly and selectively.
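The combination of daily sub-tables with B-Tree and GiST indexes can be sketched as a small Python helper that generates PostgreSQL DDL statements. The parent table name `track_point` and the column names `point_time` and `geom` are hypothetical, as is the naming scheme for the daily tables; the patent does not specify them. The sketch assumes the parent table already exists and that PostGIS provides the geometry column type.

```python
from datetime import date

def daily_table_name(prefix, day):
    """Name of the sub-table that holds one day of data, e.g. track_point_20180301."""
    return f"{prefix}_{day:%Y%m%d}"

def build_daily_ddl(day, parent="track_point"):
    """Return DDL for one daily sub-table plus its ordinary and spatial indexes."""
    table = daily_table_name(parent, day)
    return [
        # The child table inherits its columns from the (assumed) parent table.
        f"CREATE TABLE IF NOT EXISTS {table} () INHERITS ({parent});",
        # Ordinary index on the assumed time column.
        f"CREATE INDEX IF NOT EXISTS {table}_time_idx ON {table} USING btree (point_time);",
        # Spatial index on the assumed PostGIS geometry column.
        f"CREATE INDEX IF NOT EXISTS {table}_geom_idx ON {table} USING gist (geom);",
    ]

statements = build_daily_ddl(date(2018, 3, 1))
```

The statements are only built as strings here; executing them would require a database connection, which is outside the scope of this sketch.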
Step 3: data mining is performed on the stored online car-hailing trajectory data using data analysis methods, realizing OD analysis, vehicle speed analysis and traffic flow analysis of the trajectory data.
In this embodiment, latent and associated information in the order data and the trip track point data is mined using analysis methods such as kernel density analysis and statistical analysis, thereby realizing OD analysis, vehicle speed analysis and traffic flow analysis of the online car-hailing trajectory data.
Kernel density analysis takes the neighborhood around any point or line element in space as the density calculation range, calculates the density of elements within that neighborhood, and produces a continuous simulation of the density distribution; the density value of each raster cell reflects the distribution characteristics of the spatial elements. The kernel density function is formulated as follows:
[Formula image BDA0002226552950000131]
wherein f(x) is the kernel density estimate at x; r is the search radius; n is the number of sample points within the neighborhood; d_ix is the distance between element i and x; k is the spatial weight function.
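The formula itself is only available as an image, so the following Python sketch rests on stated assumptions: it uses one commonly used form that is consistent with the symbols defined above, f(x) = 1/(n r^2) * sum over i of k(d_ix / r), and it uses a quartic function as the weight k. Both the normalization and the kernel choice are assumptions, not the patent's confirmed formula.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def quartic_kernel(u: float) -> float:
    """Quartic weight function; zero at and beyond the search radius (u >= 1).

    The choice of kernel is an assumption made for illustration.
    """
    return (3.0 / math.pi) * (1.0 - u * u) ** 2 if u < 1.0 else 0.0


def kernel_density(x: Point, points: Sequence[Point], r: float) -> float:
    """Kernel density estimate at x under the assumed form
    f(x) = 1 / (n * r^2) * sum_i k(d_ix / r),
    where n counts the sample points within distance r of x.
    Coordinates are assumed to be planar (projected), not lon/lat.
    """
    distances = [math.dist(x, p) for p in points]
    neighbours = [d for d in distances if d <= r]
    n = len(neighbours)
    if n == 0:
        return 0.0
    return sum(quartic_kernel(d / r) for d in neighbours) / (n * r * r)


if __name__ == "__main__":
    pickups = [(0.0, 0.0), (0.2, 0.1), (3.0, 3.0)]
    print(kernel_density((0.0, 0.0), pickups, r=1.0))
```

Evaluating `kernel_density` at the center of every raster cell yields the continuous density surface that the heat map visualizes.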
For OD analysis:
Differences in the spatial distribution of urban resources cause residents to move purposefully through space. Popular departure places and destinations reflect residents' routine travel to a certain extent. Identifying these popular departure places and destinations and analyzing and summarizing the travel patterns can provide effective support for road network construction, bus route optimization, and operation and maintenance.
For vehicle speed analysis:
Vehicle speed is an important reference index for studies such as evaluating road traffic conditions and evaluating road improvements, so analyzing road network speeds based on online car-hailing trajectory data is meaningful. The analysis comprises instantaneous vehicle speed analysis, average vehicle speed analysis and cumulative congestion duration analysis. Online car-hailing is a point-to-point travel service. Analyzing the congestion duration of branch roads with online car-hailing vehicles as the object can, to a certain extent, reveal the traffic pressure at the extremities of the road network, namely the branch roads, and can assist fine-grained management of the "capillaries" of the traffic network. Existing studies classify congestion levels by speed: very clear (greater than 37 km/h), clear (30 to 37 km/h), light congestion (23 to 25 km/h), medium congestion (19 to 23 km/h) and severe congestion (less than 19 km/h). A period in which the average speed of the whole road network is less than or equal to 25 km/h is defined as a congestion period.
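The speed thresholds quoted above can be sketched as a small classifier. Two points are assumptions: how the shared boundary values (19, 23, 30 km/h) are assigned, and the label "unclassified" for speeds between 25 and 30 km/h, which the quoted ranges do not cover. The per-minute accumulation of congestion time is likewise an illustrative assumption.

```python
from typing import Iterable


def congestion_level(speed_kmh: float) -> str:
    """Map a speed in km/h to the congestion level quoted in the text."""
    if speed_kmh > 37:
        return "very clear"
    if speed_kmh >= 30:
        return "clear"
    if speed_kmh > 25:
        # The quoted ranges leave 25-30 km/h undefined; label it explicitly.
        return "unclassified"
    if speed_kmh >= 23:
        return "light congestion"
    if speed_kmh >= 19:
        return "medium congestion"
    return "severe congestion"


def cumulative_congestion_minutes(avg_speed_per_minute: Iterable[float],
                                  limit_kmh: float = 25.0) -> int:
    """Count the minutes whose network-wide average speed is <= limit_kmh.

    One average-speed value per minute is an assumed input format.
    """
    return sum(1 for v in avg_speed_per_minute if v <= limit_kmh)


if __name__ == "__main__":
    print(congestion_level(21.5))
    print(cumulative_congestion_minutes([30.0, 25.0, 18.0, 40.0]))
```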
For traffic flow analysis:
Traffic flow is the total number of online car-hailing vehicles passing through a given spatial range within a given time period.
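A minimal sketch of this definition follows, assuming that each track point is a tuple of vehicle ID, timestamp, longitude and latitude, that the spatial range is a bounding box, and that each vehicle is counted once per window. All three are illustrative assumptions rather than details given in the source.

```python
from datetime import datetime
from typing import Iterable, Tuple

TrackPoint = Tuple[str, datetime, float, float]   # (vehicle_id, time, lon, lat)
BBox = Tuple[float, float, float, float]          # (min_lon, min_lat, max_lon, max_lat)


def traffic_flow(points: Iterable[TrackPoint], bbox: BBox,
                 start: datetime, end: datetime) -> int:
    """Number of distinct vehicles with at least one track point inside
    bbox during the half-open time window [start, end)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    vehicles = {
        vid
        for vid, ts, lon, lat in points
        if start <= ts < end
        and min_lon <= lon <= max_lon
        and min_lat <= lat <= max_lat
    }
    return len(vehicles)


if __name__ == "__main__":
    pts = [
        ("car_a", datetime(2019, 10, 9, 9, 21), 106.55, 29.56),
        ("car_a", datetime(2019, 10, 9, 9, 22), 106.56, 29.57),
        ("car_b", datetime(2019, 10, 9, 9, 45), 106.55, 29.56),
    ]
    print(traffic_flow(pts, (106.5, 29.5, 106.6, 29.6),
                       datetime(2019, 10, 9, 9, 20), datetime(2019, 10, 9, 9, 30)))
```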
Step 4: the data mining results obtained in step 3 are visually displayed.
Online car-hailing trajectory data carry spatio-temporal attributes such as time, coordinates, speed and direction, as well as business attributes. Visualization technology converts one or more of these attributes into intuitive graphics or images, which makes it easier to mine the spatio-temporal patterns implicit in the data. This method uses the Mapbox high-definition vector tile map as the base map, which addresses technical problems such as large data volume and slow loading when moving from a local map to an online map. On top of the Mapbox base map, the Mapv open-source visualization library is used to display spatial data such as order points, trajectory lines and administrative divisions. It supports forms such as point density, line heat maps, highlighted line overlays and polygons colored by custom value intervals, as well as various animation effects, and is suitable for visualizing large volumes of online car-hailing trajectory data with spatio-temporal attributes. After visualization, ECharts can be combined to perform statistical analysis and display of order trends, hot areas and the like.
When OD connection lines are visualized, a large number of lines makes rendering sluggish and the display cluttered. Therefore, the connection lines between departure points and destination points within specific areas are displayed in aggregated form, which speeds up rendering, improves the appearance of the display, and makes it easier to find patterns among the many OD connection lines and thereby summarize travel characteristics.
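One way to aggregate OD connection lines is to snap both endpoints to grid cells and count the orders per cell pair, so that one weighted line is drawn per pair instead of one line per order. The sketch below assumes a regular lon/lat grid with a hypothetical cell size; the source does not state how the aggregation areas are defined.

```python
import math
from collections import Counter
from typing import Iterable, Tuple

Order = Tuple[float, float, float, float]   # (o_lon, o_lat, d_lon, d_lat)
Cell = Tuple[int, int]


def aggregate_od(orders: Iterable[Order], cell_deg: float) -> Counter:
    """Count orders per (origin cell, destination cell) pair.

    cell_deg is an assumed grid size in degrees.
    """
    def cell(lon: float, lat: float) -> Cell:
        return math.floor(lon / cell_deg), math.floor(lat / cell_deg)

    return Counter(
        (cell(o_lon, o_lat), cell(d_lon, d_lat))
        for o_lon, o_lat, d_lon, d_lat in orders
    )


if __name__ == "__main__":
    orders = [
        (106.1, 29.1, 106.6, 29.6),
        (106.2, 29.2, 106.7, 29.7),
        (106.1, 29.1, 106.1, 29.1),
    ]
    for (origin, dest), count in aggregate_od(orders, 0.5).most_common():
        print(origin, dest, count)
```

The count per cell pair can then drive the line width or color, which keeps the number of drawn lines bounded by the number of cell pairs rather than the number of orders.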
Referring to fig. 2, based on the above trajectory data analysis and visualization method, this embodiment further provides a trajectory data analysis and visualization system based on online car-hailing travel, which comprises a data layer and a system layer. The data layer provides the various data required to support system operation. The system layer provides the operation interface of the system, handles user requests and system responses, operates on the various data in the system, performs basic spatio-temporal statistical calculations on the data based on various data models, and visually displays the analysis results. Specifically:
the data layer comprises a raw data preprocessing module, a database storage module and a retrieval module, wherein the raw data preprocessing module performs data preprocessing on the online car-hailing trajectory data, including data desensitization, data deviation correction and data compression; the database storage module stores the preprocessed online car-hailing trajectory data in an object-relational database in per-day sub-tables; the retrieval module constructs the data indexes;
the system layer comprises a data mining module and a visualization module, wherein the data mining module mines the stored online car-hailing trajectory data using data analysis methods to realize OD analysis, vehicle speed analysis and traffic flow analysis of the trajectory data; the visualization module visually displays the data mining results.
(I) OD analysis:
The system supports heat map analysis and OD connection line analysis of online car-hailing orders for any spatial range and time period.
(1) Arbitrary spatial range selection. The range can be determined in three ways:
Method one: select a preset area as required. Considering commonly used areas, the system provides administrative divisions at all levels as candidate areas, and a specific district or county, township, or community or village can be selected according to actual needs.
Method two: hand-drawn spatial range. The spatial range required for the analysis can be drawn as needed.
Method three: upload local data. Uploading local shp data is supported.
(2) Arbitrary time selection. Any time range can be selected, accurate to the minute. For example, if the time is set to 9:20-9:30, the distribution of the origins of all orders whose "order initiation time" field value falls within 9:20-9:30 is displayed.
(3) Visualization.
Method one: heat map. In the manner of a nanocube map, point locations are shown at large scales and a heat map is shown at small scales. The brightness of the color represents the number of aggregated points. Either gradient colors or distinct color bands can be used, as shown in fig. 3, and the classification can be customized by adjusting the legend.
Method two: OD connection line map. The sources or destinations of the passenger flow in the selected area are shown by lines connecting order origins and destinations, as shown in fig. 4.
(II) Vehicle speed analysis:
Vehicle speed is an important reference index for studies such as evaluating road traffic conditions and evaluating road improvements, so analyzing road network speeds based on online car-hailing trajectory data is meaningful. The implemented functions are as follows:
(1) Instantaneous vehicle speed analysis. The instantaneous vehicle speed of each road section is displayed as a heat map, and the top 20 hotspot areas with the highest speeds are listed in a bar chart, as shown in fig. 5. Clicking a bar zooms the map so that the corresponding area is centered, and the instantaneous vehicle speed of the whole region can be played back automatically minute by minute.
(2) Average vehicle speed analysis, as shown in fig. 6. Any spatial range and time interval can be selected, the average speed within that range and interval is analyzed, and classified colors and a legend can be set.
(3) Cumulative congestion duration, as shown in fig. 7. As with the other analyses, custom settings for region, time and layer color are supported.
(III) Traffic flow analysis:
As shown in fig. 8, arbitrary spatial range and time selection, as well as classification and color settings, are supported.
First, the invention achieves rapid retrieval of medium-scale data through trajectory compression, spatio-temporal index construction, sub-table storage and similar means. Second, it mines and analyzes the trajectory data using technologies such as advanced spatial analysis and internet data acquisition. Third, it adopts current mainstream visualization technologies to display the results in an intuitive, varied and efficient way. Taking the acquired online car-hailing trajectory data as the data source, the constructed trajectory data mining scheme is used to mine residents' travel patterns and characteristics, so that popular departure and arrival points can be identified and the connections between rail transit stations and the degree of branch road utilization can be analyzed. The invention addresses problems in data mining such as massive data volumes, data processing efficiency, interactive response time requirements and the preprocessing of complex data, and provides a full-process solution, from data preprocessing to visualization of the final mining results, for mining small and medium-scale trajectory data.
The technical solution provided by the present invention is described in detail above. The principles and embodiments of the present invention are explained herein using specific examples, which are presented only to assist in understanding the method and its core concepts. It should be noted that, for those skilled in the art, without departing from the principle of the present invention, several improvements and modifications can be made to the present invention, and these improvements and modifications also fall into the protection scope of the claims of the present invention.

Claims (10)

1. A trajectory data analysis and visualization method based on online car-hailing travel, characterized by comprising the following steps:
step 1: performing data preprocessing on raw online car-hailing trajectory data;
the data preprocessing comprises data desensitization, data deviation correction and data compression;
step 2: storing the preprocessed online car-hailing trajectory data in an object-relational database in per-day sub-tables, and constructing a spatio-temporal index;
step 3: performing data mining on the stored online car-hailing trajectory data using data analysis methods, realizing OD analysis, vehicle speed analysis and traffic flow analysis of the trajectory data;
step 4: visually displaying the data mining results obtained in step 3.
2. The trajectory data analysis and visualization method based on online car-hailing travel according to claim 1, characterized in that: the online car-hailing trajectory data comprise order data and track point data, wherein the order data comprise a user ID, the position at which the passenger issued the ride request, the passenger boarding position, the passenger alighting position, the position coordinates and address information of the departure point in the order, the position coordinates and address information of the arrival point in the order, the order initiation time and the order end time; the track point data comprise a user ID, position latitude, position longitude, position point time, direction, horizontal dilution of precision and user data source.
3. The trajectory data analysis and visualization method based on online car-hailing travel according to claim 1, characterized in that: the specific steps of the data deviation correction are as follows:
step A1: data processing, namely splitting the current road network data at mutually connected road intersections at certain intervals;
step A2: determining the to-be-determined offset track points and the determinable offset track points, namely taking the maximum error distance plus the road width as the radius of the track point buffer; if several roads lie within the buffer and the road section to which the point belongs cannot be determined through the topological relation, the point is a to-be-determined offset track point, denoted E_i; if only one road lies within the buffer, or several roads exist but the road to which the track point belongs can be determined through the topological relation, the point is a determinable offset track point, denoted P_i;
step A3: correcting the determinable offset track points P_i, namely performing buffer analysis centered on the determinable offset track point P_i; if only one road exists in the buffer, that road is the actual road corresponding to the track point; if several roads exist, the road to which the track belongs is determined through the topological adjacency relation;
step A4: correcting the to-be-determined offset points E_i, namely iteratively obtaining the set R_i of road sections that lie within the buffer of the to-be-determined offset point E_i and correspond to the track point E_{i-1}; in combination with the road section corresponding to a non-offset track point, or to a track point P_{i+n} whose road has been determined in step A3, the road corresponding to the track point P_i is inferred backward according to the topological relation, and the offset track point is corrected to that road.
4. The trajectory data analysis and visualization method based on online car-hailing travel according to claim 1, characterized in that: the data compression uses the Douglas-Peucker (DP) algorithm to compress the online car-hailing trajectory data, with the following specific steps:
step B1: connecting the starting point and the end point of a track with a straight line, and taking the straight line as the approximate track of the track;
step B2: calculating the perpendicular Euclidean distance D_i from each track point to the approximate track;
step B3: selecting from the D_i the track point i with the largest perpendicular Euclidean distance, and if that distance is larger than a preset threshold T, taking track point i as a dividing point and dividing the whole track into two sub-tracks;
step B4: repeating steps B2 and B3 on the two sub-tracks until the maximum perpendicular Euclidean distance in each sub-track is smaller than the set distance threshold T or each sub-track contains only two track points;
step B5: constructing the compressed online car-hailing track from the track starting point, the track end point and the dividing points.
5. The trajectory data analysis and visualization method based on online car-hailing travel according to claim 1, characterized in that: when the preprocessed online car-hailing trajectory data are stored in step 2, classified storage, sub-table storage and grid-based storage are adopted, wherein:
classified storage: order data tables, trajectory data tables and business-related data tables are established separately according to the data type and the functional requirements of the application;
sub-table storage: taking one day as the time interval, the data are distributed across per-day sub-tables;
grid-based storage: order origins and destinations are assigned in advance to grid cells of a given size, and the data are partitioned and stored by day.
6. The trajectory data analysis and visualization method based on online car-hailing travel according to claim 1 or 5, characterized in that: the index in step 2 comprises an ordinary index and a spatial index, wherein the ordinary index uses a B-Tree index structure and the spatial index uses a GiST index structure.
7. The trajectory data analysis and visualization method based on online car-hailing travel according to claim 1, characterized in that: the data analysis methods in step 3 comprise a kernel density analysis method, a cluster analysis method and a correlation analysis method, wherein:
the mathematical model of the kernel density analysis method is as follows:
[Formula image FDA0002226552940000031]
wherein f(x) is the kernel density estimate at x; r is the search radius; n is the number of sample points within the neighborhood; d_ix is the distance between element i and x; k is the spatial weight function.
8. The trajectory data analysis and visualization method based on online car-hailing travel according to claim 1, characterized in that: in step 4, the spatial data are visually displayed using the Mapv open-source visualization library on top of a Mapbox base map.
9. A trajectory data analysis and visualization system based on online car-hailing travel, characterized in that: the system comprises a data layer and a system layer, wherein the data layer provides the various data required to support system operation, and the system layer provides the operation interface of the system, handles user requests and system responses, operates on the various data in the system, performs basic spatio-temporal statistical calculations on the data based on various data models, and visually displays the analysis results; specifically:
the data layer comprises a raw data preprocessing module, a database storage module and a retrieval module, wherein the raw data preprocessing module performs data preprocessing on the online car-hailing trajectory data, including data desensitization, data deviation correction and data compression; the database storage module stores the preprocessed online car-hailing trajectory data in an object-relational database in per-day sub-tables; the retrieval module constructs the data indexes.
10. The trajectory data analysis and visualization system based on online car-hailing travel according to claim 9, characterized in that: the system layer comprises a data mining module and a visualization module, wherein the data mining module mines the stored online car-hailing trajectory data using data analysis methods to realize OD analysis, vehicle speed analysis and traffic flow analysis of the trajectory data; the visualization module visually displays the data mining results.
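Steps B1 to B5 of the DP compression in claim 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: track points are planar (projected) x/y pairs, the recursion replaces the "repeat B2 and B3" loop, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def perpendicular_distance(p: Point, a: Point, b: Point) -> float:
    """Perpendicular Euclidean distance from p to the line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:          # degenerate line: start and end coincide
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def douglas_peucker(track: Sequence[Point], threshold: float) -> List[Point]:
    """Compress a track; points deviating more than threshold are kept."""
    if len(track) <= 2:                                   # B4 stop condition
        return list(track)
    start, end = track[0], track[-1]                      # B1: approximate track
    distances = [perpendicular_distance(p, start, end)    # B2: D_i for each point
                 for p in track[1:-1]]
    max_d = max(distances)
    if max_d <= threshold:                                # B3: no dividing point
        return [start, end]
    split = distances.index(max_d) + 1                    # B3: dividing point i
    left = douglas_peucker(track[:split + 1], threshold)  # B4: recurse on both
    right = douglas_peucker(track[split:], threshold)     #     sub-tracks
    return left[:-1] + right                              # B5: join, no duplicate


if __name__ == "__main__":
    track = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
    print(douglas_peucker(track, 0.5))
```

A larger threshold removes more points at the cost of positional accuracy, so the threshold is typically tuned against the road width and the positioning error of the data.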
CN201910953655.3A 2019-10-09 2019-10-09 Track data analysis and visualization method and system based on online taxi appointment travel Pending CN110716935A (en)






Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200121