CN107301254B - Road network hot spot area mining method - Google Patents
- Publication number: CN107301254B
- Application number: CN201710735328.1A
- Authority: CN (China)
- Prior art keywords: track, space, cluster, sub, similarity
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2462—Approximate or statistical queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Probability & Statistics with Applications (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Fuzzy Systems (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a road network hot spot region mining method, which belongs to the technical field of data mining and addresses the problems that arise in the prior art when tracks are clustered by track space-time similarity measurement and clustering calculation. The method comprises the following steps: step 1, performing track segmentation on all track segments, and calculating the space-time similarity and space-time distance between two sub-track segments obtained by the segmentation; step 2, performing clustering calculation on all track segment data in the grid space according to the space-time similarity and space-time distance of the sub-tracks and a dynamic-neighbor-based DBSCAN algorithm; step 3, selecting a set of significant clusters from the clusters obtained by the clustering, and extracting stay spots from the significant clusters; and step 4, obtaining the high-heat areas of the stay spots according to the number of track segments carried by the stay spots, and obtaining the hot spot areas in the road network from the areas where the high-heat stay spots are located. The invention is used for positioning spatial positions.
Description
Technical Field
The invention relates to a road network hot spot region mining method, which is used for positioning spatial positions and belongs to the technical field of data mining.
Background
In recent years, with the rapid development and popularization of spatial positioning technologies, the location of almost any moving object can easily be tracked, producing huge trajectory databases. These massive trajectory data contain a large amount of deep information that reflects the motion behavior of moving objects. Space-time trajectory data, as one kind of space-time data, mainly records how the spatial position of a moving object changes over time. Vehicle space-time trajectory data is a special case because it is confined to a road network, so many common data mining methods cannot be applied to it directly and need to be adapted.
Since research on hot spot areas in a road network has important practical value, research on hot spot path areas must be performed on track data that is valid within the road network. Cluster analysis of track data is a common method for finding hot paths in a road network. Track clustering mainly comprises two parts: measuring the space-time similarity of tracks, and clustering calculation. For measuring track space-time similarity, the most common approach segments tracks based on a grid space: it first divides the grid space and cuts the track data, and then adds the spatial similarity and the temporal similarity of the resulting sub-tracks to obtain the space-time similarity of the tracks. This approach can calculate the space-time similarity between tracks accurately, but it computes both the spatial similarity and the temporal similarity for every pair of tracks, so the response time of the algorithm becomes long when the volume of track data is large. For clustering calculation, because track clusters tend to be strip-shaped rather than spherical, the typical density clustering algorithm DBSCAN is often adopted, since it can find clusters of arbitrary shape. However, DBSCAN requires two parameters, the neighborhood radius and the neighborhood density threshold, to be input manually; the choice of these two values directly affects the clustering result, and the DBSCAN algorithm provides no method for determining them.
Disclosure of Invention
The invention aims to solve the following problems of the prior art, in which tracks are clustered by track space-time similarity measurement and clustering calculation: the space-time similarity measurement has a long response time when the track data volume is large; Euclidean coordinates cannot accurately express the distance between two tracks in a road network; and when the density clustering algorithm DBSCAN is used for clustering calculation, the neighborhood radius and the neighborhood density threshold must be input manually, and inaccurate values directly affect the clustering result. To this end, the invention provides a road network hot spot area mining method.
The technical scheme adopted by the invention is as follows:
A road network hot spot region mining method is characterized by comprising the following steps:
step 1, performing track segmentation on all track segments, and calculating the space-time similarity and space-time distance between two sub-track segments obtained by the segmentation;
step 2, performing clustering calculation on all track segment data in the grid space according to the space-time similarity and space-time distance of the sub-tracks and a dynamic-neighbor-based DBSCAN algorithm;
step 3, selecting a set of significant clusters from the clusters obtained by the clustering, and extracting stay spots from the significant clusters;
step 4, obtaining the high-heat areas of the stay spots according to the number of track segments carried by the stay spots, and obtaining the hot spot areas in the road network from the areas where the high-heat stay spots are located.
Further, the specific steps of step 1 are as follows:
step 1.1, dividing the spatial region where all track segments are located into a dynamic grid space;
step 1.2, segmenting the track sequences in the grid space at breakpoints;
step 1.3, calculating the space-time similarity and space-time distance between two sub-track segments obtained by the segmentation.
Further, the specific steps of step 1.1 are as follows:
step 1.11, obtaining the minimum circumscribed rectangle of the spatial region where all track segments are located, according to the minimum convex hull principle;
step 1.12, obtaining the length of each track segment and the number of sampling points it contains, and calculating the average distance travelled by the vehicle on the track segment between two adjacent sampling times;
step 1.13, taking the average distance as the cell size of the grid space, and dividing the minimum circumscribed rectangle into the dynamic grid space.
Further, the specific steps of step 1.2 are as follows:
step 1.21, sequentially reading the data of each sampling point on each track segment;
step 1.22, comparing the longitude and latitude of the positions of two adjacent sampling points on the track segment;
step 1.23, if the longitude and latitude of two adjacent sampling points are unchanged, taking the midpoint between the two sampling points as a breakpoint;
step 1.24, segmenting the original track segment at the positions of the calculated breakpoints.
Further, the specific steps of step 1.3 are as follows:
step 1.31, calculating the spatial similarity between the two sub-track segments; if the spatial similarity is not zero, calculating the temporal similarity between the two sub-track segments, otherwise going to step 1.33, wherein the spatial similarity and the temporal similarity are both calculated by the formula:
$$\mathrm{Sim}(TR_i, TR_j) = \frac{L_c(TR_i, TR_j)}{L(TR_i) + L(TR_j) - L_c(TR_i, TR_j)}$$
in the formula, $L_c(TR_i, TR_j)$ represents the cumulative spatial or temporal length common to the two sub-track segments, $L(TR_i)$ represents the total length of sub-track $TR_i$, $L(TR_j)$ represents the total length of sub-track $TR_j$, $L(TR_i) + L(TR_j) - L_c(TR_i, TR_j)$ represents the total spatial or temporal length, i.e. the span, of the two sub-track segments, and $\mathrm{Sim}(TR_i, TR_j)$ represents the spatial or temporal similarity between the two sub-track segments;
step 1.32, if the temporal similarity is not zero, calculating the space-time similarity between the two sub-track segments, otherwise going to step 1.33, wherein the space-time similarity is calculated by the formula:
$$\mathrm{STSim}(TR_i, TR_j) = \mathrm{SSim}(TR_i, TR_j) \times \mathrm{TSim}(TR_i, TR_j);$$
in the formula, $\mathrm{SSim}(TR_i, TR_j)$ represents the spatial similarity between the two sub-track segments, $\mathrm{TSim}(TR_i, TR_j)$ represents the temporal similarity between the two sub-track segments, and $\mathrm{STSim}(TR_i, TR_j)$ represents the resulting space-time similarity of the two sub-track segments;
step 1.33, calculating the space-time distance between the two sub-track segments, wherein the calculation method is as follows:
$$\mathrm{STDist}(TR_i, TR_j) = 1 - \mathrm{STSim}(TR_i, TR_j);$$
in the formula, $\mathrm{STSim}(TR_i, TR_j)$ represents the space-time similarity between the two sub-track segments, and $\mathrm{STDist}(TR_i, TR_j)$ represents the space-time distance between the two sub-track segments.
Further, the specific steps of step 2 are as follows:
step 2.1, calculating the neighbor scale change of the sampling points on each track segment according to the space-time similarity and space-time distance of the sub-tracks, the neighbor scale evolution algorithm and the DBSCAN algorithm;
step 2.2, calculating the distance between each sampling point on the track segment and the other sampling points, denoting the maximum distance as max and the minimum distance as min; if max is greater than 2·min, assigning the sampling point to the oscillation object set, otherwise assigning it to the stable object set;
step 2.3, initializing the cluster identifier Cluster_id for the stable object set and the oscillation object set to 1, and setting the default cluster number of every node in the stable object set to 0;
step 2.4, randomly selecting a core object v whose cluster number is 0 from the stable object set, and searching breadth-first for the density-reachable object set Reach;
step 2.5, finding the core object set Core within the object set Reach, and finding the minimum cluster number Min_Cluster within the core object set Core;
step 2.6, if Min_Cluster is 0, marking the cluster numbers of the object set Reach and the core object v as Cluster_id; otherwise, searching for the object set Connect that is density-connected with the object set Reach and the core object v, and marking the cluster numbers of the object set Reach, the density-connected object set Connect and the core object v as Min_Cluster, thereby obtaining a cluster;
step 2.7, judging whether a core object v still exists in the stable object set; if so, returning to step 2.3, otherwise all clusters have been obtained and step 2.8 is performed;
step 2.8, distinguishing boundary points from noise points in the oscillation object set Oscillation, and assigning the boundary points to the different clusters among the obtained clusters.
Further, the specific steps of step 3 are as follows:
step 3.1, counting the number n of clusters obtained by the clustering and the number m of all track segments;
step 3.2, letting p = m/n;
step 3.3, if the number of track segments contained in a cluster obtained by the clustering is greater than p, marking the cluster as a significant cluster, otherwise marking it as a non-significant cluster;
step 3.4, selecting a significant cluster C from the clustering result, and taking the starting points of the track segments contained in the significant cluster C as a point set K;
step 3.5, randomly selecting a breakpoint b from the point set K, and merging the other breakpoints with the breakpoint b one by one to form an expandable point set Q; if an added breakpoint causes the radius of the minimum circumscribed circle of the point set Q to exceed a pre-specified threshold β, deleting that breakpoint from the point set Q;
step 3.6, traversing all points of the point set K, and if the number of breakpoints contained in the point set Q is greater than a threshold α, marking the point set Q as a stay spot;
step 3.7, repeating steps 3.5 to 3.6 until all candidate stay spots in the significant cluster C have been generated;
step 3.8, repeating steps 3.4 to 3.7 until all significant clusters have been traversed.
Further, the specific steps of step 4 are as follows:
step 4.1, calculating the stay heat information corresponding to each stay spot, wherein the calculation method is as follows:
wherein $h_{spot}$ is the heat of the stay spot, $n_{subtra}$ is the number of track segments contained in the stay spot, $n_{tra}$ is the number of tracks contained in the stay spot, and β is a coefficient;
step 4.2, obtaining the high-heat areas of the stay spots from the stay heat information;
step 4.3, obtaining the hot spot areas in the road network from the areas where the high-heat stay spots are located.
In summary, by adopting the above technical scheme, the invention has the following beneficial effects:
1. the road hot spot area mining method provided by the invention combines the track space-time similarity measurement in the grid space with the optimized DBSCAN track clustering method, and overcomes two defects: traditional Euclidean coordinates cannot accurately express the distance between two tracks in a road network, and traditional DBSCAN clustering requires the relevant parameters to be input manually in advance;
2. representing the track sequence in grid space coordinates overcomes the problem that the space-time similarity between tracks cannot be accurately calculated because the track sampling points deviate due to the network environment, the sampling equipment and the like;
3. the vehicle-track-based road hot spot area mining method works best on track data with a high sampling frequency, saves storage space for the track data, and improves the execution efficiency of the whole system;
4. obtaining the track space-time similarity by multiplying the temporal similarity and the spatial similarity of the tracks greatly improves the efficiency of the similarity calculation and shortens the response time.
Drawings
FIG. 1 is a sub-flow diagram of the computation of trajectory spatiotemporal similarity and spatiotemporal distance in the present invention;
FIG. 2 is a flow chart of a DBSCAN algorithm based on dynamic neighbor optimization in the present invention;
FIG. 3 is a sub-flowchart of the hot spot area mining based on clustering results in the present invention;
FIG. 4 shows the distribution of the significant clusters in step 5 of the present invention, wherein the road segments covered by the black areas are the areas where the significant clusters gather;
FIG. 5 shows the distribution of the hot spots in step 7, wherein the black spots are high-heat spots and the gray spots are normal-heat spots.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention provides a road hot spot area mining method. By performing dynamic-neighbor-optimized DBSCAN clustering on vehicle tracks and calculating the heat of the stay spots, road hot spot areas can be mined more accurately and effectively. The dynamic-neighbor-based DBSCAN clustering algorithm overcomes the defect that the clustering result is strongly affected by manually input parameter values, and calculating the heat information of the stay spots describes the distribution of the hot spot areas more accurately.
A road network hot spot area mining method comprises the following steps:
step 1, performing track segmentation on all track segments, and calculating the space-time similarity and space-time distance between two sub-track segments obtained by the segmentation; the specific steps are as follows:
step 1.1, dividing the spatial region where all track segments are located into a dynamic grid space; the specific steps are as follows:
step 1.11, obtaining the minimum circumscribed rectangle of the spatial region where all track segments are located, according to the minimum convex hull principle;
step 1.12, obtaining the length of each track segment and the number of sampling points it contains, and calculating the average distance travelled by the vehicle on the track segment between two adjacent sampling times;
step 1.13, taking the average distance as the cell size of the grid space, and dividing the minimum circumscribed rectangle into the dynamic grid space.
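As a rough illustration of steps 1.11 to 1.13, the following Python sketch derives a grid cell size from the average spacing of adjacent sampling points. It assumes planar (already projected) x/y coordinates, uses an axis-aligned bounding rectangle as a stand-in for the minimum circumscribed rectangle, and pools the average over all track segments. The function names are illustrative and do not come from the patent.

```python
import math


def bounding_rectangle(trajectories):
    """Axis-aligned rectangle (min_x, min_y, max_x, max_y) enclosing every point.

    Assumption: a simple stand-in for the minimum circumscribed rectangle.
    """
    xs = [x for tr in trajectories for x, _ in tr]
    ys = [y for tr in trajectories for _, y in tr]
    return min(xs), min(ys), max(xs), max(ys)


def average_step(trajectories):
    """Mean distance travelled between two adjacent sampling points."""
    total_length = 0.0
    total_steps = 0
    for tr in trajectories:
        for (x0, y0), (x1, y1) in zip(tr, tr[1:]):
            total_length += math.hypot(x1 - x0, y1 - y0)
            total_steps += 1
    if total_steps == 0:
        raise ValueError("need at least one pair of adjacent sampling points")
    return total_length / total_steps


def grid_cell(point, origin, cell_size):
    """Map a point to the (column, row) index of the grid cell containing it."""
    col = int((point[0] - origin[0]) // cell_size)
    row = int((point[1] - origin[1]) // cell_size)
    return col, row
```

With these helpers, a track can be represented as the sequence of grid cells it passes through, using the lower-left corner of the bounding rectangle as `origin` and the result of `average_step` as `cell_size`.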
Step 1.2, segmenting the track sequences in the grid space at breakpoints; the specific steps are as follows:
step 1.21, sequentially reading the data of each sampling point on each track segment;
step 1.22, comparing the longitude and latitude of the positions of two adjacent sampling points on the track segment;
step 1.23, if the longitude and latitude of two adjacent sampling points are unchanged, taking the midpoint between the two sampling points as a breakpoint;
step 1.24, segmenting the original track segment at the positions of the calculated breakpoints.
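The breakpoint rule of steps 1.21 to 1.24 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: each sample is a hypothetical `(t, lon, lat)` tuple, "unchanged" is tested by exact equality (a real system would more likely use a tolerance), the cut is placed between the two stationary samples, and fragments with fewer than two samples are discarded.

```python
def split_at_breakpoints(samples):
    """Split a list of (t, lon, lat) samples wherever two adjacent samples
    report the same longitude and latitude.

    Returns a list of sub-tracks, each a list of at least two samples.
    """
    if not samples:
        return []
    sub_tracks = []
    current = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        if prev[1:] == cur[1:]:
            # Position unchanged: a breakpoint lies between prev and cur,
            # so close the current sub-track and start a new one at cur.
            if len(current) >= 2:
                sub_tracks.append(current)
            current = [cur]
        else:
            current.append(cur)
    if len(current) >= 2:
        sub_tracks.append(current)
    return sub_tracks
```

For example, a vehicle that moves, stops for one sampling interval, and then moves again yields two sub-tracks, one before and one after the stop.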
Step 1.3, calculating the space-time similarity and space-time distance between two sub-track segments obtained by the segmentation; the specific steps are as follows:
step 1.31, calculating the spatial similarity between the two sub-track segments; if the spatial similarity is not zero, calculating the temporal similarity between the two sub-track segments, otherwise going to step 1.33, wherein the spatial similarity and the temporal similarity are both calculated by the formula:
$$\mathrm{Sim}(TR_i, TR_j) = \frac{L_c(TR_i, TR_j)}{L(TR_i) + L(TR_j) - L_c(TR_i, TR_j)}$$
in the formula, $L_c(TR_i, TR_j)$ represents the cumulative spatial or temporal length common to the two sub-track segments, $L(TR_i)$ represents the total length of sub-track $TR_i$, $L(TR_j)$ represents the total length of sub-track $TR_j$, $L(TR_i) + L(TR_j) - L_c(TR_i, TR_j)$ represents the total spatial or temporal length, i.e. the span, of the two sub-track segments, and $\mathrm{Sim}(TR_i, TR_j)$ represents the spatial or temporal similarity between the two sub-track segments;
step 1.32, if the temporal similarity is not zero, calculating the space-time similarity between the two sub-track segments, otherwise going to step 1.33, wherein the space-time similarity is calculated by the formula:
$$\mathrm{STSim}(TR_i, TR_j) = \mathrm{SSim}(TR_i, TR_j) \times \mathrm{TSim}(TR_i, TR_j);$$
in the formula, $\mathrm{SSim}(TR_i, TR_j)$ represents the spatial similarity between the two sub-track segments, $\mathrm{TSim}(TR_i, TR_j)$ represents the temporal similarity between the two sub-track segments, and $\mathrm{STSim}(TR_i, TR_j)$ represents the resulting space-time similarity of the two sub-track segments;
step 1.33, calculating the space-time distance between the two sub-track segments, wherein the calculation method is as follows:
$$\mathrm{STDist}(TR_i, TR_j) = 1 - \mathrm{STSim}(TR_i, TR_j);$$
in the formula, $\mathrm{STSim}(TR_i, TR_j)$ represents the space-time similarity between the two sub-track segments, and $\mathrm{STDist}(TR_i, TR_j)$ represents the space-time distance between the two sub-track segments.
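The similarity and distance calculation of steps 1.31 to 1.33 might look like the Python sketch below. It reads each similarity as the shared length divided by the span. How the lengths are measured is an assumption made here for illustration: spatial length is counted in grid cells (a sub-track is a collection of cell indices) and temporal length is the duration of a `(start, end)` interval. The function names are hypothetical.

```python
def ratio(common, len_i, len_j):
    """Shared length divided by the span len_i + len_j - common."""
    span = len_i + len_j - common
    return common / span if span > 0 else 0.0


def spatial_similarity(cells_i, cells_j):
    """Spatial similarity, measuring length as a number of grid cells."""
    ci, cj = set(cells_i), set(cells_j)
    return ratio(len(ci & cj), len(ci), len(cj))


def temporal_similarity(interval_i, interval_j):
    """Temporal similarity of two (start, end) time intervals."""
    (s_i, e_i), (s_j, e_j) = interval_i, interval_j
    common = max(0.0, min(e_i, e_j) - max(s_i, s_j))
    return ratio(common, e_i - s_i, e_j - s_j)


def spatiotemporal_distance(cells_i, interval_i, cells_j, interval_j):
    """STDist = 1 - SSim * TSim, skipping TSim when SSim is zero."""
    ssim = spatial_similarity(cells_i, cells_j)
    if ssim == 0.0:
        # Step 1.31: no spatial overlap, so the temporal similarity is
        # never computed and the distance takes its maximum value.
        return 1.0
    tsim = temporal_similarity(interval_i, interval_j)
    return 1.0 - ssim * tsim
```

The early return is where the claimed saving comes from: pairs of sub-tracks that share no grid cell never incur the temporal computation.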
Step 2, performing clustering calculation on all track segment data in the grid space according to the space-time similarity and space-time distance of the sub-tracks and a dynamic-neighbor-based DBSCAN algorithm; the specific steps are as follows:
step 2.1, calculating the neighbor scale change of the sampling points on each track segment according to the space-time similarity and space-time distance of the sub-tracks, the neighbor scale evolution algorithm and the DBSCAN algorithm;
step 2.2, calculating the distance between each sampling point on the track segment and the other sampling points, denoting the maximum distance as max and the minimum distance as min; if max is greater than 2·min, assigning the sampling point to the oscillation object set, otherwise assigning it to the stable object set;
step 2.3, initializing the cluster identifier Cluster_id for the stable object set and the oscillation object set to 1, and setting the default cluster number of every node in the stable object set to 0;
step 2.4, randomly selecting a core object v whose cluster number is 0 from the stable object set, and searching breadth-first for the density-reachable object set Reach, wherein an object is density-reachable if a reachable path exists between the objects, and is unreachable otherwise;
step 2.5, finding the core object set Core within the object set Reach, and finding the minimum cluster number Min_Cluster within the core object set Core;
step 2.6, if Min_Cluster is 0, marking the cluster numbers of the object set Reach and the core object v as Cluster_id; otherwise, searching for the object set Connect that is density-connected with the object set Reach and the core object v, and marking the cluster numbers of the object set Reach, the density-connected object set Connect and the core object v as Min_Cluster, thereby obtaining a cluster;
step 2.7, judging whether a core object v still exists in the stable object set; if so, returning to step 2.3, otherwise all clusters have been obtained and step 2.8 is performed;
step 2.8, distinguishing boundary points from noise points in the oscillation object set Oscillation, and assigning the boundary points to the different clusters among the obtained clusters.
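The split into stable and oscillation objects in step 2.2 can be illustrated with a short Python sketch. It follows a literal reading of the rule (largest distance to any other object more than twice the smallest) and assumes the pairwise space-time distances are already available as a square matrix. The neighbor scale evolution of step 2.1 and the cluster expansion of steps 2.3 to 2.8 are not reproduced here, and the function name is illustrative.

```python
def partition_objects(dist):
    """Partition object indices into (stable, oscillating) lists.

    dist is a square matrix (list of lists) of pairwise distances.
    An object is placed in the oscillation set when its largest distance
    to another object exceeds twice its smallest distance to another object.
    """
    stable, oscillating = [], []
    for i, row in enumerate(dist):
        others = [d for j, d in enumerate(row) if j != i]
        if others and max(others) > 2 * min(others):
            oscillating.append(i)
        else:
            stable.append(i)
    return stable, oscillating
```

Only the stable set is then used to seed clusters, and the oscillation set is examined afterwards to separate boundary points from noise.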
Step 3, selecting a set of significant clusters from the clusters obtained by the clustering, and extracting stay spots from the significant clusters; the specific steps are as follows:
step 3.1, counting the number n of clusters obtained by the clustering and the number m of all track segments;
step 3.2, letting p = m/n;
step 3.3, if the number of track segments contained in a cluster obtained by the clustering is greater than p, marking the cluster as a significant cluster, otherwise marking it as a non-significant cluster;
step 3.4, selecting a significant cluster C from the clustering result, and taking the starting points of the track segments contained in the significant cluster C as a point set K;
step 3.5, randomly selecting a breakpoint b from the point set K, and merging the other breakpoints with the breakpoint b one by one to form an expandable point set Q; if an added breakpoint causes the radius of the minimum circumscribed circle of the point set Q to exceed a pre-specified threshold β, deleting that breakpoint from the point set Q;
step 3.6, traversing all points of the point set K, and if the number of breakpoints contained in the point set Q is greater than a threshold α, marking the point set Q as a stay spot;
step 3.7, repeating steps 3.5 to 3.6 until all candidate stay spots in the significant cluster C have been generated;
step 3.8, repeating steps 3.4 to 3.7 until all significant clusters have been traversed.
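The selection of significant clusters and the greedy growth of stay spots in step 3 can be sketched in Python as below. Several simplifications are assumptions of this sketch, not part of the method as described: m is taken as the number of clustered segments, the seed is the first remaining point instead of a random one (for reproducibility), and the radius of the minimum circumscribed circle is approximated by the largest distance from the centroid of the point set. All names are illustrative.

```python
import math


def significant_clusters(clusters):
    """Return the ids of clusters holding more than p = m / n segments.

    clusters maps a cluster id to the list of track segments it contains.
    """
    n = len(clusters)
    m = sum(len(segments) for segments in clusters.values())
    p = m / n
    return {cid for cid, segments in clusters.items() if len(segments) > p}


def approx_radius(points):
    """Largest distance from the centroid (approximation of the
    minimum circumscribed circle radius)."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return max(math.hypot(x - cx, y - cy) for x, y in points)


def extract_stay_spots(points, beta, alpha):
    """Greedily grow point sets Q from the point set K.

    A point is kept in Q only if the radius stays within beta; Q becomes
    a stay spot when it contains more than alpha points.
    """
    remaining = list(points)
    spots = []
    while remaining:
        q = [remaining[0]]  # seed; the described method picks it at random
        for pt in remaining[1:]:
            if approx_radius(q + [pt]) <= beta:
                q.append(pt)
        if len(q) > alpha:
            spots.append(q)
        remaining = [pt for pt in remaining if pt not in q]
    return spots
```

Calling `extract_stay_spots` once per significant cluster, with K built from the starting points of that cluster's track segments, corresponds to the outer loop of steps 3.4 to 3.8.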
Step 4, obtaining the high-heat areas of the stay spots according to the number of track segments carried by the stay spots, and obtaining the hot spot areas in the road network from the areas where the high-heat stay spots are located; the specific steps are as follows:
step 4.1, calculating the stay heat information corresponding to each stay spot, wherein the calculation method is as follows:
wherein $h_{spot}$ is the heat of the stay spot, $n_{subtra}$ is the number of track segments contained in the stay spot, $n_{tra}$ is the number of tracks contained in the stay spot, and β is a coefficient set during testing. Any two track segments, whether identical or not, are counted as different track segments, but two identical track segments belong to the same track. That is, the number of track segments is greater than the number of tracks (if identical track segments exist) or equal to it (if no two track segments are identical).
step 4.2, obtaining the high-heat areas of the stay spots from the stay heat information;
step 4.3, obtaining the hot spot areas in the road network from the areas where the high-heat stay spots are located.
Compared with the prior art, the road hot spot area mining method provided by the invention combines the track space-time similarity measurement in the grid space with the optimized DBSCAN track clustering method, and overcomes two defects: traditional Euclidean coordinates cannot accurately express the distance between two tracks in a road network, and traditional DBSCAN clustering requires the relevant parameters to be input manually in advance. Meanwhile, representing the track sequence in grid space coordinates overcomes the problem that the space-time similarity between tracks cannot be accurately calculated because the track sampling points deviate due to the network environment, the sampling equipment and the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.
Claims (1)
1. A road network hot spot region mining method is characterized by comprising the following steps:
step 1, performing track segmentation on all track segments, and calculating the space-time similarity and space-time distance between two sub-track segments obtained by the segmentation;
step 2, performing clustering calculation on all track segment data in the grid space according to the space-time similarity and space-time distance of the sub-tracks and a dynamic-neighbor-based DBSCAN algorithm;
step 3, selecting a set of significant clusters from the clusters obtained by the clustering, and extracting stay spots from the significant clusters;
step 4, obtaining the high-heat areas of the stay spots according to the number of track segments carried by the stay spots, and obtaining the hot spot areas in the road network from the areas where the high-heat stay spots are located;
the specific steps of the step 1 are as follows:
step 1.1, dividing a dynamic grid space for a space area where all track sections are located;
step 1.2, carrying out track segmentation on a track sequence in a grid space according to a breakpoint;
step 1.3, calculating the space-time similarity and space-time distance between two sub-track sections after track segmentation;
the specific steps of step 1.1 are as follows:
step 1.11, solving the minimum circumscribed rectangle of the space region where all the track segments are located according to the minimum convex hull principle;
step 1.12, solving the length of each track segment and the number of sampling points contained in the track segment, and calculating the average distance of the vehicle on the track segment moving in the time of two adjacent sampling points;
step 1.13, taking the average distance as the size of a grid space, and performing dynamic grid space division on the minimum external rectangle;
the specific steps of step 1.2 are as follows:
step 1.21, sequentially reading the data of each sampling point on each track segment;
step 1.22, comparing the longitude and latitude of each two adjacent sampling points on the track segment;
step 1.23, if the longitude and latitude do not change between two adjacent sampling points, taking the midpoint of the two sampling points as a breakpoint;
step 1.24, segmenting the original track segment at the positions of all breakpoints so obtained;
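The breakpoint rule of steps 1.21 to 1.24 can be illustrated with a short sketch. It assumes each sampling point is a `(lon, lat, t)` tuple and compares coordinates for exact equality; real GPS data would probably need a tolerance, which the text does not specify. Since two samples with identical coordinates have a midpoint at that same position, the sketch simply cuts the track there. The function name `split_at_breakpoints` is hypothetical.

```python
def split_at_breakpoints(track):
    """Split a track into sub-tracks wherever the vehicle did not move.

    track: list of (lon, lat, t) samples in time order.
    Returns a list of sub-tracks, each with at least two samples.
    """
    subtracks = []
    current = []
    for point in track:
        if current and point[:2] == current[-1][:2]:
            # Position unchanged between adjacent samples: this is a breakpoint.
            # Close the running sub-track and start a new one at this sample.
            if len(current) > 1:
                subtracks.append(current)
            current = [point]
        else:
            current.append(point)
    if len(current) > 1:
        subtracks.append(current)
    return subtracks

track = [(0, 0, 0), (1, 0, 1), (1, 0, 2), (2, 0, 3), (3, 0, 4)]
for sub in split_at_breakpoints(track):
    print(sub)
# [(0, 0, 0), (1, 0, 1)]
# [(1, 0, 2), (2, 0, 3), (3, 0, 4)]
```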
the specific steps of step 1.3 are as follows:
step 1.31, calculating the spatial similarity between the two sub-track segments; if the spatial similarity is not zero, calculating the time similarity between the two sub-track segments, otherwise going to step 1.33, wherein the spatial similarity and the time similarity are both calculated by the following formula:
$$\mathrm{Sim}(TR_i, TR_j) = \frac{L_c(TR_i, TR_j)}{L(TR_i) + L(TR_j) - L_c(TR_i, TR_j)}$$
in the formula, $L_c(TR_i, TR_j)$ is the cumulative spatial or temporal length shared by the two sub-track segments, $L(TR_i)$ is the total length of sub-track $TR_i$, $L(TR_j)$ is the total length of sub-track $TR_j$, $L(TR_i) + L(TR_j) - L_c(TR_i, TR_j)$ is the combined spatial or temporal length of the two sub-track segments, i.e. their span, and $\mathrm{Sim}(TR_i, TR_j)$ is the spatial or temporal similarity between the two sub-track segments;
step 1.32, if the time similarity is not zero, calculating the space-time similarity between the two sub-track segments, otherwise going to step 1.33, wherein the space-time similarity is calculated as follows:
$$\mathrm{STSim}(TR_i, TR_j) = \mathrm{SSim}(TR_i, TR_j) \times \mathrm{TSim}(TR_i, TR_j);$$
in the formula, $\mathrm{SSim}(TR_i, TR_j)$ is the spatial similarity between the two sub-track segments, $\mathrm{TSim}(TR_i, TR_j)$ is the time similarity between the two sub-track segments, and $\mathrm{STSim}(TR_i, TR_j)$ is the resulting space-time similarity measure of the two sub-track segments;
step 1.33, calculating the space-time distance between the two sub-track segments as follows:
$$\mathrm{STDist}(TR_i, TR_j) = 1 - \mathrm{STSim}(TR_i, TR_j);$$
in the formula, $\mathrm{STSim}(TR_i, TR_j)$ is the space-time similarity measure between the two sub-track segments, and $\mathrm{STDist}(TR_i, TR_j)$ is the space-time distance between the two sub-track segments;
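As a rough illustration of steps 1.31 to 1.33, the sketch below assumes that each similarity is the ratio of the shared length to the span, which is what the definitions of the shared length and the span suggest. It further assumes, purely for illustration, that each sub-track is summarised by a one-dimensional spatial interval (positions along a common road) and a time interval; computing shared length on a real road network would be more involved. All function names are hypothetical.

```python
def overlap_similarity(a, b):
    """Shared length divided by span for two (start, end) intervals."""
    common = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    span = (a[1] - a[0]) + (b[1] - b[0]) - common
    return common / span if span > 0 else 0.0

def st_similarity(space_a, time_a, space_b, time_b):
    """Space-time similarity: spatial similarity times time similarity."""
    ssim = overlap_similarity(space_a, space_b)
    if ssim == 0.0:
        # Step 1.31: no spatial overlap, so the time similarity is skipped.
        return 0.0
    tsim = overlap_similarity(time_a, time_b)
    return ssim * tsim

def st_distance(space_a, time_a, space_b, time_b):
    """Space-time distance: one minus the space-time similarity."""
    return 1.0 - st_similarity(space_a, time_a, space_b, time_b)

# Two sub-tracks sharing 5 of 15 length units and 2 of 6 time units.
sim = st_similarity((0, 10), (0, 4), (5, 15), (2, 6))
print(round(sim, 4))                                      # 0.1111
print(st_distance((0, 1), (0, 1), (2, 3), (0, 1)))        # 1.0
```

Because both factors lie in [0, 1], the product and the resulting distance also stay in [0, 1], and two sub-tracks that never overlap in space are at the maximum distance of 1 regardless of timing.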
the specific steps of step 2 are as follows:
step 2.1, calculating the change in neighbor scale of the sampling points on each track segment according to the space-time similarity and space-time distance of the sub-track segments, the neighbor scale evolution algorithm, and the DBSCAN algorithm;
step 2.2, for each sampling point on the track segment, calculating its distance to the other sampling points, denoting the maximum of these distances as max and the minimum as min; if max > 2 × min, assigning the sampling point to the oscillation object set, otherwise assigning it to the stable object set;
step 2.3, initializing the cluster identifier Cluster_id for the stable object set and the oscillation object set to 1, and setting the default cluster number of the nodes in the stable object set to 0;
step 2.4, randomly selecting from the stable object set a core object v whose cluster number is 0, and searching breadth-first for the set Reach of objects that are density-reachable from v;
step 2.5, finding the core object set Core within the object set Reach, and finding the minimum cluster number Min_Cluster in the core object set Core;
step 2.6, if Min_Cluster is 0, setting the cluster numbers of the object set Reach and the core object v to Cluster_id; otherwise, finding the set Connect of objects that are density-connected to the object set Reach and the core object v, and setting the cluster numbers of Reach, Connect, and v to Min_Cluster, thereby obtaining one cluster;
step 2.7, determining whether the stable object set still contains a core object v; if so, returning to step 2.3, otherwise all clusters have been obtained and step 2.8 is performed;
step 2.8, distinguishing boundary points from noise points in the oscillation object set Oscillation, and assigning the boundary points to the corresponding clusters;
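For orientation, the sketch below shows textbook DBSCAN with a breadth-first expansion over a precomputed distance matrix. It is not the dynamic-neighbor variant described in step 2: the stable/oscillation partition, the neighbor scale evolution, and the Min_Cluster merging are all omitted. The parameters `eps` and `min_pts` are the usual DBSCAN parameters and are assumptions here; in this setting the matrix entries would be space-time distances between sub-track segments.

```python
from collections import deque

NOISE = -1

def dbscan(dist, eps, min_pts):
    """Plain DBSCAN over a symmetric distance matrix (list of lists).

    Returns one label per object: cluster ids start at 1, noise is -1.
    """
    n = len(dist)
    # Each neighborhood includes the object itself (dist[i][i] == 0).
    neighbors = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    labels = [0] * n  # 0 means "not yet assigned", as in step 2.3
    cluster_id = 0
    for v in range(n):
        if labels[v] != 0 or not is_core[v]:
            continue
        # Start a new cluster at an unassigned core object and expand it
        # breadth-first through density-reachable objects.
        cluster_id += 1
        labels[v] = cluster_id
        queue = deque([v])
        while queue:
            u = queue.popleft()
            if not is_core[u]:
                continue  # border objects join the cluster but do not expand it
            for w in neighbors[u]:
                if labels[w] == 0:
                    labels[w] = cluster_id
                    queue.append(w)
    return [label if label != 0 else NOISE for label in labels]

positions = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 20.0]
dist = [[abs(a - b) for b in positions] for a in positions]
print(dbscan(dist, eps=0.15, min_pts=2))  # [1, 1, 1, 2, 2, 2, -1]
```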
the specific steps of step 3 are as follows:
step 3.1, counting the number n of clusters obtained by clustering and the number m of all track segments;
step 3.2, letting p = m/n;
step 3.3, if the number of track segments contained in a cluster obtained by clustering is greater than p, marking the cluster as a significant cluster, otherwise marking it as a non-significant cluster;
step 3.4, selecting a significant cluster C from the clustering results, and taking the starting points of the track segments contained in the significant cluster C as a point set K;
step 3.5, randomly selecting a breakpoint b from the point set K, and adding the other breakpoints to b one at a time to form an expandable point set Q; if adding a breakpoint makes the radius of the minimum enclosing circle of the point set Q exceed a pre-specified threshold β, removing that breakpoint from the point set Q;
step 3.6, after traversing all points of the point set K, if the number of breakpoints contained in the point set Q is greater than a threshold α, marking the point set Q as a stay spot;
step 3.7, repeating steps 3.5 to 3.6 until all candidate stay spots in the significant cluster C have been generated;
step 3.8, repeating steps 3.4 to 3.7 until all significant clusters have been traversed;
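The selection and extraction logic of step 3 can be sketched as follows, under several simplifying assumptions: m is taken as the number of track segments inside the clusters, the seed breakpoint is chosen deterministically rather than at random, and the minimum enclosing circle radius is replaced by the largest distance from the centroid, which is an upper bound on that radius and therefore slightly stricter. The names `significant_clusters`, `enclosing_radius`, and `stay_spots` are hypothetical.

```python
import math

def significant_clusters(clusters):
    """Keep clusters holding more than p = m / n track segments."""
    n = len(clusters)
    m = sum(len(segments) for segments in clusters.values())
    p = m / n
    return {cid: segs for cid, segs in clusters.items() if len(segs) > p}

def enclosing_radius(points):
    """Largest distance from the centroid (stand-in for the exact radius)."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return max(math.hypot(x - cx, y - cy) for x, y in points)

def stay_spots(points, beta, alpha):
    """Greedily grow point sets of radius <= beta; keep those with > alpha points."""
    remaining = list(range(len(points)))
    spots = []
    while remaining:
        q = [remaining[0]]  # seed breakpoint (first remaining, not random)
        for idx in remaining[1:]:
            trial = [points[i] for i in q] + [points[idx]]
            if enclosing_radius(trial) <= beta:
                q.append(idx)  # the point fits, so it stays in Q
        if len(q) > alpha:
            spots.append([points[i] for i in q])
        chosen = set(q)
        remaining = [i for i in remaining if i not in chosen]
    return spots

clusters = {1: ["a", "b", "c", "d"], 2: ["e"], 3: ["f"]}
print(sorted(significant_clusters(clusters)))  # [1]

starts = [(0, 0), (1, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
print(stay_spots(starts, beta=1.0, alpha=1))
# [[(0, 0), (1, 0), (0, 1)], [(10, 10), (10, 11)]]
```

An exact minimum enclosing circle (for example via Welzl's algorithm) could be substituted for `enclosing_radius` without changing the surrounding loop.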
the specific steps of step 4 are as follows:
step 4.1, calculating the stay heat information corresponding to each stay spot, calculated as follows:
wherein $h_{spot}$ is the heat of the stay spot, $n_{subtra}$ is the number of sub-track segments contained in the stay spot, $n_{tra}$ is the number of tracks contained in the stay spot, and β is a coefficient;
step 4.2, obtaining the high-heat area of the stay spots from the stay heat information;
step 4.3, obtaining the hot spot area in the road network according to the area where the high-heat stay spots are located.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710735328.1A CN107301254B (en) | 2017-08-24 | 2017-08-24 | Road network hot spot area mining method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710735328.1A CN107301254B (en) | 2017-08-24 | 2017-08-24 | Road network hot spot area mining method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107301254A CN107301254A (en) | 2017-10-27 |
CN107301254B CN107301254B (en) | 2020-07-10 |
Family
ID=60132049
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710735328.1A Active CN107301254B (en) | 2017-08-24 | 2017-08-24 | Road network hot spot area mining method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107301254B (en) |
Families Citing this family (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108133172B (en) * | 2017-11-16 | 2022-04-05 | 北京华道兴科技有限公司 | Method for classifying moving objects in video and method and device for analyzing traffic flow |
CN108427965B (en) * | 2018-03-05 | 2022-08-23 | 重庆邮电大学 | Hot spot area mining method based on road network clustering |
CN108804563B (en) * | 2018-05-22 | 2021-11-19 | 创新先进技术有限公司 | Data labeling method, device and equipment |
CN109033011B (en) * | 2018-06-19 | 2022-06-21 | 东软集团股份有限公司 | Method and device for calculating track frequency, storage medium and electronic equipment |
CN109767615B (en) * | 2018-10-19 | 2021-05-18 | 江苏智通交通科技有限公司 | Method for analyzing key flow direction and key path of road network traffic flow |
CN109686085B (en) * | 2018-12-17 | 2020-05-05 | 北京交通大学 | GPS data based dangerous cargo transport vehicle stop node activity type identification method |
CN109711451A (en) * | 2018-12-20 | 2019-05-03 | 成都四方伟业软件股份有限公司 | A kind of data processing method, device, electronic equipment and storage medium |
CN109615857B (en) * | 2018-12-20 | 2021-02-05 | 首都师范大学 | Deployment and scheduling method and device for roadside units in urban vehicle-mounted network |
CN111380541B (en) * | 2018-12-29 | 2022-09-13 | 沈阳美行科技股份有限公司 | Interest point determination method and device, computer equipment and storage medium |
CN111488417B (en) * | 2019-01-28 | 2023-10-24 | 阿里巴巴集团控股有限公司 | Information processing method, system, device, equipment and computer storage medium |
CN109977109B (en) * | 2019-04-03 | 2021-04-27 | 深圳市甲易科技有限公司 | Track data accompanying analysis method |
CN110174115B (en) * | 2019-06-05 | 2021-03-16 | 武汉中海庭数据技术有限公司 | Method and device for automatically generating high-precision positioning map based on perception data |
CN110457315A (en) * | 2019-07-19 | 2019-11-15 | 国家计算机网络与信息安全管理中心 | A kind of group's accumulation mode analysis method and system based on user trajectory data |
CN110909037B (en) * | 2019-10-09 | 2024-02-13 | 中国人民解放军战略支援部队信息工程大学 | Frequent track mode mining method and device |
CN110827540B (en) * | 2019-11-04 | 2021-03-12 | 黄传明 | Motor vehicle movement mode recognition method and system based on multi-mode data fusion |
CN111275963A (en) * | 2020-01-14 | 2020-06-12 | 北京百度网讯科技有限公司 | Method and device for mining hot spot area, electronic equipment and storage medium |
CN111881370A (en) * | 2020-05-21 | 2020-11-03 | 北京嘀嘀无限科技发展有限公司 | Method and system for describing contour of interest area |
CN111372188B (en) * | 2020-05-27 | 2020-09-01 | 腾讯科技(深圳)有限公司 | Method and device for determining hot spot track in area, storage medium and electronic device |
CN111667392B (en) * | 2020-06-12 | 2023-06-16 | 成都国铁电气设备有限公司 | Railway contact net defect hot spot area early warning method based on space-time clustering |
CN111797295B (en) * | 2020-06-19 | 2021-04-02 | 云从科技集团股份有限公司 | Multi-dimensional space-time trajectory fusion method and device, machine readable medium and equipment |
CN111897805B (en) * | 2020-06-24 | 2022-11-11 | 东南大学 | Hot spot path mining method based on longest common sub-track density clustering |
CN113129328B (en) * | 2021-04-22 | 2022-05-17 | 中国电子科技集团公司第二十九研究所 | Target hotspot area fine analysis method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103559213A (en) * | 2013-10-10 | 2014-02-05 | 河南大学 | Efficient spatial nearest neighbor query method for highway networks |
CN103578265A (en) * | 2012-07-18 | 2014-02-12 | 北京掌城科技有限公司 | Method for acquiring taxi-hailing hot spot based on taxi GPS data |
CN106383868A (en) * | 2016-09-05 | 2017-02-08 | 电子科技大学 | Road network-based spatio-temporal trajectory clustering method |
WO2017107800A1 (en) * | 2015-12-24 | 2017-06-29 | 阿里巴巴集团控股有限公司 | Method of acquiring route hotspot of traffic road and device |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103578265A (en) * | 2012-07-18 | 2014-02-12 | 北京掌城科技有限公司 | Method for acquiring taxi-hailing hot spot based on taxi GPS data |
CN103559213A (en) * | 2013-10-10 | 2014-02-05 | 河南大学 | Efficient spatial nearest neighbor query method for highway networks |
WO2017107800A1 (en) * | 2015-12-24 | 2017-06-29 | 阿里巴巴集团控股有限公司 | Method of acquiring route hotspot of traffic road and device |
CN106383868A (en) * | 2016-09-05 | 2017-02-08 | 电子科技大学 | Road network-based spatio-temporal trajectory clustering method |
Non-Patent Citations (2)
Title |
---|
A Novel Passenger Hotspots Searching Algorithm for Taxis in Urban Area; Yuhan Dong, et al.; IEEE Computer Society; 2017-06-28; pp. 175-180 * |
Discovering Hotspots: A Placement Strategy for Wi-Fi based Trajectory Monitoring within Buildings; Lorenz Schauer; IEEE Conference; 2015-11-11; pp. 371-380 * |
Also Published As
Publication number | Publication date |
---|---|
CN107301254A (en) | 2017-10-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107301254B (en) | | Road network hot spot area mining method |
CN107103754B (en) | | Road traffic condition prediction method and system |
US10323948B2 (en) | | GPS data repair |
CN111475596B (en) | | Sub-segment similarity matching method based on multi-level track coding tree |
CN104462190B (en) | | A kind of online position predicting method excavated based on magnanimity space tracking |
CN109241126B (en) | | Spatio-temporal trajectory aggregation mode mining algorithm based on R-tree index |
CN111046968B (en) | | Road network track clustering analysis method based on improved DPC algorithm |
CN113723715B (en) | | Method, system, equipment and storage medium for automatically matching public transport network with road network |
CN117093832B (en) | | Data interpolation method and system for air quality data loss |
CN111024098A (en) | | Motor vehicle path fitting algorithm based on low-sampling data |
Kriegel et al. | | Proximity queries in large traffic networks |
Makris et al. | | A comparison of trajectory compression algorithms over AIS data |
CN111027574A (en) | | Building mode identification method based on graph convolution |
US20160019248A1 (en) | | Methods for processing within-distance queries |
CN115292962B (en) | | Path similarity matching method and device based on track rarefaction and storage medium |
US11988522B2 (en) | | Method, data processing apparatus and computer program product for generating map data |
CN116452826A (en) | | Coal gangue contour estimation method based on machine vision under shielding condition |
CN117516558A (en) | | Road network generation method, device, computer equipment and computer readable storage medium |
WO2022127573A1 (en) | | User trajectory positioning method, electronic device and computer storage medium |
Wang et al. | | Accurate Detection of Road Network Anomaly by Understanding Crowd's Driving Strategies from Human Mobility |
CN114664104A (en) | | Road network matching method and device |
CN109446264B (en) | | Urban mobile data analysis method based on flow visualization |
Tian et al. | | Road crack detection algorithm based on YOLOv3 |
CN111723792A (en) | | Real-time positioning point identification method suitable for rigid-flexible contact network |
CN115659162B (en) | | Method, system and equipment for extracting intra-pulse characteristics of radar radiation source signals |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||