CN107301254B - Road network hot spot area mining method - Google Patents


Info

Publication number
CN107301254B
Authority
CN
China
Prior art keywords
track
space
cluster
sub
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710735328.1A
Other languages
Chinese (zh)
Other versions
CN107301254A (en)
Inventor
田玲
罗光春
殷光强
陈爱国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201710735328.1A priority Critical patent/CN107301254B/en
Publication of CN107301254A publication Critical patent/CN107301254A/en
Application granted granted Critical
Publication of CN107301254B publication Critical patent/CN107301254B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462Approximate or statistical queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/285Clustering or classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9537Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Fuzzy Systems (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a road network hot spot region mining method, which belongs to the technical field of data mining and addresses the shortcomings of prior-art track clustering based on track space-time similarity measurement and clustering calculation. The method comprises the following steps: step 1, carrying out track segmentation on all track segments, and calculating the space-time similarity and space-time distance between two segmented sub-track segments; step 2, performing clustering calculation on all track segment data in the grid space according to the space-time similarity and space-time distance of the sub-tracks and a dynamic neighbor-based DBSCAN algorithm; step 3, selecting a significant cluster set from the clusters obtained by clustering, and extracting stay spots from the significant cluster set; and step 4, obtaining the high-heat stay spots according to the number of track segments carried by the stay spots, and obtaining the hot spot area in the road network from the area where the high-heat stay spots are located. The invention is used for spatial position positioning.

Description

Road network hot spot area mining method
Technical Field
A road network hot spot region mining method is used for positioning spatial positions and belongs to the technical field of data mining.
Background
In recent years, with the rapid development and popularization of spatial positioning technologies, the location of almost any moving object can easily be tracked, producing huge databases in which movement is recorded as trajectories. These massive trajectory data contain a large amount of deep information reflecting the motion behavior of moving objects. Space-time trajectory data, as one kind of space-time data, mainly record how the spatial position of a moving object changes over time. Vehicle space-time trajectory data are a special case because they are constrained to a road network, so many common data mining methods cannot be applied to them directly and need to be adapted.
Since research on hot spot areas in a road network has important practical application value, research on hot spot path areas must be performed on track data that is valid within a road network. Cluster analysis of trajectory data is a common method for finding popular paths in a road network. Trajectory clustering mainly comprises two parts: track space-time similarity measurement and clustering calculation. The most common approach to measuring track space-time similarity divides the track based on a grid space: the method first divides the grid space and cuts the track data, and then adds the spatial similarity and the time similarity of the divided sub-tracks to obtain the space-time similarity of the track. This method can accurately calculate the space-time similarity between tracks, but it computes the spatial similarity and the time similarity separately for every pair of tracks, so the response time of the algorithm is long when the volume of track data is large. For the clustering calculation, because track clusters are often strip-shaped rather than spherical, the most typical density clustering algorithm, DBSCAN, is often adopted, since it can find clusters of arbitrary shape. However, DBSCAN requires two parameter values to be input manually, the neighborhood radius and the neighborhood density threshold; the choice of these two values directly affects the clustering result, and the DBSCAN algorithm provides no method for determining them.
Disclosure of Invention
The invention aims to solve the following problems of the prior art, in which tracks are clustered by track space-time similarity measurement and clustering calculation: the space-time similarity measurement has a long response time when the volume of track data is large; Euclidean coordinates cannot accurately express the distance between two tracks in a road network; and when the density clustering algorithm DBSCAN is used for the clustering calculation, the neighborhood radius and the neighborhood density threshold must be input manually, and inaccurate values directly affect the clustering result. To this end, the invention provides a road network hot spot area mining method.
The technical scheme adopted by the invention is as follows:
a road network hot spot region mining method is characterized by comprising the following steps:
step 1, carrying out track segmentation on all track segments, and calculating the space-time similarity and space-time distance between two segmented sub-track segments;
step 2, performing clustering calculation on all track segment data in the grid space according to the space-time similarity and space-time distance of the sub-tracks and a dynamic neighbor-based DBSCAN algorithm;
step 3, selecting a significant cluster set from the cluster calculated by clustering, and extracting staying spots from the significant cluster set;
step 4, obtaining a high-heat-degree area of the stay spots according to the number of the track sections carried by the stay spots, and obtaining a hot spot area in the road network in the area where the high-heat-degree stay spots are located.
Further, the specific steps of step 1 are as follows:
step 1.1, dividing a dynamic grid space for a space area where all track sections are located;
step 1.2, carrying out track segmentation on a track sequence in a grid space according to a breakpoint;
step 1.3, calculating the space-time similarity and space-time distance between the two sub-track sections after the track segmentation.
Further, the specific steps of step 1.1 are as follows:
step 1.11, solving the minimum circumscribed rectangle of the space region where all the track segments are located according to the minimum convex hull principle;
step 1.12, solving the length of each track segment and the number of sampling points contained in the track segment, and calculating the average distance of the vehicle on the track segment moving in the time of two adjacent sampling points;
step 1.13, taking the average distance as the size of a grid space, and performing dynamic grid space division on the minimum circumscribed rectangle.
Further, the specific steps of step 1.2 are as follows:
step 1.21, sequentially reading data of each sampling point on each track segment;
step 1.22: comparing longitude and latitude data of positions of sampling points of two adjacent track sections;
step 1.23: if the longitude and the latitude between two adjacent sampling points are unchanged, the middle position of the two sampling points is a breakpoint;
step 1.24: carrying out track segmentation on the original track segment according to the calculated positions of the breakpoints.
Further, the specific steps of step 1.3 are as follows:
step 1.31, calculating the spatial similarity between the two sub-track segments, if the spatial similarity is not zero, calculating the time similarity between the two sub-track segments, otherwise, turning to step 1.33, wherein the formula for calculating the spatial similarity and the time similarity is as follows:
$$\mathrm{Sim}(TR_i, TR_j) = \frac{L_c(TR_i, TR_j)}{L(TR_i) + L(TR_j) - L_c(TR_i, TR_j)}$$
in the formula, $L_c(TR_i, TR_j)$ represents the cumulative spatial or temporal length common to the sub-track segments of the two tracks, $L(TR_i)$ represents the total length of sub-track $TR_i$, $L(TR_j)$ represents the total length of sub-track $TR_j$, $L(TR_i) + L(TR_j) - L_c(TR_i, TR_j)$ represents the total length in space or time, i.e. the span, of the two sub-track segments, and $\mathrm{Sim}(TR_i, TR_j)$ represents the spatial or temporal similarity between the two sub-track segments;
step 1.32, if the time similarity is not zero, calculating the space-time similarity between the two sub-track segments, otherwise, turning to step 1.33, and calculating the space-time similarity according to the formula:
$$\mathrm{STSim}(TR_i, TR_j) = \mathrm{SSim}(TR_i, TR_j) \times \mathrm{TSim}(TR_i, TR_j);$$
in the formula, $\mathrm{SSim}(TR_i, TR_j)$ is the spatial similarity between the two sub-track segments, $\mathrm{TSim}(TR_i, TR_j)$ is the time similarity between the two sub-track segments, and $\mathrm{STSim}(TR_i, TR_j)$ is the resulting space-time similarity of the two sub-track segments;
step 1.33, calculating the space-time distance between the two sub-tracks, wherein the calculation method comprises the following steps:
$$\mathrm{STDist}(TR_i, TR_j) = 1 - \mathrm{STSim}(TR_i, TR_j);$$
in the formula, $\mathrm{STSim}(TR_i, TR_j)$ is the space-time similarity between the two sub-track segments, and $\mathrm{STDist}(TR_i, TR_j)$ is the space-time distance between the two sub-track segments.
Further, the specific steps of step 2 are as follows:
step 2.1, calculating the neighbor scale change of the sampling points on each track segment according to the space-time similarity, the space-time distance, the neighbor scale evolution algorithm and the DBSCAN algorithm of the sub-tracks;
step 2.2, calculating the distance between each sampling point on the track segment and the other sampling points, denoting the maximum of these distances as max and the minimum as min; if max > 2 × min, assigning the sampling point to the oscillation object set Oscillation, otherwise assigning it to the stable object set;
step 2.3, initializing the cluster number Cluster_id for the stable object set and the oscillation object set to 1, and setting the default cluster number of the nodes in the stable object set to 0;
step 2.4, randomly selecting a core object v with cluster number 0 from the stable object set, and searching breadth-first for the set Reach of objects that are density-reachable from v;
step 2.5, finding the core object set Core within the object set Reach, and finding the minimum cluster number Min_Cluster in the core object set Core;
step 2.6, if Min_Cluster is 0, marking the cluster numbers of the object set Reach and the core object v as Cluster_id; otherwise, finding the object set Connect that is density-connected with the object set Reach and the core object v, and marking the cluster numbers of Reach, Connect and v as Min_Cluster, thereby obtaining a cluster;
step 2.7, judging whether a core object v still exists in the stable object set; if so, returning to step 2.3; otherwise, all clusters have been obtained and step 2.8 is performed;
step 2.8, distinguishing boundary points from noise points in the oscillation object set Oscillation, and assigning the boundary points to the corresponding clusters.
Further, the specific steps of step 3 are as follows:
step 3.1, counting the number n of clusters obtained by clustering and the number m of all track segments;
step 3.2, letting p = m/n;
step 3.3, if the number of track segments contained in a cluster obtained by clustering is greater than p, marking the cluster as a significant cluster, otherwise marking it as a non-significant cluster;
step 3.4: selecting a significant cluster C from the clustering results, and setting the starting point of the track segment contained in the significant cluster C as a point set K;
step 3.5, randomly selecting a breakpoint b from the point set K, combining other breakpoints and the breakpoint b in sequence to form an expandable point set Q, and if the added breakpoint b causes that the minimum circumscribed circle radius of the point set Q is larger than a pre-specified threshold β, deleting the breakpoint b from the point set Q;
step 3.6, traversing all points of the point set K, and if the number of breakpoints contained in the point set Q is greater than a threshold α, marking the point set Q as a stay spot;
step 3.7: repeating the step 3.5 to the step 3.6 until all candidate stay spots in the significant cluster C are generated;
step 3.8: repeating steps 3.4 to 3.7 until all significant clusters have been traversed.
Further, the specific steps of step 4 are as follows:
step 4.1, calculating the stay heat degree information corresponding to each stay spot, wherein the calculation method comprises the following steps:
Figure BDA0001387991330000041
wherein $h_{spot}$ is the heat of the stay spot, $n_{subtra}$ is the number of track segments contained in the stay spot, $n_{tra}$ is the number of tracks contained in the stay spot, and β is a coefficient;
step 4.2, obtaining the high-heat stay spots from the stay heat information;
step 4.3, obtaining the hot spot area in the road network according to the area where the high-heat stay spots are located.
In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:
1. the road hot spot area mining method provided by the invention combines the track space-time similarity measurement in the grid space with the optimized DBSCAN track clustering method, and thereby overcomes two defects: that traditional Euclidean coordinates cannot accurately express the distance between two tracks in a road network, and that traditional DBSCAN clustering requires the relevant parameters to be input manually in advance;
2. representing the track sequence by grid space coordinates overcomes the problem that the space-time similarity between tracks cannot be calculated accurately when track sampling points deviate because of the network environment, the sampling equipment and the like;
3. the method for mining the road hot spot area based on the vehicle track has the best effect on the track data with high sampling frequency, can save the storage space overhead of the track data, and can improve the execution efficiency of the whole system;
4. the method for obtaining the track space-time similarity by multiplying the track time similarity and the spatial similarity can greatly improve the calculation efficiency of calculating the track space-time similarity and has quicker response time.
Drawings
FIG. 1 is a sub-flow diagram of the computation of trajectory spatiotemporal similarity and spatiotemporal distance in the present invention;
FIG. 2 is a flow chart of a DBSCAN algorithm based on dynamic neighbor optimization in the present invention;
FIG. 3 is a sub-flowchart of the hot spot area mining based on clustering results in the present invention;
FIG. 4 shows the distribution of the significant clusters in step 5 of the present invention, wherein the road segments covered by the black areas are the significant cluster aggregation areas;
FIG. 5 shows the distribution of the hot spots in step 7 of the present invention, wherein the black spots are high-heat spots and the gray spots are normal-heat spots.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention provides a road hot spot area mining method. The mining effect of the road hot spot area can be accurately and effectively improved by carrying out dynamic neighbor optimization DBSCAN clustering on the vehicle track and calculating the stay spot heat. The dynamic neighbor-based DBSCAN clustering algorithm overcomes the defect that the clustering result is greatly influenced by manually input parameter values. And the distribution condition of the hot spot area can be more accurately described by calculating the heat information of the staying spots.
A road network hot spot area mining method comprises the following steps:
step 1, carrying out track segmentation on all track segments, and calculating the space-time similarity and space-time distance between two segmented sub-track segments; the method comprises the following specific steps:
step 1.1, dividing a dynamic grid space for a space area where all track sections are located; the method comprises the following specific steps:
step 1.11, solving the minimum circumscribed rectangle of the space region where all the track segments are located according to the minimum convex hull principle;
step 1.12, solving the length of each track segment and the number of sampling points contained in the track segment, and calculating the average distance of the vehicle on the track segment moving in the time of two adjacent sampling points;
step 1.13, taking the average distance as the size of a grid space, and performing dynamic grid space division on the minimum circumscribed rectangle.
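Steps 1.11 to 1.13 can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the sampling points are assumed to be planar (x, y) coordinates in a common length unit, the minimum circumscribed rectangle is approximated by an axis-aligned bounding box, and the average distance is pooled over all track segments. The names build_grid and cell_of are hypothetical.

```python
import math

def build_grid(tracks):
    """tracks: list of track segments, each a list of planar (x, y) points."""
    points = [p for track in tracks for p in track]
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Step 1.11 (assumption: axis-aligned bounding box of all points).
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)

    # Step 1.12: average distance moved between two adjacent sampling points.
    total_len, total_gaps = 0.0, 0
    for track in tracks:
        for (x0, y0), (x1, y1) in zip(track, track[1:]):
            total_len += math.hypot(x1 - x0, y1 - y0)
            total_gaps += 1
    if total_gaps == 0 or total_len == 0:
        raise ValueError("need at least one pair of samples with movement")
    cell = total_len / total_gaps

    # Step 1.13: use the average distance as the grid cell size.
    n_cols = max(1, math.ceil((max_x - min_x) / cell))
    n_rows = max(1, math.ceil((max_y - min_y) / cell))
    return {"origin": (min_x, min_y), "cell": cell,
            "n_cols": n_cols, "n_rows": n_rows}

def cell_of(grid, point):
    """Map a point to its (column, row) cell, clamping the far edges."""
    ox, oy = grid["origin"]
    col = min(int((point[0] - ox) / grid["cell"]), grid["n_cols"] - 1)
    row = min(int((point[1] - oy) / grid["cell"]), grid["n_rows"] - 1)
    return col, row
```

Because the cell size follows the sampling spacing, consecutive samples of a moving vehicle tend to fall into the same or neighbouring cells, which is what allows a track to be represented as a sequence of grid cells.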
Step 1.2, carrying out track segmentation on a track sequence in a grid space according to a breakpoint; the method comprises the following specific steps:
step 1.21, sequentially reading data of each sampling point on each track segment;
step 1.22: comparing longitude and latitude data of positions of sampling points of two adjacent track sections;
step 1.23: if the longitude and the latitude between two adjacent sampling points are unchanged, the middle position of the two sampling points is a breakpoint;
step 1.24: carrying out track segmentation on the original track segment according to the calculated positions of the breakpoints.
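The segmentation of steps 1.21 to 1.24 can be illustrated with a short sketch. It assumes that each sampling point is a (longitude, latitude, timestamp) tuple and that "unchanged" means exact equality of the coordinates; real data would probably need a tolerance. The function name split_at_breakpoints is hypothetical.

```python
def split_at_breakpoints(track):
    """track: list of (lon, lat, t) samples ordered by time.

    Returns the sub-track segments obtained by cutting the track wherever
    two adjacent samples share the same longitude and latitude.
    """
    segments, current = [], []
    for sample in track:
        if current and sample[:2] == current[-1][:2]:
            # Position unchanged: the vehicle is stationary, so a breakpoint
            # lies here. Close the current sub-track and start a new one.
            if len(current) > 1:
                segments.append(current)
            current = [sample]
        else:
            current.append(sample)
    if len(current) > 1:
        segments.append(current)
    return segments
```

Since the two adjacent samples have the same position, their midpoint coincides with that position, so the sketch simply cuts at the repeated sample. Runs of several stationary samples collapse into a single breakpoint.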
Step 1.3, calculating the space-time similarity and space-time distance between the two sub-track sections after the track segmentation; the method comprises the following specific steps:
step 1.31, calculating the spatial similarity between the two sub-track segments, if the spatial similarity is not zero, calculating the time similarity between the two sub-track segments, otherwise, turning to step 1.33, wherein the formula for calculating the spatial similarity and the time similarity is as follows:
$$\mathrm{Sim}(TR_i, TR_j) = \frac{L_c(TR_i, TR_j)}{L(TR_i) + L(TR_j) - L_c(TR_i, TR_j)}$$
in the formula, $L_c(TR_i, TR_j)$ represents the cumulative spatial or temporal length common to the sub-track segments of the two tracks, $L(TR_i)$ represents the total length of sub-track $TR_i$, $L(TR_j)$ represents the total length of sub-track $TR_j$, $L(TR_i) + L(TR_j) - L_c(TR_i, TR_j)$ represents the total length in space or time, i.e. the span, of the two sub-track segments, and $\mathrm{Sim}(TR_i, TR_j)$ represents the spatial or temporal similarity between the two sub-track segments;
step 1.32, if the time similarity is not zero, calculating the space-time similarity between the two sub-track segments, otherwise, turning to step 1.33, and calculating the space-time similarity according to the formula:
$$\mathrm{STSim}(TR_i, TR_j) = \mathrm{SSim}(TR_i, TR_j) \times \mathrm{TSim}(TR_i, TR_j);$$
in the formula, $\mathrm{SSim}(TR_i, TR_j)$ is the spatial similarity between the two sub-track segments, $\mathrm{TSim}(TR_i, TR_j)$ is the time similarity between the two sub-track segments, and $\mathrm{STSim}(TR_i, TR_j)$ is the resulting space-time similarity of the two sub-track segments;
step 1.33, calculating the space-time distance between the two sub-tracks, wherein the calculation method comprises the following steps:
$$\mathrm{STDist}(TR_i, TR_j) = 1 - \mathrm{STSim}(TR_i, TR_j);$$
in the formula, $\mathrm{STSim}(TR_i, TR_j)$ is the space-time similarity between the two sub-track segments, and $\mathrm{STDist}(TR_i, TR_j)$ is the space-time distance between the two sub-track segments.
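Steps 1.31 to 1.33 can be sketched as follows. The sketch assumes that the similarity is the ratio of the shared length to the span, as the symbol definitions suggest; that spatial length is measured as the number of distinct grid cells a sub-track passes through; and that temporal length is the duration of the sub-track's time interval. These representations and all function names are assumptions made for illustration.

```python
def overlap_similarity(common, len_i, len_j):
    """Shared length divided by the span L_i + L_j - L_c."""
    span = len_i + len_j - common
    return common / span if span > 0 else 0.0

def spatial_similarity(cells_i, cells_j):
    """cells_*: iterables of grid cells visited by each sub-track."""
    set_i, set_j = set(cells_i), set(cells_j)
    return overlap_similarity(len(set_i & set_j), len(set_i), len(set_j))

def temporal_similarity(interval_i, interval_j):
    """interval_*: (start, end) times of each sub-track."""
    (s_i, e_i), (s_j, e_j) = interval_i, interval_j
    common = max(0.0, min(e_i, e_j) - max(s_i, s_j))
    return overlap_similarity(common, e_i - s_i, e_j - s_j)

def st_distance(cells_i, interval_i, cells_j, interval_j):
    ssim = spatial_similarity(cells_i, cells_j)
    if ssim == 0.0:
        # Step 1.31: no spatial overlap, so the time similarity is skipped
        # and the space-time similarity is taken as zero.
        return 1.0
    tsim = temporal_similarity(interval_i, interval_j)
    # Steps 1.32 and 1.33: STSim = SSim * TSim, STDist = 1 - STSim.
    return 1.0 - ssim * tsim
```

The early return is where the saving in computation comes from: pairs of sub-tracks that share no grid cell never reach the time comparison.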
Step 2, performing clustering calculation on all track segment data in the grid space according to the space-time similarity and space-time distance of the sub-tracks and a dynamic neighbor-based DBSCAN algorithm; the method comprises the following specific steps:
step 2.1, calculating the neighbor scale change of the sampling points on each track segment according to the space-time similarity, the space-time distance, the neighbor scale evolution algorithm and the DBSCAN algorithm of the sub-tracks;
step 2.2, calculating the distance between each sampling point on the track segment and the other sampling points, denoting the maximum of these distances as max and the minimum as min; if max > 2 × min, assigning the sampling point to the oscillation object set Oscillation, otherwise assigning it to the stable object set;
step 2.3, initializing the cluster number Cluster_id for the stable object set and the oscillation object set to 1, and setting the default cluster number of the nodes in the stable object set to 0;
step 2.4, randomly selecting a core object v with cluster number 0 from the stable object set, and searching breadth-first for the set Reach of objects that are density-reachable from v, wherein objects are density-reachable if a reachable path exists between them and unreachable otherwise;
step 2.5, finding the core object set Core within the object set Reach, and finding the minimum cluster number Min_Cluster in the core object set Core;
step 2.6, if Min_Cluster is 0, marking the cluster numbers of the object set Reach and the core object v as Cluster_id; otherwise, finding the object set Connect that is density-connected with the object set Reach and the core object v, and marking the cluster numbers of Reach, Connect and v as Min_Cluster, thereby obtaining a cluster;
step 2.7, judging whether a core object v still exists in the stable object set; if so, returning to step 2.3; otherwise, all clusters have been obtained and step 2.8 is performed;
step 2.8, distinguishing boundary points from noise points in the oscillation object set Oscillation, and assigning the boundary points to the corresponding clusters.
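The partition in step 2.2 can be illustrated with the following sketch, which reads the step literally: for each object, the largest and smallest distances to the other objects are compared, and the object is treated as oscillating when max > 2 × min. The input is assumed to be a precomputed symmetric matrix of pairwise distances (for example, space-time distances); the neighbor scale evolution of step 2.1 and the cluster expansion of steps 2.3 to 2.8 are not covered. The function name partition_objects is hypothetical.

```python
def partition_objects(dist):
    """dist: symmetric matrix (list of lists) of pairwise distances.

    Returns (stable, oscillation) as lists of object indices.
    """
    stable, oscillation = [], []
    for i, row in enumerate(dist):
        # Distances from object i to every other object.
        others = [d for j, d in enumerate(row) if j != i]
        if others and max(others) > 2 * min(others):
            oscillation.append(i)
        else:
            stable.append(i)
    return stable, oscillation
```

Under this reading, an object whose neighbours lie at very uneven distances is deferred to the oscillation set and is only resolved as a boundary point or noise point after the stable objects have been clustered.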
Step 3, selecting a significant cluster set from the cluster calculated by clustering, and extracting staying spots from the significant cluster set; the method comprises the following specific steps:
step 3.1, counting the number n of clusters obtained by clustering and the number m of all track segments;
step 3.2, letting p = m/n;
step 3.3, if the number of track segments contained in a cluster obtained by clustering is greater than p, marking the cluster as a significant cluster, otherwise marking it as a non-significant cluster;
step 3.4: selecting a significant cluster C from the clustering results, and setting the starting point of the track segment contained in the significant cluster C as a point set K;
step 3.5, randomly selecting a breakpoint b from the point set K, combining other breakpoints and the breakpoint b in sequence to form an expandable point set Q, and if the added breakpoint b causes that the minimum circumscribed circle radius of the point set Q is larger than a pre-specified threshold β, deleting the breakpoint b from the point set Q;
step 3.6, traversing all points of the point set K, and if the number of breakpoints contained in the point set Q is greater than a threshold α, marking the point set Q as a stay spot;
step 3.7: repeating the step 3.5 to the step 3.6 until all candidate stay spots in the significant cluster C are generated;
step 3.8: repeating steps 3.4 to 3.7 until all significant clusters have been traversed.
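The selection of significant clusters in steps 3.1 to 3.3 can be sketched as follows. The sketch assumes that the clustering result is a list with one cluster label per track segment and that noise segments carry a reserved label, which is a convention introduced here and not stated in the source. The stay spot extraction of steps 3.4 to 3.8 is not covered. The function name significant_clusters is hypothetical.

```python
from collections import Counter

def significant_clusters(labels, noise_label=-1):
    """labels: cluster label of each track segment (noise_label = noise).

    Returns the set of cluster labels whose size exceeds p = m / n.
    """
    counts = Counter(label for label in labels if label != noise_label)
    n = len(counts)          # step 3.1: number of clusters
    if n == 0:
        return set()
    m = len(labels)          # step 3.1: number of all track segments
    p = m / n                # step 3.2
    # Step 3.3: a cluster is significant if it holds more than p segments.
    return {label for label, size in counts.items() if size > p}
```

For example, with eight track segments labelled [1, 1, 1, 1, 1, 2, 2, 3], there are n = 3 clusters and p = 8/3, so only cluster 1 (five segments) is significant.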
Step 4, obtaining a high-heat-degree area of the stay spots according to the number of the track sections carried by the stay spots, and obtaining a hot spot area in the road network in the area where the high-heat-degree stay spots are located; the method comprises the following specific steps:
step 4.1, calculating the stay heat degree information corresponding to each stay spot, wherein the calculation method comprises the following steps:
Figure BDA0001387991330000081
wherein $h_{spot}$ is the heat of the stay spot, $n_{subtra}$ is the number of track segments contained in the stay spot, $n_{tra}$ is the number of tracks contained in the stay spot, and β is a coefficient set during testing
Figure BDA0001387991330000082
Here, any two track segments are counted as different track segments whether or not they are identical, but two identical track segments belong to the same track. That is, the number of track segments is greater than the number of tracks if identical track segments exist, and equal to it if all track segments are distinct.
step 4.2, obtaining the high-heat stay spots from the stay heat information;
step 4.3, obtaining the hot spot area in the road network according to the area where the high-heat stay spots are located.
Compared with the prior art, the road hot spot area mining method provided by the invention combines the track space-time similarity measurement in the grid space with the optimized DBSCAN track clustering method, and thereby overcomes two defects: that traditional Euclidean coordinates cannot accurately express the distance between two tracks in a road network, and that traditional DBSCAN clustering requires the relevant parameters to be input manually in advance. Meanwhile, representing the track sequence by grid space coordinates overcomes the problem that the space-time similarity between tracks cannot be calculated accurately when track sampling points deviate because of the network environment, the sampling equipment and the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (1)

1. A road network hot spot region mining method is characterized by comprising the following steps:
step 1, carrying out track segmentation on all track segments, and calculating the space-time similarity and space-time distance between two segmented sub-track segments;
step 2, performing clustering calculation on all track segment data in the grid space according to the space-time similarity and space-time distance of the sub-tracks and a dynamic neighbor-based DBSCAN algorithm;
step 3, selecting a significant cluster set from the cluster calculated by clustering, and extracting staying spots from the significant cluster set;
step 4, obtaining a high-heat-degree area of the stay spots according to the number of the track sections carried by the stay spots, and obtaining a hot spot area in the road network in the area where the high-heat-degree stay spots are located;
the specific steps of the step 1 are as follows:
step 1.1, dividing a dynamic grid space for a space area where all track sections are located;
step 1.2, carrying out track segmentation on a track sequence in a grid space according to a breakpoint;
step 1.3, calculating the space-time similarity and space-time distance between two sub-track sections after track segmentation;
the specific steps of step 1.1 are as follows:
step 1.11, solving the minimum circumscribed rectangle of the space region where all the track segments are located according to the minimum convex hull principle;
step 1.12, solving the length of each track segment and the number of sampling points contained in the track segment, and calculating the average distance of the vehicle on the track segment moving in the time of two adjacent sampling points;
step 1.13, taking the average distance as the grid cell size, and performing dynamic grid space division on the minimum circumscribed rectangle;
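By way of illustration only, steps 1.11 to 1.13 might be sketched in Python as follows. The sketch assumes planar coordinates in metres, substitutes a simple axis-aligned bounding rectangle for the minimum circumscribed rectangle, and averages the step distance over all tracks. The function names average_step, bounding_rectangle and to_cell are hypothetical and do not appear in the original disclosure.

```python
import math

def average_step(tracks):
    """Mean distance travelled between two consecutive sampling points."""
    total_len = 0.0
    total_steps = 0
    for pts in tracks:
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            total_len += math.hypot(x1 - x0, y1 - y0)
            total_steps += 1
    if total_steps == 0:
        raise ValueError("need at least one track with two sampling points")
    return total_len / total_steps

def bounding_rectangle(tracks):
    """Axis-aligned rectangle enclosing every sampling point (a simplification)."""
    xs = [x for pts in tracks for x, _ in pts]
    ys = [y for pts in tracks for _, y in pts]
    return min(xs), min(ys), max(xs), max(ys)

def to_cell(point, origin, cell_size):
    """Map a planar point to integer (column, row) grid coordinates."""
    return (int((point[0] - origin[0]) // cell_size),
            int((point[1] - origin[1]) // cell_size))

# Toy data: two short tracks given as lists of (x, y) points.
tracks = [
    [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)],
    [(0.0, 5.0), (0.0, 15.0)],
]
cell = average_step(tracks)                      # grid cell size
min_x, min_y, max_x, max_y = bounding_rectangle(tracks)
grid_track = [to_cell(p, (min_x, min_y), cell) for p in tracks[0]]
```

Because the cell size follows the average sampling step, consecutive samples of a vehicle moving at typical speed tend to fall into adjacent cells, which is what lets small sampling deviations be absorbed by the grid.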
the specific steps of step 1.2 are as follows:
step 1.21, sequentially reading data of each sampling point on each track segment;
step 1.22, comparing the longitude and latitude of two adjacent sampling points on the track segment;
step 1.23, if the longitude and latitude do not change between two adjacent sampling points, taking the middle position of the two sampling points as a breakpoint;
step 1.24, segmenting the original track segment at the positions of all the calculated breakpoints;
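A minimal sketch of the breakpoint segmentation of steps 1.21 to 1.24 is given below for illustration. It assumes each sampling point is a (longitude, latitude, timestamp) tuple, compares positions by exact equality (real GPS data would probably need a tolerance), and discards sub-tracks with fewer than two points. The function name split_at_breakpoints and the parameter min_points are hypothetical.

```python
def split_at_breakpoints(track, min_points=2):
    """Split a track of (lon, lat, t) samples wherever the position does not change."""
    sub_tracks, breakpoints = [], []
    current = [track[0]] if track else []
    for prev, cur in zip(track, track[1:]):
        if (prev[0], prev[1]) == (cur[0], cur[1]):
            # Same position twice in a row: the vehicle stayed, so this is a breakpoint.
            breakpoints.append((cur[0], cur[1]))
            if len(current) >= min_points:
                sub_tracks.append(current)
            current = [cur]          # start a new sub-track at the breakpoint
        else:
            current.append(cur)
    if len(current) >= min_points:
        sub_tracks.append(current)
    return sub_tracks, breakpoints

track = [
    (104.00, 30.00, 0), (104.01, 30.00, 10), (104.02, 30.00, 20),
    (104.02, 30.00, 30),                      # vehicle did not move
    (104.03, 30.01, 40), (104.04, 30.02, 50),
]
subs, bps = split_at_breakpoints(track)
```

Since the two samples that define a breakpoint share the same position, their middle position is that same position, which is why the sketch records it directly.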
the specific steps of step 1.3 are as follows:
step 1.31, calculating the spatial similarity between the two sub-track segments, if the spatial similarity is not zero, calculating the time similarity between the two sub-track segments, otherwise, turning to step 1.33, wherein the formula for calculating the spatial similarity and the time similarity is as follows:
Sim(TRi,TRj)=Lc(TRi,TRj)/(L(TRi)+L(TRj)-Lc(TRi,TRj));
in the formula, Lc(TRi,TRj) represents the cumulative spatial or temporal length shared by the two sub-track segments, L(TRi) represents the total length of sub-track segment TRi, L(TRj) represents the total length of sub-track segment TRj, L(TRi)+L(TRj)-Lc(TRi,TRj) represents the total spatial or temporal length covered by the two sub-track segments, i.e. the span, and Sim(TRi,TRj) represents the spatial or temporal similarity between the two sub-track segments;
step 1.32, if the time similarity is not zero, calculating the space-time similarity between the two sub-track segments, otherwise, turning to step 1.33, wherein the formula for calculating the space-time similarity is as follows:
STSim(TRi,TRj)=SSim(TRi,TRj)×TSim(TRi,TRj);
in the formula, SSim(TRi,TRj) represents the spatial similarity between the two sub-track segments, TSim(TRi,TRj) represents the temporal similarity between the two sub-track segments, and STSim(TRi,TRj) represents the resulting space-time similarity measure of the two sub-track segments;
step 1.33, calculating the space-time distance between the two sub-track segments as follows:
STDist(TRi,TRj)=1-STSim(TRi,TRj);
in the formula, STSim(TRi,TRj) represents the space-time similarity measure between the two sub-track segments, and STDist(TRi,TRj) represents the space-time distance between the two sub-track segments;
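The similarity and distance calculation of steps 1.31 to 1.33 can be illustrated with the following sketch. It assumes that each similarity is the shared length divided by the span L(TRi)+L(TRj)-Lc(TRi,TRj), as the symbol definitions suggest, that spatial length is counted in distinct grid cells, that temporal length is the duration of a (start, end) interval, and that a zero similarity yields a space-time distance of 1. All function names are hypothetical.

```python
def ratio_similarity(common, len_i, len_j):
    """Shared length divided by the span len_i + len_j - common."""
    span = len_i + len_j - common
    return common / span if span > 0 else 0.0

def spatial_similarity(cells_i, cells_j):
    """Assumption: spatial length is the number of distinct grid cells visited."""
    set_i, set_j = set(cells_i), set(cells_j)
    return ratio_similarity(len(set_i & set_j), len(set_i), len(set_j))

def temporal_similarity(interval_i, interval_j):
    """Assumption: temporal length is the duration of a (start, end) interval."""
    (s_i, e_i), (s_j, e_j) = interval_i, interval_j
    overlap = max(0.0, min(e_i, e_j) - max(s_i, s_j))
    return ratio_similarity(overlap, e_i - s_i, e_j - s_j)

def spatiotemporal_distance(cells_i, t_i, cells_j, t_j):
    ssim = spatial_similarity(cells_i, cells_j)
    # Steps 1.31 and 1.32: the temporal part is only needed when the
    # spatial similarity is non-zero; otherwise the product is zero anyway.
    st_sim = ssim * temporal_similarity(t_i, t_j) if ssim > 0 else 0.0
    return 1.0 - st_sim          # step 1.33: STDist = 1 - STSim

a_cells = [(0, 0), (1, 0), (2, 0), (3, 0)]
b_cells = [(2, 0), (3, 0), (4, 0), (5, 0)]
dist = spatiotemporal_distance(a_cells, (0, 40), b_cells, (20, 60))
far = spatiotemporal_distance(a_cells, (0, 40), [(9, 9)], (0, 40))
```

In the example the two sub-tracks share 2 of 6 cells and 20 of 60 time units, so both similarities are 1/3 and the space-time distance is 1 - 1/9.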
the specific steps of the step 2 are as follows:
step 2.1, calculating the neighbor scale change of the sampling points on each track segment according to the space-time similarity and space-time distance of the sub-track segments, the neighbor scale evolution algorithm and the DBSCAN algorithm;
step 2.2, calculating the distance between each sampling point on the track segment and the other sampling points, denoting the largest of these distances as max and the smallest as min, and assigning the sampling point to the oscillation object set if max > 2min, otherwise to the stable object set;
step 2.3, initializing the cluster number Cluster_id for the stable object set and the oscillation object set to 1, and setting the default cluster number of the nodes in the stable object set to 0;
step 2.4, randomly selecting a core object v whose cluster number is 0 from the stable object set, and searching breadth-first for the set Reach of objects that are density-reachable from v;
step 2.5, finding the core object set Core within the object set Reach, and finding the minimum cluster number Min_Cluster in the core object set Core;
step 2.6, if Min_Cluster is 0, marking the cluster numbers of the object set Reach and the core object v as Cluster_id; otherwise, searching for the object set Connect that is density-connected with the object set Reach and the core object v, and marking the cluster numbers of the object set Reach, the density-connected object set Connect and the core object v as Min_Cluster, thereby obtaining one cluster;
step 2.7, judging whether a core object v still exists in the stable object set; if so, returning to step 2.3; otherwise, all the clusters have been obtained, and proceeding to step 2.8;
step 2.8, distinguishing boundary points from noise points in the oscillation object set, and assigning the boundary points to the different clusters;
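The division into a stable object set and an oscillation object set in step 2.2 might be sketched as follows. This is a literal reading of the max > 2min rule applied to plain Euclidean distances between planar points; how the distances are restricted in practice (for example to a neighborhood, or to space-time distances) is not specified here, so that choice is an assumption. The function name partition_objects is hypothetical.

```python
import math

def partition_objects(points):
    """Return (stable, oscillating) index lists using the max > 2 * min rule."""
    stable, oscillating = [], []
    for i, p in enumerate(points):
        # Distances from point i to every other point (assumption: Euclidean).
        dists = [math.dist(p, q) for j, q in enumerate(points) if j != i]
        if dists and max(dists) > 2 * min(dists):
            oscillating.append(i)
        else:
            stable.append(i)
    return stable, oscillating

points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
stable, oscillating = partition_objects(points)
```

For the three collinear points, the first sees distances 1 and 3 and is assigned to the oscillation object set, while the other two stay in the stable object set.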
the specific steps of the step 3 are as follows:
step 3.1, counting the number n of clusters obtained by clustering and the number m of all track segments;
step 3.2, letting p = m/n;
step 3.3, if the number of track segments contained in a cluster obtained by clustering is greater than p, marking the cluster as a significant cluster, otherwise marking it as a non-significant cluster;
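The selection of significant clusters in steps 3.1 to 3.3 reduces to comparing each cluster size with the mean p = m/n, which can be sketched as below. The sketch assumes the clustering result is a dictionary from cluster number to a list of track segment identifiers, and that m counts the clustered segments unless another total is supplied. The name significant_clusters and the parameter total_segments are hypothetical.

```python
def significant_clusters(clusters, total_segments=None):
    """Keep clusters holding more track segments than the mean p = m / n."""
    n = len(clusters)
    if n == 0:
        return {}, 0.0
    # Assumption: m defaults to the number of segments that were clustered.
    m = total_segments if total_segments is not None else sum(
        len(segs) for segs in clusters.values())
    p = m / n
    significant = {cid: segs for cid, segs in clusters.items() if len(segs) > p}
    return significant, p

clusters = {
    1: ["s1", "s2", "s3", "s4", "s5", "s6"],
    2: ["s7", "s8"],
    3: ["s9"],
}
sig, p = significant_clusters(clusters)
```

With 9 segments in 3 clusters the threshold is p = 3, so only cluster 1 is kept as significant.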
step 3.4, selecting a significant cluster C from the clustering results, and taking the starting points of the track segments contained in the significant cluster C as a point set K;
step 3.5, randomly selecting a breakpoint b from the point set K, and adding the other breakpoints to the breakpoint b one by one to form an expandable point set Q; if adding a breakpoint makes the radius of the minimum circumscribed circle of the point set Q larger than a pre-specified threshold β, deleting that breakpoint from the point set Q;
step 3.6, after traversing all points of the point set K, if the number of breakpoints contained in the point set Q is greater than a threshold α, marking the point set Q as a stay spot;
step 3.7, repeating steps 3.5 to 3.6 until all candidate stay spots in the significant cluster C have been generated;
step 3.8, repeating steps 3.4 to 3.7 until all significant clusters have been traversed;
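The stay spot extraction of steps 3.5 to 3.7 can be illustrated by the greedy sketch below. Several simplifications are assumed: the seed is the first remaining point rather than a random one (to keep the example deterministic), the minimum circumscribed circle is replaced by the circle around the centroid (an upper bound on the true radius, so the test is conservative), and points absorbed into a point set Q are removed from K before the next pass. The names enclosing_radius and extract_stay_spots are hypothetical.

```python
import math

def enclosing_radius(points):
    """Radius of the circle around the centroid that covers all points.

    Assumption: stands in for the minimum circumscribed circle radius.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return max(math.hypot(x - cx, y - cy) for x, y in points)

def extract_stay_spots(points, beta, alpha):
    """Grow point sets Q greedily; keep those with more than alpha points."""
    remaining = list(points)
    spots = []
    while remaining:
        q = [remaining[0]]                      # seed breakpoint b
        for cand in remaining[1:]:
            # Keep the candidate only if the enlarged set still fits radius beta.
            if enclosing_radius(q + [cand]) <= beta:
                q.append(cand)
        if len(q) > alpha:
            spots.append(q)
        remaining = [p for p in remaining if p not in q]
    return spots

K = [(0, 0), (1, 0), (0, 1), (1, 1), (50, 50), (51, 50)]
spots = extract_stay_spots(K, beta=1.0, alpha=2)
```

Here the four points near the origin form one stay spot, while the two distant points fit within the radius but do not exceed the threshold α and are therefore discarded.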
the specific steps of the step 4 are as follows:
step 4.1, calculating the stay heat corresponding to each stay spot as follows:
Figure FDA0002505037480000031
wherein h_spot is the stay heat of the stay spot, n_subtra is the number of track segments contained in the stay spot, n_tra is the number of tracks contained in the stay spot, and β is a factor;
step 4.2, obtaining the high-heat area of the stay spots from the stay heat information;
step 4.3, obtaining the hot spot area in the road network according to the area where the high-heat stay spots are located.
CN201710735328.1A 2017-08-24 2017-08-24 Road network hot spot area mining method Active CN107301254B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710735328.1A CN107301254B (en) 2017-08-24 2017-08-24 Road network hot spot area mining method

Publications (2)

Publication Number Publication Date
CN107301254A CN107301254A (en) 2017-10-27
CN107301254B true CN107301254B (en) 2020-07-10

Family

ID=60132049

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710735328.1A Active CN107301254B (en) 2017-08-24 2017-08-24 Road network hot spot area mining method

Country Status (1)

Country Link
CN (1) CN107301254B (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108133172B (en) * 2017-11-16 2022-04-05 北京华道兴科技有限公司 Method for classifying moving objects in video and method and device for analyzing traffic flow
CN108427965B (en) * 2018-03-05 2022-08-23 重庆邮电大学 Hot spot area mining method based on road network clustering
CN108804563B (en) * 2018-05-22 2021-11-19 创新先进技术有限公司 Data labeling method, device and equipment
CN109033011B (en) * 2018-06-19 2022-06-21 东软集团股份有限公司 Method and device for calculating track frequency, storage medium and electronic equipment
CN109767615B (en) * 2018-10-19 2021-05-18 江苏智通交通科技有限公司 Method for analyzing key flow direction and key path of road network traffic flow
CN109686085B (en) * 2018-12-17 2020-05-05 北京交通大学 GPS data based dangerous cargo transport vehicle stop node activity type identification method
CN109711451A (en) * 2018-12-20 2019-05-03 成都四方伟业软件股份有限公司 A kind of data processing method, device, electronic equipment and storage medium
CN109615857B (en) * 2018-12-20 2021-02-05 首都师范大学 Deployment and scheduling method and device for roadside units in urban vehicle-mounted network
CN111380541B (en) * 2018-12-29 2022-09-13 沈阳美行科技股份有限公司 Interest point determination method and device, computer equipment and storage medium
CN111488417B (en) * 2019-01-28 2023-10-24 阿里巴巴集团控股有限公司 Information processing method, system, device, equipment and computer storage medium
CN109977109B (en) * 2019-04-03 2021-04-27 深圳市甲易科技有限公司 Track data accompanying analysis method
CN110174115B (en) * 2019-06-05 2021-03-16 武汉中海庭数据技术有限公司 Method and device for automatically generating high-precision positioning map based on perception data
CN110457315A (en) * 2019-07-19 2019-11-15 国家计算机网络与信息安全管理中心 A kind of group's accumulation mode analysis method and system based on user trajectory data
CN110909037B (en) * 2019-10-09 2024-02-13 中国人民解放军战略支援部队信息工程大学 Frequent track mode mining method and device
CN110827540B (en) * 2019-11-04 2021-03-12 黄传明 Motor vehicle movement mode recognition method and system based on multi-mode data fusion
CN111275963A (en) * 2020-01-14 2020-06-12 北京百度网讯科技有限公司 Method and device for mining hot spot area, electronic equipment and storage medium
CN111881370A (en) * 2020-05-21 2020-11-03 北京嘀嘀无限科技发展有限公司 Method and system for describing contour of interest area
CN111372188B (en) * 2020-05-27 2020-09-01 腾讯科技(深圳)有限公司 Method and device for determining hot spot track in area, storage medium and electronic device
CN111667392B (en) * 2020-06-12 2023-06-16 成都国铁电气设备有限公司 Railway contact net defect hot spot area early warning method based on space-time clustering
CN111797295B (en) * 2020-06-19 2021-04-02 云从科技集团股份有限公司 Multi-dimensional space-time trajectory fusion method and device, machine readable medium and equipment
CN111897805B (en) * 2020-06-24 2022-11-11 东南大学 Hot spot path mining method based on longest common sub-track density clustering
CN113129328B (en) * 2021-04-22 2022-05-17 中国电子科技集团公司第二十九研究所 Target hotspot area fine analysis method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103559213A (en) * 2013-10-10 2014-02-05 河南大学 Efficient spatial nearest neighbor query method for highway networks
CN103578265A (en) * 2012-07-18 2014-02-12 北京掌城科技有限公司 Method for acquiring taxi-hailing hot spot based on taxi GPS data
CN106383868A (en) * 2016-09-05 2017-02-08 电子科技大学 Road network-based spatio-temporal trajectory clustering method
WO2017107800A1 (en) * 2015-12-24 2017-06-29 阿里巴巴集团控股有限公司 Method of acquiring route hotspot of traffic road and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A Novel Passenger Hotspots Searching Algorithm for Taxis in Urban Area;Yuhan Dong,et al;《IEEE Computer society》;20170628;第175~180页 *
Discovering Hotspots:A Placement Strategy for Wi-Fi based Trajectory Monitoring within Buildings;Lorenz Schauer;《IEEE CONFERENCE》;20151111;第371~380页 *

Also Published As

Publication number Publication date
CN107301254A (en) 2017-10-27

Similar Documents

Publication Publication Date Title
CN107301254B (en) Road network hot spot area mining method
CN107103754B (en) Road traffic condition prediction method and system
US10323948B2 (en) GPS data repair
CN111475596B (en) Sub-segment similarity matching method based on multi-level track coding tree
CN104462190B (en) A kind of online position predicting method excavated based on magnanimity space tracking
CN109241126B (en) Spatio-temporal trajectory aggregation mode mining algorithm based on R-tree index
CN111046968B (en) Road network track clustering analysis method based on improved DPC algorithm
CN113723715B (en) Method, system, equipment and storage medium for automatically matching public transport network with road network
CN117093832B (en) Data interpolation method and system for air quality data loss
CN111024098A (en) Motor vehicle path fitting algorithm based on low-sampling data
Kriegel et al. Proximity queries in large traffic networks
Makris et al. A comparison of trajectory compression algorithms over AIS data
CN111027574A (en) Building mode identification method based on graph convolution
US20160019248A1 (en) Methods for processing within-distance queries
CN115292962B (en) Path similarity matching method and device based on track rarefaction and storage medium
US11988522B2 (en) Method, data processing apparatus and computer program product for generating map data
CN116452826A (en) Coal gangue contour estimation method based on machine vision under shielding condition
CN117516558A (en) Road network generation method, device, computer equipment and computer readable storage medium
WO2022127573A1 (en) User trajectory positioning method, electronic device and computer storage medium
Wang et al. Accurate Detection of Road Network Anomaly by Understanding Crowd's Driving Strategies from Human Mobility
CN114664104A (en) Road network matching method and device
CN109446264B (en) Urban mobile data analysis method based on flow visualization
Tian et al. Road crack detection algorithm based on YOLOv3
CN111723792A (en) Real-time positioning point identification method suitable for rigid-flexible contact network
CN115659162B (en) Method, system and equipment for extracting intra-pulse characteristics of radar radiation source signals

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant