CN110909788B - Statistical clustering-based road intersection position identification method in track data - Google Patents

Info

Publication number
CN110909788B
CN110909788B CN201911135825.3A CN201911135825A CN110909788B CN 110909788 B CN110909788 B CN 110909788B CN 201911135825 A CN201911135825 A CN 201911135825A CN 110909788 B CN110909788 B CN 110909788B
Authority
CN
China
Prior art keywords
sampling
points
cluster
point
sampling points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911135825.3A
Other languages
Chinese (zh)
Other versions
CN110909788A (en)
Inventor
邓敏
张建国
郑旭东
唐建波
刘慧敏
陈雪莹
黄金彩
张华剑
姚劲
张幼英
芦春霞
石岩
刘宝举
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Botong Information Co ltd
Central South University
Original Assignee
Hunan Botong Information Co ltd
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Botong Information Co ltd, Central South University filed Critical Hunan Botong Information Co ltd
Priority to CN201911135825.3A priority Critical patent/CN110909788B/en
Publication of CN110909788A publication Critical patent/CN110909788A/en
Application granted granted Critical
Publication of CN110909788B publication Critical patent/CN110909788B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S19/00Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S19/38Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system
    • G01S19/39Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system the satellite radio beacon positioning system transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S19/42Determining position

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention provides a method for identifying road intersection positions in track data based on statistical clustering. The method first converts the track data coordinates to a projected coordinate system and simplifies the track data. For the simplified track data, multi-core parallel computing is used to identify sampling points in road intersection areas from their steering changes, forming a steering sampling point set. With this set as input, an adaptive statistical clustering algorithm automatically partitions the points into clusters and separates road intersections at different positions. Finally, for each steering sampling point cluster, a minimum circumscribed circle fitting algorithm calculates the center and radius of the minimum circumscribed circle covering the cluster, which represent the center position and area range of the road intersection.

Description

Statistical clustering-based road intersection position identification method in track data
Technical Field
The invention relates to the interdisciplinary field of computer vision and track data processing, and in particular to a method for identifying road intersection positions in track data based on statistical clustering.
Background
The road network map is important basic geographic information data and an important data source on which applications such as travel and path planning depend. With the rapid development of urban construction, urban road networks are constantly updated and changed. The conventional map data updating technology and method (such as field mapping, remote sensing data mapping and the like) have the problems of long data updating period, high cost and the like when being applied to road network data updating, so that how to obtain timely updated road network map data is still a difficult problem to be solved at present. With the wide application of the Global Positioning System (GPS), more and more vehicles (such as taxis, buses and the like) are equipped with GPS positioning devices, so that information such as the position, speed, driving direction, data acquisition time and the like of the vehicle can be acquired in real time, and massive GPS vehicle trajectory data is formed. The vehicle track data records the running state of the vehicle in the road network, and simultaneously contains rich road network information (such as the geometric structure, road steering and other information of the road network), thereby providing possibility for real-time mapping and updating of the road network.
Many scholars have studied road network extraction from vehicle trajectory data and proposed mature algorithms, for example trogliang et al. (2015) and Huang et al. (2018). Road intersections are important components of a road network and important nodes in traffic navigation and path planning applications. Most existing road network extraction methods based on vehicle trajectory data treat road intersections as simple nodes and do not finely model their internal geometric structure and topological connections, so they cannot meet users' needs for fine-grained path planning at intersections. To model road intersections finely, some scholars extract the trajectory data at an intersection and generate an intersection map by curve fitting, for example Wang et al. (2015) and Deng et al. (2018). Identifying the position and range of road intersections in trajectory data is the key problem that must be solved first when constructing a fine intersection model.
Some scholars have explored identifying intersection positions from trajectory data. Fathi and Krumm (2010) first define a shape descriptor to measure the characteristics of trajectory sampling points and then use these characteristics as input to train a classifier with the AdaBoost algorithm to identify road intersection positions. However, this method requires a large number of training samples and computes multidimensional features for a large number of sampling points, so its computational cost is high; its results also depend strongly on sample quality, which makes it difficult to apply widely. Mariescu-Istodor and Fränti (2018) propose a similar road intersection detector, which uses a circle of a given size and counts the distribution characteristics of the track points falling inside it to identify intersection positions. Liu et al. (2013) propose an extended road intersection model for producing lane-level road network maps and use high-precision trajectory data to construct the detailed internal geometry and topology of intersections; this construction method requires the position and range of each intersection to be determined first. Wang et al. (2015) use a local spatial autocorrelation statistic, the G index, to detect intersection positions: they first extract track points with large steering changes, treat the change angle as a non-spatial attribute of each track point, and use the G index to identify hot spot areas of steering change as road intersections. Although the G-index-based method can identify hot spot regions of steering change (that is, track points with large steering changes), the G index cannot directly cluster the discrete track points to separate different intersections, so the method further depends on a clustering algorithm to cluster the track points. Most of these methods require a G index threshold and clustering parameters to be set, and they adapt poorly to different trajectory data. To address the limitations of existing methods, Tang et al. (2017) propose a road intersection identification method based on turning point pair clustering, which is used to construct the detailed structure of intersections. The method identifies the position and range of an intersection through connectivity clustering of local turning point pairs and requires a neighborhood radius parameter to determine the spatial proximity of turning point pairs; this clustering parameter is difficult to estimate automatically for track point data of different densities.
In general, an automatic detection algorithm for road intersection positions in track data is still lacking, and the existing methods mainly have the following problems: (1) machine learning classification algorithms based on shape description features place high demands on sample quality when identifying road intersections, and the classification algorithms are complex and inefficient; (2) road intersection detection algorithms depend on a clustering algorithm to separate different intersections, but the clustering algorithms used require many parameters (such as a G index threshold or the spatial neighborhood radius in connectivity clustering), and the quality of the identification result depends strongly on these parameter settings; (3) false clusters produced by noise and similar interference in the track data are difficult to eliminate, which degrades the quality of the final road intersection identification result.
Disclosure of Invention
To address the defects of the prior art, the invention provides a method for identifying road intersection positions in track data based on statistical clustering. The technical problems solved by the invention mainly comprise: (1) automatic detection of steering track points at road intersections in track data; (2) adaptive clustering of the track points and calculation of the center position and spatial range of each road intersection; (3) improvement of the efficiency and level of automation of road intersection position detection in existing track data, so that road intersections in track data can be located rapidly.
The invention aims to solve the technical problems in the prior art, and discloses a method for identifying the positions of road intersections in track data based on statistical clustering, which comprises the following steps:
Step 1. Because the track data collected by vehicle GPS devices uses the WGS-84 coordinate system and stores the spatial position of each sampling point as longitude and latitude, QGIS software is used to convert the original vehicle track data into a planar projected rectangular coordinate system;
Step 2. The track data is simplified with the Douglas-Peucker algorithm to remove redundant sampling points on straight track segments;
Step 3. The steering change between consecutive sampling points on each vehicle track is calculated in parallel. Sampling points whose steering change exceeds a preset threshold ω are marked as candidate points of a road intersection region, sampling points that are not near an intersection are discarded, and the retained sampling points from all tracks are merged into a steering sampling point set. The preset steering change threshold ω is 45 degrees;
Step 4. With the steering sampling point set obtained in step 3 as input, the steering sampling points are clustered by an adaptive statistical clustering algorithm, which separates intersections at different positions so that each cluster corresponds to one road intersection;
Step 5. With the clustered sampling points obtained in step 4 as input, the minimum circumscribed circle of each cluster is calculated by a minimum circle fitting algorithm. The center and radius of the minimum circumscribed circle are returned as estimates of the center position and area range of the road intersection, and the center positions and range radii of all detected road intersections are output.
Preferably, in step 1, the trajectory data coordinate system is converted with QGIS software. When the projection parameters of the QGIS coordinate conversion are set, the planar projected rectangular coordinate system uses the Gauss-Krüger projection with 6-degree bands as the projection model, the central meridian of the projection is that of the 6-degree band covering the most original trajectory points, and the projected trajectory data is obtained by software calculation.
Preferably, in step 2, the Douglas-Peucker algorithm is adopted to simplify the trajectory data, and the distance threshold parameter d of the Douglas-Peucker algorithm is 3 meters.
Preferably, the step 3 further comprises: the candidate points of the intersection area are identified by calculating the steering variation of the track sampling points, the implementation mode is as follows,
let the simplified trajectory data set from step 2 be $S = \{T_1, T_2, \dots, T_n\}$, where $T_j$ denotes the $j$-th vehicle track and $n$ is the total number of input tracks; $T_j = \{P_1, P_2, \dots, P_m\}$, where $P_i$ denotes the $i$-th sampling point of track $T_j$ and $m$ is the total number of sampling points in $T_j$. Each $P_i$ is a five-tuple $P_i = (x_i, y_i, t_i, o_i, v_i)$, where $x_i$ and $y_i$ are the X and Y coordinates of $P_i$ (unit: meter), $t_i$ is the sampling time (unit: second), $o_i$ is the vehicle heading recorded at $P_i$ (measured from true north, unit: degree), and $v_i$ is the vehicle travel speed recorded at $P_i$ (unit: km/h);
for a track $T_j = \{P_1, P_2, \dots, P_m\}$ in $S$, the difference between the vehicle headings at two adjacent sampling points $P_i$ and $P_{i+1}$ is calculated and recorded as the steering variation $\Delta\theta(P_i)$ of sampling point $P_i$; the specific calculation formula is as follows:
Figure BDA0002279571470000041
A class label attribute is added to each sampling point. If the steering variation $\Delta\theta(P_i)$ of sampling point $P_i$ is larger than the preset angle threshold $\omega$, the class label of $P_i$ is set to 1; otherwise it is set to 0. The steering variation is calculated for the sampling points of all tracks in $S$, and the sampling points whose class label is 1 form the candidate point set of the road intersection regions.
Preferably, in the step 3, a multi-core parallel computing technique is adopted to perform parallel computing, and firstly, according to the number k of CPU cores of the computer, the trajectory data is sequentially divided into k subsets, and the subsets are respectively allocated to corresponding CPUs to perform detection and computation of steering sampling points in each trajectory data.
Preferably, the step 4 further comprises: an Adaptive Statistical Clustering method based on Hotspot Detection (ASCHD) is used to cluster the steering sampling point set calculated in step 3 and to separate road intersections at different positions; the implementation is as follows,
let the steering sampling point set obtained in step 3 be $D = \{P_1, P_2, \dots, P_K\}$, where $P_i$ denotes the $i$-th sampling point and $K$ is the total number of steering sampling points obtained in step 3. First, a Voronoi diagram is constructed from the spatial positions of the sampling points in $D$, the area of the Voronoi cell of each sampling point is calculated, and this area is taken as the non-spatial attribute value of the sampling point. Two sampling points are defined as a spatially adjacent point pair if their Voronoi cells share a common edge; otherwise they are not spatially adjacent. Then, a G index is calculated for each sampling point; the specific calculation formula is as follows:
Figure BDA0002279571470000051
where $z_j$ is the attribute value (Voronoi cell area) of sampling point $P_j$, $\bar{z}$ is the mean of the Voronoi cell areas of all sampling points, $\sigma$ is the standard deviation of the Voronoi cell areas of all sampling points, $w_{i,j}$ is the spatial weight between sampling points $P_i$ and $P_j$, and $K$ is the total number of sampling points in the steering sampling point set;
then, after the G index of each sampling point in $D$ has been calculated, the sampling points whose G index is greater than 0 are taken as clustering seed points. Each seed point $P_i$ is marked as a new cluster $C_i = \{P_i\}$, and its spatially adjacent sampling points that are also seed points are searched. If merging $C_i$ with such a point $P_k$ increases the statistic $\lambda$ below, they are merged, i.e. $C_i = \{P_i, P_k\}$; otherwise they are not merged, and the search continues with the other adjacent seed points to judge whether merging increases $\lambda$. The statistic $\lambda$ is calculated as follows:
Figure BDA0002279571470000052
where $C_i$ denotes a set (cluster) of sampling points, $P_k$ is a sampling point in $C_i$, $z_k$ is the attribute value (Voronoi cell area) of $P_k$, $\bar{z}$ is the mean of the Voronoi cell areas of all sampling points in $D$, $\sigma$ is the standard deviation of the Voronoi cell areas of all sampling points in $D$, and $u$ is the number of sampling points in $C_i$;
finally, the search-and-merge process is iterated until no adjacent seed point can be merged. The merged seed points are removed from the seed point set, a seed point is selected from the remaining unmerged seed points, and the search and merge process is repeated to generate a new cluster. Clustering stops when all seed points have been merged or visited, and the resulting sampling point clusters $C = \{C_1, C_2, \dots, C_R\}$ are returned, where $C_k$ is the $k$-th cluster and $R$ is the number of clusters obtained by the clustering algorithm.
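The search-and-merge loop described above can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the formula for the statistic λ is an image that is not reproduced in this text, so `make_statistic` uses a hypothetical stand-in (a standardized cluster sum built from the same symbols $z_k$, $\bar{z}$, $\sigma$ and $u$), and the names `grow_clusters`, `make_statistic` and `neighbors` are assumptions introduced for the example.

```python
import math

def make_statistic(z):
    """Build a stand-in for the statistic lambda (ASSUMED form, not the patent's formula)."""
    k = len(z)
    mean = sum(z) / k
    sigma = math.sqrt(sum((v - mean) ** 2 for v in z) / k) or 1.0  # avoid division by zero

    def statistic(cluster):
        u = len(cluster)
        return sum(z[i] - mean for i in cluster) / (sigma * math.sqrt(u))

    return statistic

def grow_clusters(seeds, neighbors, statistic):
    """Greedily merge adjacent seed points while the statistic increases.

    seeds:     indices of seed points (G index > 0), in processing order
    neighbors: dict mapping a point index to the set of its Voronoi-adjacent indices
    statistic: callable taking a list of indices and returning a float
    """
    seed_set = set(seeds)
    assigned = set()
    clusters = []
    for s in seeds:
        if s in assigned:
            continue  # already merged into an earlier cluster
        cluster = [s]
        assigned.add(s)
        merged = True
        while merged:
            merged = False
            current = statistic(cluster)
            # Unassigned seed points adjacent to any member of the cluster.
            candidates = sorted(
                n for p in cluster for n in neighbors[p]
                if n in seed_set and n not in assigned
            )
            for c in candidates:
                if statistic(cluster + [c]) > current:
                    cluster.append(c)
                    assigned.add(c)
                    merged = True
                    break  # recompute the statistic and the candidate list
        clusters.append(cluster)
    return clusters
```

Passing the statistic as a callable keeps the merge loop independent of the exact form of λ, so the stand-in can be swapped for the formula in the original publication.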
Preferably, the step 5 further comprises: the estimated values of the intersection center position and the area range radius are calculated by adopting a minimum circumcircle fitting algorithm, the realization mode is as follows,
let the clusters obtained in step 4 be $C = \{C_1, C_2, \dots, C_R\}$, where $C_k$ is the $k$-th cluster and $R$ is the number of clusters obtained by the clustering algorithm; $C_k = \{P_1, P_2, \dots, P_M\}$, where $P_i$ is the $i$-th sampling point in cluster $C_k$ and $M$ is the number of sampling points in $C_k$. Let $x_i$ and $y_i$ denote the X and Y coordinates of sampling point $P_i$ (unit: meter). The center $(x_c^k, y_c^k)$ of the minimum circumscribed circle of cluster $C_k$ is calculated as follows:
Figure BDA0002279571470000061
The radius $r_k$ of the minimum circumscribed circle of cluster $C_k$ is calculated as follows:
Figure BDA0002279571470000062
For each cluster $C_k$ in $C$ ($1 \le k \le R$), the center $(x_c^k, y_c^k)$ and the radius $r_k$ of the minimum circumscribed circle are calculated with the two formulas above; $(x_c^k, y_c^k)$ is then the estimated center position of the $k$-th road intersection, and $r_k$ is the range radius of the $k$-th road intersection.
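As an illustration of step 5, the sketch below computes a minimum circumscribed (enclosing) circle for one cluster. The center and radius formulas of the patent are images that are not reproduced in this text, so the code uses a standard incremental minimum enclosing circle algorithm as an assumed substitute; the function names are hypothetical, and points are assumed to be tuples whose first two fields are the projected X and Y coordinates in meters.

```python
import math

def _circle_two(a, b):
    """Circle with segment ab as its diameter."""
    cx, cy = (a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0
    return cx, cy, math.hypot(a[0] - cx, a[1] - cy)

def _circle_three(a, b, c):
    """Circumcircle of a, b, c; falls back to the widest pair if nearly collinear."""
    ax, ay, bx, by, cx, cy = a[0], a[1], b[0], b[1], c[0], c[1]
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return max((_circle_two(a, b), _circle_two(a, c), _circle_two(b, c)),
                   key=lambda circle: circle[2])
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return ux, uy, math.hypot(ax - ux, ay - uy)

def _inside(circle, p, eps=1e-9):
    return math.hypot(p[0] - circle[0], p[1] - circle[1]) <= circle[2] + eps

def min_enclosing_circle(points):
    """Return (x_c, y_c, r) of the smallest circle covering all points."""
    pts = list(points)
    circle = None
    for i, p in enumerate(pts):
        if circle is not None and _inside(circle, p):
            continue
        # p must lie on the boundary of the new circle.
        circle = (p[0], p[1], 0.0)
        for j, q in enumerate(pts[:i]):
            if _inside(circle, q):
                continue
            # p and q must both lie on the boundary.
            circle = _circle_two(p, q)
            for r in pts[:j]:
                if not _inside(circle, r):
                    circle = _circle_three(p, q, r)
    return circle
```

Calling `min_enclosing_circle` once per cluster yields one `(x_c, y_c, r)` triple per detected intersection, matching the output described in step 5.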
Compared with the prior art, the invention addresses the low efficiency, the large number of algorithm parameters, and the strong dependence of detection quality on parameter settings found in existing methods for identifying road intersections in track data. It improves the efficiency and quality of road intersection position detection and reduces false detections caused by noise and uneven sampling point distribution in the track data. The detected road intersection positions can serve as input for fine modeling of road intersections, which is a key step in that modeling, and the method has considerable application prospects and practical value in high-precision road network model construction and in road network data production and updating.
Drawings
The invention will be further understood from the following description in conjunction with the accompanying drawings. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the embodiments. In the drawings, like reference numerals designate corresponding parts throughout the different views.
FIG. 1 is a flow chart of the overall structure of the method for identifying the position of a road intersection in track data based on statistical clustering according to the present invention;
FIG. 2 is a flow chart of an embodiment of the present invention.
Detailed Description
The invention provides a method for identifying road intersection positions in track data based on statistical clustering, which draws mainly on the theories and techniques of pattern recognition, computational geometry and spatial clustering analysis. The method detects the set of track sampling points in road intersection areas from the steering change characteristics of the sampling points, separates the intersection sampling points at different positions with an adaptive statistical clustering method, and fits a minimum circumscribed circle to the sampling points of each intersection area to obtain the center position of the intersection and the radius of its area range. Track data simplification and multi-core parallel computation are used when detecting the steering sampling points, which improves the efficiency of automatic intersection detection in massive track data. Separating intersections with an adaptive clustering method reduces the number of algorithm parameters and raises the degree of automation of intersection position detection.
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. The method provided by the invention can realize automatic operation flow by using a computer software technology.
Example one
The embodiment provides a method for identifying positions of intersections in track data based on statistical clustering as shown in fig. 1, and the specific implementation steps are as shown in fig. 2, and the method comprises the following steps:
Step 1. The track data collected by vehicle GPS devices is in the WGS-84 coordinate system, with coordinates expressed as longitude and latitude, so the spatial distance between two track points cannot be calculated directly without a coordinate conversion. To simplify spatial distance calculation, QGIS software is first used to convert the original vehicle track data into a planar projected rectangular coordinate system. When the projection parameters of the QGIS coordinate conversion are set, the planar projected rectangular coordinate system uses the Gauss-Krüger projection with 6-degree bands as the projection model, the central meridian of the projection is that of the 6-degree band covering the most original track points, and the projected track data is obtained by calculation in QGIS. The 6-degree band number is selected as follows:
let the set of all sampling points in the original trajectory data be $G = \{P_1, P_2, \dots, P_N\}$, where $P_i$ is the $i$-th sampling point, $lat_i$ and $lon_i$ are the latitude and longitude coordinates of $P_i$, and $N$ is the total number of sampling points. The 6-degree band number $f$ of the projection coordinate system is calculated as follows:
Figure BDA0002279571470000081
where INT () represents rounding a value.
Step 2. To improve the efficiency of processing large-scale track data, the track data is first simplified with the Douglas-Peucker algorithm, which removes redundant sampling points on straight track segments. The distance threshold parameter d of the Douglas-Peucker algorithm is 3 meters. The algorithm simplifies the track data as follows:
1) for each track, connect the first and last sampling points (denoted A and B) with a straight line segment AB, which is the chord of the track;
2) find the sampling point on the track that is farthest from the chord AB (denoted C) and calculate the perpendicular distance Δd between C and AB; if Δd is less than or equal to the preset distance threshold d (d = 3 meters), take the straight line segment AB as the approximation of the original track, that is, the simplified track; if Δd is greater than d, split the original track at C into two sub-track segments AC and CB and apply steps 1) and 2) to each of them;
3) when all sub-track segments have been processed, connect all the split points in order to form a polyline, which is the simplified version of the original track;
4) process all tracks according to steps 1) to 3); the simplification stops when every track has been processed.
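Steps 1) to 3) can be sketched as a short recursive Python function. This is a minimal illustration, assuming each sampling point is a tuple whose first two fields are the projected X and Y coordinates in meters; the function names are hypothetical.

```python
import math

def perpendicular_distance(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        # Degenerate chord: A and B coincide.
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / length

def douglas_peucker(points, d=3.0):
    """Simplify one track with distance threshold d (meters)."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    # Step 2): find the interior point C farthest from chord AB.
    index, d_max = 0, -1.0
    for i in range(1, len(points) - 1):
        dist = perpendicular_distance(points[i], a, b)
        if dist > d_max:
            index, d_max = i, dist
    if d_max <= d:
        return [a, b]  # the chord approximates this piece of the track
    # Otherwise split at C and simplify AC and CB separately.
    left = douglas_peucker(points[:index + 1], d)
    right = douglas_peucker(points[index:], d)
    return left[:-1] + right  # step 3): join the pieces, keeping C only once
```

Step 4) then amounts to applying `douglas_peucker` to every track in the data set.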
Step 3. To improve the efficiency of detecting steering sampling points in large-scale track data, the computation is parallelized with multi-core parallel computing so that the resources of a multi-core CPU are fully used. First, according to the number of CPU cores of the computer (say k), the track data is divided sequentially into k subsets, and each subset is assigned to one CPU core, which detects the steering sampling points in its tracks. The steering sampling points are determined from the steering variation between adjacent sampling points in each track as follows:
let the simplified trajectory data set from step 2 be $S = \{T_1, T_2, \dots, T_n\}$, where $T_j$ denotes the $j$-th vehicle track and $n$ is the total number of input tracks. A track is $T_j = \{P_1, P_2, \dots, P_m\}$, where $P_i$ denotes the $i$-th sampling point of track $T_j$ and $m$ is the total number of sampling points in $T_j$. Each $P_i$ is a five-tuple $P_i = (x_i, y_i, t_i, o_i, v_i)$, where $x_i$ and $y_i$ are the X and Y coordinates of $P_i$ (unit: meter), $t_i$ is the sampling time (unit: second), $o_i$ is the vehicle heading recorded at $P_i$ (measured from north, unit: degree), and $v_i$ is the vehicle travel speed recorded at $P_i$ (unit: km/h);
1) first, the steering variation of each sampling point is calculated. For a track $T_j = \{P_1, P_2, \dots, P_m\}$ in $S$, the difference between the vehicle headings at two adjacent sampling points (e.g. $P_i$ and $P_{i+1}$) is calculated and recorded as the steering variation $\Delta\theta(P_i)$ of sampling point $P_i$; the specific calculation formula is as follows:
Figure BDA0002279571470000091
2) second, the steering sampling points are identified. A class label attribute (default value 0) is added to each sampling point. If the steering variation $\Delta\theta(P_i)$ of sampling point $P_i$ is larger than the preset angle threshold $\omega$, the class label of $P_i$ is set to 1 (a candidate point of a road intersection area); otherwise it is set to 0 (any other point). The steering variation is calculated for the sampling points of all tracks in $S$, and the sampling points whose class label is 1 form the candidate point set of the road intersection regions.
3) finally, the steering sampling points detected by all CPU cores are merged into a single sampling point set.
Wherein, the angle threshold ω preset in step 3 is 45 degrees.
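The labeling in step 3 can be sketched in Python as follows. The steering variation formula is an image that is not reproduced in this text, so the code assumes that $\Delta\theta(P_i)$ is the absolute heading difference between $P_i$ and $P_{i+1}$ wrapped into [0°, 180°] (so that headings of 350° and 10° differ by 20°). Sampling points are assumed to be five-tuples `(x, y, t, o, v)` as defined above, and the function names are hypothetical.

```python
OMEGA = 45.0  # preset steering change threshold in degrees

def steering_change(o_i, o_next):
    """Heading difference between two samples, wrapped into [0, 180] (assumed form)."""
    diff = abs(o_next - o_i) % 360.0
    return 360.0 - diff if diff > 180.0 else diff

def label_turning_points(track, omega=OMEGA):
    """Return one class label per sampling point: 1 = intersection candidate, 0 = other."""
    labels = [0] * len(track)
    for i in range(len(track) - 1):
        if steering_change(track[i][3], track[i + 1][3]) > omega:
            labels[i] = 1
    return labels

def split_into_subsets(tracks, k):
    """Divide the tracks sequentially into k nearly equal subsets, one per CPU core."""
    size, extra = divmod(len(tracks), k)
    subsets, start = [], 0
    for j in range(k):
        end = start + size + (1 if j < extra else 0)
        subsets.append(tracks[start:end])
        start = end
    return subsets
```

Each subset could then be handed to a worker process, for example with `multiprocessing.Pool.map`, and the points labeled 1 in all subsets concatenated to form the steering sampling point set.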
Step 4. With the steering sampling point set calculated in step 3 as input, an Adaptive Statistical Clustering method based on Hotspot Detection (ASCHD) is used to divide the steering sampling points into different clusters according to their spatial proximity. The ASCHD algorithm is implemented as follows:
let the steering sampling point set obtained in step 3 be $D = \{P_1, P_2, \dots, P_K\}$, where $P_i$ denotes the $i$-th sampling point and $K$ is the total number of steering sampling points obtained in step 3;
1) a Voronoi diagram is constructed from the spatial positions of the sampling points in $D$, the area of the Voronoi cell of each sampling point is calculated, and this area is taken as the non-spatial attribute value of the sampling point. Two sampling points are defined as a spatially adjacent point pair if their Voronoi cells share a common edge; otherwise they are not spatially adjacent. Then, a G index is calculated for each sampling point; the specific calculation formula is as follows:
Figure BDA0002279571470000101
In the formula, z_j is the Voronoi cell area of sampling point P_j, the mean and σ are respectively the mean and the standard deviation of the Voronoi cell areas of all sampling points, w_{i,j} is the spatial weight between sampling points P_i and P_j (w_{i,j} = 1 if P_i and P_j are a spatially adjacent point pair, otherwise w_{i,j} = 0), and K is the total number of sampling points in the steering sampling point set;
2) After the G index of each sampling point in D has been calculated, the sampling points whose G index is larger than 0 are detected and form the clustering seed points. Each seed point P_i is marked as a new cluster C_i = {P_i}, and the sampling points P_k that are spatially adjacent to it and are also seed points are searched. If merging C_i and P_k increases the statistic λ below, C_i and P_k are merged, i.e. C_i = {P_i, P_k}; otherwise they are not merged, and the search continues with the other adjacent seed points to judge whether λ increases after merging. The statistic λ is calculated as follows:
Figure BDA0002279571470000102
where C_i denotes a set (cluster) of sampling points, P_k is a sampling point in C_i, z_k is the Voronoi cell area of sampling point P_k, the mean and σ are respectively the mean and the standard deviation of the Voronoi cell areas of all sampling points in D, and u is the number of sampling points in the set (cluster) C_i;
3) The search-and-merge process is iterated until no adjacent seed point that can still be merged is found. The seed points that have been merged into the cluster are then removed from the seed point set, one seed point is selected from the remaining unmerged seed points, and the search for adjacent seed points and the merging are repeated to generate a new cluster. The clustering process stops when all seed points have been merged or visited, and the generated sampling point clusters C = {C_1, C_2, …, C_R} are returned, where C_k is the k-th cluster and R is the number of clusters obtained by the clustering algorithm.
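The seed selection and greedy merging of items 1) to 3) can be sketched in Python as follows. Several parts are assumptions, because both formulas are only available as images in the source: the G index is assumed to take the standardized Getis-Ord form with the binary Voronoi-adjacency weights defined above, and the statistic λ is passed in as a function, with `make_statistic` giving one hypothetical stand-in built from the symbols named in the text (z_k, the mean, σ and u). The Voronoi cell areas and adjacency are taken as given inputs; all function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def g_index(areas, neighbors):
    """Assumed Getis-Ord style G index; neighbors[i] is the set of Voronoi neighbors of point i."""
    k = len(areas)
    z_mean, sigma = mean(areas), pstdev(areas)
    scores = []
    for i in range(k):
        w = len(neighbors[i])  # sum of binary weights w_ij (equals the sum of their squares)
        local_sum = sum(areas[j] for j in neighbors[i])
        denom = sigma * math.sqrt((k * w - w * w) / (k - 1)) if k > 1 else 0.0
        scores.append((local_sum - z_mean * w) / denom if denom > 0 else 0.0)
    return scores


def make_statistic(areas):
    """Hypothetical stand-in for lambda: summed deviation from the mean, scaled by sigma * sqrt(u).

    Assumes the areas are not all equal, so sigma > 0.
    """
    z_mean, sigma = mean(areas), pstdev(areas)

    def lam(cluster):
        return sum(areas[m] - z_mean for m in cluster) / (sigma * math.sqrt(len(cluster)))

    return lam


def grow_clusters(seeds, neighbors, statistic):
    """Greedy region growing: merge an adjacent seed only if the statistic increases."""
    unvisited = set(seeds)
    clusters = []
    for start in seeds:
        if start not in unvisited:
            continue  # already merged into an earlier cluster
        unvisited.discard(start)
        cluster = [start]
        merged = True
        while merged:  # repeat until no adjacent seed can be merged
            merged = False
            frontier = {n for m in cluster for n in neighbors[m]} & unvisited
            for cand in sorted(frontier):
                if statistic(cluster + [cand]) > statistic(cluster):
                    cluster.append(cand)
                    unvisited.discard(cand)
                    merged = True
        clusters.append(cluster)
    return clusters
```

A typical call computes `scores = g_index(areas, neighbors)`, keeps `seeds = [i for i, s in enumerate(scores) if s > 0]`, and then runs `grow_clusters(seeds, neighbors, make_statistic(areas))`. Seeds that were rejected by one cluster remain available and later start clusters of their own, as described in item 3).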
Step 5: taking the clusters obtained in step 4 as input, the minimum circumscribed circle of each cluster is calculated by a minimum circle fitting algorithm, and the center position and the radius of the minimum circumscribed circle of each cluster are returned. The minimum circle fitting algorithm is implemented as follows:
Let the clusters obtained in step 4 be C = {C_1, C_2, …, C_R}, where C_k is the k-th cluster and R is the number of clusters obtained by the clustering algorithm. C_k = {P_1, P_2, …, P_M}, where P_i is the i-th sampling point in cluster C_k and M is the number of sampling points in C_k. Let x_i and y_i denote the X and Y coordinates of sampling point P_i (unit: meter).
1) The center (x_c^k, y_c^k) of the minimum circumscribed circle of cluster C_k is calculated as follows:
Figure BDA0002279571470000111
2) The radius r_k of the minimum circumscribed circle of cluster C_k is calculated as follows:
Figure BDA0002279571470000112
3) For each cluster C_k (1 ≤ k ≤ R) in C, the center (x_c^k, y_c^k) and the radius r_k of its minimum circumscribed circle are calculated according to steps 1) and 2). Then (x_c^k, y_c^k) is the estimated center position of the k-th road intersection, and r_k is the estimated range radius of the k-th intersection.
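One simple closed-form reading of step 5 is sketched below in Python. It is an assumption rather than the patented formula, since the center and radius formulas are only available as images in the source: the center is taken as the centroid of the cluster points and the radius as the largest distance from that center to any point. A circle built this way encloses every point of the cluster but is generally not the strict minimum enclosing circle. The function name `enclosing_circle` is hypothetical.

```python
import math
from typing import List, Tuple


def enclosing_circle(points: List[Tuple[float, float]]) -> Tuple[float, float, float]:
    """Return (x_c, y_c, r): centroid of the points and the largest point-to-centroid distance."""
    m = len(points)
    x_c = sum(x for x, _ in points) / m
    y_c = sum(y for _, y in points) / m
    r = max(math.hypot(x - x_c, y - y_c) for x, y in points)
    return x_c, y_c, r
```

Applied to each cluster C_k in turn, the returned triple gives an estimate of the intersection center and its range radius in meters, since the coordinates are already in a projected metric system.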

Claims (6)

1. A method for identifying the positions of road intersections in track data based on statistical clustering is characterized by comprising the following steps:
step 1, because the track data collected by vehicle GPS devices are in the WGS-84 coordinate system, with the spatial positions of sampling points stored as longitude and latitude, QGIS software is used to convert the original vehicle track data into a planar projected rectangular coordinate system;
step 2, the track data are preprocessed by simplifying them with the Douglas-Peucker algorithm, which removes redundant sampling points on the straight segments of each track;
step 3, the steering change between consecutive sampling points on each vehicle track is calculated in parallel; by setting a steering change threshold ω, the sampling points with a large steering change are detected and marked as candidate points of road intersection areas, so that sampling points that are not near an intersection are removed and those with a large steering change are retained; the sampling points with a large steering change detected in all track data are merged into a steering sampling point set, the preset steering change threshold ω being 45 degrees;
step 4, taking the steering sampling point set obtained in step 3 as input, the steering sampling points are clustered by an adaptive statistical clustering algorithm, and intersections at different positions are separated so that each cluster corresponds to one road intersection;
step 4 further comprises: clustering the steering sampling point set obtained in step 3 with the adaptive statistical clustering algorithm ASCHD to separate road intersections at different positions, implemented as follows,
let the steering sampling point set obtained in step 3 be D = {P_1, P_2, …, P_K}, where P_i denotes the i-th sampling point and K is the total number of steering sampling points obtained in step 3; first, a Voronoi diagram is constructed from the spatial positions of the sampling points in D, the area of the Voronoi cell corresponding to each sampling point is calculated and taken as the non-spatial attribute value of the sampling point, and two sampling points are defined as a spatially adjacent point pair if their Voronoi cells share a common edge, and as not spatially adjacent otherwise; then, a G index is calculated for each sampling point by the following formula:
Figure FDA0002718889780000021
where z_j is the Voronoi cell area of sampling point P_j, the mean and σ are respectively the mean and the standard deviation of the Voronoi cell areas of all sampling points, w_{i,j} is the spatial weight between sampling points P_i and P_j, and K is the total number of sampling points in the steering sampling point set;
then, after the G index of each sampling point in D has been calculated, the sampling points whose G index is larger than 0 are detected and form the clustering seed points; each seed point P_i is marked as a new cluster C_i = {P_i}, and the sampling points P_k that are spatially adjacent to it and are also seed points are searched; if merging C_i and P_k increases the statistic λ below, C_i and P_k are merged, i.e. C_i = {P_i, P_k}; otherwise they are not merged, and the search continues with the other adjacent seed points to judge whether λ increases after merging, the statistic λ being calculated as follows:
Figure FDA0002718889780000022
where C_i denotes a set (cluster) of sampling points, P_k is a sampling point in C_i, z_k is the Voronoi cell area of sampling point P_k, the mean and σ are respectively the mean and the standard deviation of the Voronoi cell areas of all sampling points in D, and u is the number of sampling points in the set (cluster) C_i;
finally, the search-and-merge process is iterated until no adjacent seed point that can still be merged is found; the seed points that have been merged into the cluster are removed from the seed point set, one seed point is selected from the remaining unmerged seed points, and the search for adjacent seed points and the merging are repeated to generate a new cluster; the clustering process stops when all seed points have been merged or visited, and the generated sampling point clusters C = {C_1, C_2, …, C_R} are returned, where C_k is the k-th cluster and R is the number of clusters obtained by the clustering algorithm;
step 5, taking the clustered sampling points obtained in step 4 as input, the minimum circumscribed circle of each cluster is calculated by a minimum circle fitting algorithm, the center position and the radius of the minimum circumscribed circle are returned as the estimates of the center position and the area range of the road intersection, and the detected center positions and area range radii of all road intersections are output.
2. The method for identifying positions of intersections in track data based on statistical clustering according to claim 1, wherein: in step 1, the coordinate system of the track data is converted using QGIS software, wherein, when the projection parameters of the QGIS coordinate conversion are set, the Gauss-Krüger projection with 6-degree zones is selected as the projection model of the planar projected rectangular coordinate system, the central meridian of the projected coordinate system is that of the 6-degree zone covering the most original track points, and the projected track data are obtained through calculation by the software.
3. The method for identifying positions of intersections in track data based on statistical clustering according to claim 1, wherein the step 2 further comprises: and simplifying the track data by adopting a Douglas-Peucker algorithm, wherein the distance threshold parameter d of the Douglas-Peucker algorithm is 3 meters.
4. The method for identifying positions of intersections in track data based on statistical clustering according to claim 1, wherein the step 3 further comprises: the candidate points of the intersection area are identified by calculating the steering variation of the track sampling points, the implementation mode is as follows,
let the simplified track data set from step 2 be S = {T_1, T_2, …, T_n}, where T_j denotes the j-th vehicle track and n is the total number of input tracks; T_j = {P_1, P_2, …, P_m}, where P_i denotes the i-th sampling point of track T_j and m is the total number of sampling points in T_j; P_i is a five-tuple P_i = (x_i, y_i, t_i, o_i, v_i), where x_i and y_i are the X and Y coordinates of sampling point P_i (unit: meter), t_i is the sampling time (unit: second), o_i is the vehicle heading recorded at P_i (measured from true north, unit: degree), and v_i is the vehicle travel speed recorded at P_i (unit: km/h);
for a track T_j = {P_1, P_2, …, P_m} in S, the difference between the vehicle headings at two adjacent sampling points P_i and P_{i+1} of T_j is calculated and recorded as the steering variation Δθ(P_i) of sampling point P_i, the specific calculation formula being as follows:
Figure FDA0002718889780000031
a class label attribute is added to each sampling point; if the steering variation Δθ(P_i) of sampling point P_i is larger than the preset angle threshold ω, the class label of P_i is set to 1, otherwise the class label of P_i is set to 0; the steering variation is calculated for the sampling points of all tracks in S, and the sampling points with class label 1 form the candidate point set of road intersection areas.
5. The method for identifying positions of intersections in track data based on statistical clustering according to claim 1 or 4, wherein the step 3 further comprises: first, according to the number k of CPU cores of the computer, the track data are divided in sequence into k subsets, which are assigned to the corresponding CPU cores to detect the steering sampling points in each track.
6. The method for identifying positions of intersections in track data based on statistical clustering according to claim 1, wherein: the step 5 further comprises: the estimated values of the intersection center position and the area range radius are calculated by adopting a minimum circumcircle fitting algorithm, the realization mode is as follows,
let the clusters obtained in step 4 be C = {C_1, C_2, …, C_R}, where C_k is the k-th cluster and R is the number of clusters obtained by the clustering algorithm; C_k = {P_1, P_2, …, P_M}, where P_i is the i-th sampling point in cluster C_k and M is the number of sampling points in C_k; let x_i and y_i denote the X and Y coordinates of sampling point P_i (unit: meter); the center (x_c^k, y_c^k) of the minimum circumscribed circle of cluster C_k is calculated as follows:
Figure FDA0002718889780000041
the radius r_k of the minimum circumscribed circle of cluster C_k is calculated as follows:
Figure FDA0002718889780000042
for each cluster C_k (1 ≤ k ≤ R) in C, the center (x_c^k, y_c^k) and the radius r_k of its minimum circumscribed circle are calculated according to the two formulas above; then (x_c^k, y_c^k) is the estimated center position of the k-th road intersection, and r_k is the estimated range radius of the k-th intersection.
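The Douglas-Peucker simplification named in step 2 and claim 3 (distance threshold d = 3 meters) can be illustrated with the following Python sketch. It is the textbook recursive form of the algorithm working on projected (x, y) coordinates in meters, offered as an illustration rather than as the code used by the inventors; the function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def _line_distance(p: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from p to the line through a and b (point distance if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)


def douglas_peucker(points: List[Point], d: float = 3.0) -> List[Point]:
    """Drop points that lie within d of the chord joining the retained neighbors."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    idx, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        dist = _line_distance(points[i], a, b)
        if dist > dmax:
            idx, dmax = i, dist
    if dmax > d:
        # Keep the farthest point and simplify both halves recursively.
        left = douglas_peucker(points[: idx + 1], d)
        right = douglas_peucker(points[idx:], d)
        return left[:-1] + right
    return [a, b]
```

For very long tracks the recursion depth can grow with the number of retained points, so an iterative variant with an explicit stack may be preferable in practice.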
CN201911135825.3A 2019-11-19 2019-11-19 Statistical clustering-based road intersection position identification method in track data Active CN110909788B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911135825.3A CN110909788B (en) 2019-11-19 2019-11-19 Statistical clustering-based road intersection position identification method in track data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911135825.3A CN110909788B (en) 2019-11-19 2019-11-19 Statistical clustering-based road intersection position identification method in track data

Publications (2)

Publication Number Publication Date
CN110909788A CN110909788A (en) 2020-03-24
CN110909788B true CN110909788B (en) 2020-11-27

Family

ID=69817940

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911135825.3A Active CN110909788B (en) 2019-11-19 2019-11-19 Statistical clustering-based road intersection position identification method in track data

Country Status (1)

Country Link
CN (1) CN110909788B (en)

Also Published As

Publication number Publication date
CN110909788A (en) 2020-03-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant