CN110909788B

CN110909788B - Statistical clustering-based road intersection position identification method in track data

Info

Publication number: CN110909788B
Application number: CN201911135825.3A
Authority: CN
Inventors: 邓敏; 张建国; 郑旭东; 唐建波; 刘慧敏; 陈雪莹; 黄金彩; 张华剑; 姚劲; 张幼英; 芦春霞; 石岩; 刘宝举
Original assignee: Hunan Botong Information Co ltd; Central South University
Current assignee: Hunan Botong Information Co ltd; Central South University
Priority date: 2019-11-19
Filing date: 2019-11-19
Publication date: 2020-11-27
Anticipated expiration: 2039-11-19
Also published as: CN110909788A

Abstract

The invention provides a method for identifying the position of a road intersection in track data based on statistical clustering, which comprises the steps of track data coordinate projection conversion and track data simplification processing; aiming at simplified track data, a multi-core parallel computing mode is adopted, sampling points in the area of the road intersection are identified according to the steering changes of the sampling points, and a steering sampling point set is formed; taking a steering sampling point set as input, automatically dividing and clustering by using a self-adaptive statistical clustering algorithm, and stripping road intersections at different positions; and finally, aiming at each steering sampling point cluster, calculating the central position and the radius of the minimum circumscribed circle covering the cluster through a minimum circumscribed circle fitting algorithm, and representing the central position and the area range of the road intersection by the central position and the radius of the minimum circumscribed circle.

Description

Statistical clustering-based road intersection position identification method in track data

Technical Field

The invention relates to the intersection of computer vision and trajectory data processing, and in particular to a method for identifying road intersection positions in track data based on statistical clustering.

Background

The road network map is important basic geographic information data and an important data source on which applications such as travel and path planning depend. With the rapid development of urban construction, urban road networks are constantly updated and changed. The conventional map data updating technology and method (such as field mapping, remote sensing data mapping and the like) have the problems of long data updating period, high cost and the like when being applied to road network data updating, so that how to obtain timely updated road network map data is still a difficult problem to be solved at present. With the wide application of the Global Positioning System (GPS), more and more vehicles (such as taxis, buses and the like) are equipped with GPS positioning devices, so that information such as the position, speed, driving direction, data acquisition time and the like of the vehicle can be acquired in real time, and massive GPS vehicle trajectory data is formed. The vehicle track data records the running state of the vehicle in the road network, and simultaneously contains rich road network information (such as the geometric structure, road steering and other information of the road network), thereby providing possibility for real-time mapping and updating of the road network.

Many scholars have studied the extraction of road networks from vehicle trajectory data and have proposed many sophisticated algorithms, such as trogliang et al. (2015) and Huang et al. (2018). Road intersections are important components of a road network and important nodes to be considered in traffic navigation and path planning applications. Most existing road network extraction methods based on vehicle trajectory data treat road intersections as simple nodes and do not finely model the internal geometric structure and topological connectivity of the intersections, so they cannot meet users' need for fine-grained path planning at intersections. To model road intersections finely, some scholars extract the trajectory data at intersection positions and generate road intersection maps by curve fitting, such as Wang et al. (2015) and Deng et al. (2018). Identifying the position and range of road intersections in trajectory data is the key problem that must be solved first when building a fine model of a road intersection. Some scholars have explored identifying intersection locations from trajectory data. For example, Fathi and Krumm (2010) first define a shape descriptor to measure the characteristics of trajectory sampling points and then use these characteristics as input to train a classifier with the AdaBoost algorithm to identify road intersection positions. However, this method requires a large number of training samples and computes multidimensional features for a large number of trajectory sampling points, so its computational cost is high; its results also depend strongly on sample quality, which makes it difficult to apply widely.

Mariescu-Istodor and (2018) propose a similar road intersection detector, which uses a circle of a given size and counts the distribution characteristics of the trajectory points falling inside it to identify road intersection positions. Liu et al. (2013) propose an extended road intersection model for producing lane-level road network maps and use high-precision trajectory data to construct the detailed internal geometry and topology of intersections. This detail construction method requires the position and range of the road intersection to be determined first. Wang et al. (2015) use a local statistic of spatial autocorrelation, the G index, to detect road intersection positions: they first extract trajectory points with large steering changes from the trajectory data, then take the change angle as a non-spatial attribute of the trajectory points, and use the G index to identify hotspot areas of steering change as road intersections. Although the G-index-based method can identify hotspot regions of steering change (that is, trajectory points with large steering changes), the G index cannot directly cluster the discrete trajectory points to separate different road intersections, so the method further depends on a clustering algorithm to cluster the trajectory points. Most such methods require a G index threshold and clustering parameters to be set, and the algorithms adapt poorly to different trajectory data. To address the limitations of existing road intersection identification methods, Tang Luliang et al. (2017) propose a road intersection identification method based on turning-point-pair clustering, which is used to construct the detailed structure of intersections. The method identifies the position and range of an intersection through connectivity clustering of local turning point pairs; it requires a neighborhood radius parameter as input to determine the spatial proximity of turning point pairs, and this clustering parameter is difficult to estimate automatically for trajectory point data of different densities.

In general, an automatic detection algorithm for road intersection positions in trajectory data is still lacking, and existing methods mainly have the following problems: first, existing machine learning classification algorithms based on shape description features place high demands on sample quality when identifying road intersections, and the classification algorithms are complex and inefficient; second, road intersection detection algorithms rely on a clustering algorithm to separate different intersections, but the clustering algorithms used require many parameters (such as the G index threshold and the spatial neighborhood radius in connectivity clustering), and the quality of the identification result depends strongly on the parameter settings; third, false clusters caused by noise and similar interference in the trajectory data are difficult to eliminate, which affects the quality of the final road intersection identification result.

Disclosure of Invention

To address the shortcomings of the prior art, the invention provides a method for identifying the position of a road intersection in track data based on statistical clustering. The technical problems solved by the invention mainly include: first, automatically detecting steering track points at road intersections in track data; second, adaptively clustering the track points and calculating the central position and spatial range of each road intersection; third, improving the efficiency and degree of automation of road intersection position detection in existing track data, thereby solving the problem of rapidly locating road intersections in track data.

The invention aims to solve the technical problems in the prior art, and discloses a method for identifying the positions of road intersections in track data based on statistical clustering, which comprises the following steps:

step 1, because the track data collected by vehicle GPS equipment uses the WGS-84 coordinate system and the spatial position of each sampling point is stored as longitude and latitude, QGIS software is used to convert the original vehicle track data coordinate system into a planar projected rectangular coordinate system;

step 2, carrying out data processing on the track data, wherein the track data is simplified by using a Douglas-Peucker algorithm, and redundant sampling points on a straight-line segment of the track are removed;

step 3, calculating the steering change between consecutive sampling points on each vehicle track in a parallel computing mode, detecting the sampling points with large steering change by setting a steering change threshold ω, marking them as candidate points of road intersection regions, eliminating the sampling points on the track that are not near an intersection, and combining the sampling points with large steering change detected in all track data to form a steering sampling point set, wherein the preset steering change threshold ω is 45 degrees;

step 4, taking the steering sampling point set obtained in step 3 as input, clustering the steering sampling points with a self-adaptive statistical clustering algorithm, and separating intersections at different positions so that each cluster corresponds to one road intersection;

step 5, taking the clusters of sampling points obtained in step 4 as input, calculating for each cluster the minimum circumscribed circle of its points with a minimum circle fitting algorithm, returning the center and the radius of the minimum circumscribed circle as the estimates of the central position and the area range of the road intersection, and outputting the central positions and area range radii of all detected road intersections.

Preferably, in step 1, the trajectory data coordinate system is converted with QGIS software. When the projection parameters of the QGIS coordinate conversion are set, the Gauss-Krüger projection with 6-degree zones is selected as the projection model of the planar projected rectangular coordinate system, the central meridian is that of the 6-degree zone covering the most original trajectory points, and the projected trajectory data is obtained through software calculation.

Preferably, in step 2, the Douglas-Peucker algorithm is adopted to simplify the trajectory data, and the distance threshold parameter d of the Douglas-Peucker algorithm is 3 meters.

Preferably, the step 3 further comprises: the candidate points of the intersection area are identified by calculating the steering variation of the track sampling points, the implementation mode is as follows,

let the simplified trajectory data set from step 2 be $S = \{T_1, T_2, \ldots, T_n\}$, where $T_j$ denotes the $j$-th vehicle track and $n$ is the total number of input tracks; $T_j = \{P_1, P_2, \ldots, P_m\}$, where $P_i$ denotes the $i$-th sampling point of track $T_j$ and $m$ is the total number of sampling points in $T_j$; $P_i$ is a five-tuple $P_i = (x_i, y_i, t_i, o_i, v_i)$, where $x_i$ and $y_i$ are the X and Y coordinates of $P_i$ (unit: meter), $t_i$ is the sampling time (unit: second), $o_i$ is the vehicle heading recorded at $P_i$ (measured from true north, unit: degree), and $v_i$ is the vehicle travel speed recorded at $P_i$ (unit: km/h);

for a track $T_j = \{P_1, P_2, \ldots, P_m\}$ in $S$, the difference between the vehicle headings at two adjacent sampling points $P_i$ and $P_{i+1}$ of $T_j$ is calculated and recorded as the steering change $\Delta\theta(P_i)$ of sampling point $P_i$; the specific calculation formula is as follows:

a class label attribute is added to each sampling point; if the steering change $\Delta\theta(P_i)$ of sampling point $P_i$ is greater than the preset angle threshold $\omega$, the class label of $P_i$ is set to 1, otherwise it is set to 0; the steering change is calculated for the sampling points of all tracks in $S$, and the sampling points with class label 1 form the candidate point set of road intersection areas.

Preferably, in the step 3, a multi-core parallel computing technique is adopted to perform parallel computing, and firstly, according to the number k of CPU cores of the computer, the trajectory data is sequentially divided into k subsets, and the subsets are respectively allocated to corresponding CPUs to perform detection and computation of steering sampling points in each trajectory data.

Preferably, the step 4 further comprises: an Adaptive Statistical Clustering method based on Hotspot Detection (ASCHD) is used to cluster the steering sampling point set calculated in step 3 and to separate road intersections at different positions; the implementation is as follows,

let the steering sampling point set obtained in step 3 be $D = \{P_1, P_2, \ldots, P_K\}$, where $P_i$ denotes the $i$-th sampling point and $K$ is the total number of steering sampling points obtained in step 3; first, a Voronoi diagram is constructed from the spatial positions of the sampling points in $D$, the area of the Voronoi cell corresponding to each sampling point is calculated, and this area is taken as the non-spatial attribute value of the sampling point; if the Voronoi cells of two sampling points share a common edge, the two sampling points are defined as a spatially neighboring point pair, otherwise they are not spatial neighbors; then, a G index is calculated for each sampling point, and the specific calculation formula is as follows:

where $z_j$ is the attribute value (Voronoi cell area) of sampling point $P_j$; the formula also uses the mean of the Voronoi cell areas of all sampling points; $\sigma$ is the standard deviation of the Voronoi cell areas of all sampling points, $w_{i,j}$ is the spatial weight between sampling points $P_i$ and $P_j$, and $K$ is the total number of sampling points in the steering sampling point set;
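For reference, the quantities listed above are the same ingredients as the textbook Getis-Ord local statistic, usually written $G_i^{*}$. The block below gives that standard form only as an orientation aid. The symbol $\bar{z}$ for the mean is introduced here as an assumption, and whether the invention uses exactly this form or this sign convention cannot be confirmed from the text.

```latex
G_i^{*} =
\frac{\sum_{j=1}^{K} w_{i,j}\, z_j \;-\; \bar{z} \sum_{j=1}^{K} w_{i,j}}
     {\sigma \sqrt{\dfrac{K \sum_{j=1}^{K} w_{i,j}^{2} - \left(\sum_{j=1}^{K} w_{i,j}\right)^{2}}{K - 1}}},
\qquad
\bar{z} = \frac{1}{K} \sum_{j=1}^{K} z_j,
\qquad
\sigma = \sqrt{\frac{1}{K} \sum_{j=1}^{K} z_j^{2} - \bar{z}^{2}}
```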

then, after the G index value of each sampling point in $D$ has been calculated, the sampling points with a G index greater than 0 are taken as clustering seed points; each seed point $P_i$ is marked as a new cluster $C_i = \{P_i\}$, and a sampling point $P_k$ is searched for that is a spatial neighbor of the cluster and is also a seed point; if the following statistic $\lambda$ increases after $C_i$ and $P_k$ are merged, they are merged, that is, $C_i = \{P_i, P_k\}$; otherwise they are not merged, and the search continues with other neighboring seed points to judge whether $\lambda$ increases after merging; the statistic $\lambda$ is calculated as follows:

where $C_i$ denotes a set (cluster) of sampling points, $P_k$ is a sampling point in $C_i$, $z_k$ is the attribute value (Voronoi cell area) of $P_k$, the mean is taken over the Voronoi cell areas of all sampling points in $D$, $\sigma$ is the standard deviation of the Voronoi cell areas of all sampling points in $D$, and $u$ is the number of sampling points in the set (cluster) $C_i$;

finally, the search and merge process is iterated until no neighboring seed point can be merged any further; the seed points that have been clustered and merged are removed from the seed point set, one seed point is selected from the remaining unmerged seed points, and the process of searching neighboring seed points and merging is repeated to generate a new cluster; the clustering process stops when all seed points have been merged or visited, and the resulting clusters of sampling points $C = \{C_1, C_2, \ldots, C_R\}$ are returned, where $C_k$ is the $k$-th cluster and $R$ is the number of clusters produced by the clustering algorithm.

Preferably, the step 5 further comprises: the estimated values of the intersection center position and the area range radius are calculated by adopting a minimum circumcircle fitting algorithm, the realization mode is as follows,

let the clusters obtained in step 4 be $C = \{C_1, C_2, \ldots, C_R\}$, where $C_k$ is the $k$-th cluster and $R$ is the number of clusters produced by the clustering algorithm; $C_k = \{P_1, P_2, \ldots, P_M\}$, where $P_i$ is the $i$-th sampling point in cluster $C_k$ and $M$ is the number of sampling points in $C_k$. Let $x_i$ and $y_i$ denote the X and Y coordinates of sampling point $P_i$ (unit: meter); the center $(x_c^k, y_c^k)$ of the minimum circumscribed circle of cluster $C_k$ is computed as follows:

the radius $r^k$ of the minimum circumscribed circle of cluster $C_k$ is calculated as follows:

for each cluster $C_k$ in $C$ ($1 \le k \le R$), the center $(x_c^k, y_c^k)$ and the radius $r^k$ of its minimum circumscribed circle are calculated according to step 1) and step 2); $(x_c^k, y_c^k)$ is then the estimate of the central position of the $k$-th road intersection, and $r^k$ is the range radius of the $k$-th intersection.

Compared with the prior art, the method and the device solve the problems that the existing method for identifying the road intersection in the track data is low in efficiency, multiple in algorithm parameters, serious in detection result quality depending on parameter setting and the like, improve the efficiency and the quality of detecting the position of the road intersection in the track data, and reduce the problems of error detection of the position of the intersection and the like caused by the problems of noise, uneven distribution of sampling points and the like in the track data; the method for detecting the position of the road intersection in the track data can provide input for fine modeling of the road intersection, is a key link for fine modeling of the road network intersection, and has great application prospect and practical value in high-precision road network model construction, road network data production, updating and other applications.

Drawings

The invention will be further understood from the following description in conjunction with the accompanying drawings. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the embodiments. In the drawings, like reference numerals designate corresponding parts throughout the different views.

FIG. 1 is a flow chart of the overall structure of the method for identifying the position of a road intersection in track data based on statistical clustering according to the present invention;

FIG. 2 is a flow chart of an embodiment of the present invention.

Detailed Description

The invention provides a method for identifying positions of road intersections in track data based on statistical clustering, which is mainly based on theories and technologies of pattern recognition, computational geometry and spatial clustering analysis. According to the method, the track sampling point set in the intersection area of the road is detected through the steering change characteristics of the sampling points in the track data, then the intersection sampling points at different positions are stripped by using a self-adaptive statistical clustering method, the minimum circumscribed circle fitting calculation is carried out on the sampling points in the intersection area of the different roads to obtain the central position of the intersection and the radius of the intersection area range, the track steering sampling points are detected through track data simplification and multi-core parallelization calculation, the automatic detection efficiency of the intersection position in massive track data is improved, the intersection stripping is carried out by using the self-adaptive clustering method, algorithm parameters are reduced, and the automation degree of the intersection position detection is improved.

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. The method provided by the invention can realize automatic operation flow by using a computer software technology.

Example one

The embodiment provides a method for identifying positions of intersections in track data based on statistical clustering as shown in fig. 1, and the specific implementation steps are as shown in fig. 2, and the method comprises the following steps:

Step 1, the track data collected by vehicle GPS equipment uses the WGS-84 coordinate system, in which coordinates are expressed as longitude and latitude, so a coordinate conversion is needed before the spatial distance between two track points can be calculated directly. To simplify spatial distance calculation, QGIS software is first used to convert the original vehicle track data coordinate system into a planar projected rectangular coordinate system. When the projection parameters of the QGIS coordinate conversion are set, the Gauss-Krüger projection with 6-degree zones is selected as the projection model, and the central meridian is that of the 6-degree zone covering the most original track points; the projected track data is then computed by the QGIS software. The selection of the 6-degree zone number is explained as follows:

let the set of all sampling points in the original trajectory data be $G = \{P_1, P_2, \ldots, P_N\}$, where $P_i$ is the $i$-th sampling point, $lat_i$ and $lon_i$ are the latitude and longitude of $P_i$, and $N$ is the total number of sampling points. The 6-degree zone number $f$ of the projected coordinate system is calculated as follows:

where INT () represents rounding a value.
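The zone selection described above can be sketched in a few lines. The invention's own formula is not reproduced in this text, so the sketch assumes the conventional 6-degree Gauss-Krüger numbering for east longitudes (zone = INT(lon / 6) + 1, central meridian = 6 × zone − 3) and picks the zone that contains the most sampling points. The function names `zone_number`, `central_meridian` and `dominant_zone` are hypothetical.

```python
from collections import Counter


def zone_number(lon_deg):
    """6-degree zone number for an east longitude in degrees.

    Assumption: INT() keeps the integer part, and longitudes lie in [0, 180).
    """
    return int(lon_deg // 6) + 1


def central_meridian(zone):
    """Central meridian (degrees east) of a 6-degree zone."""
    return 6 * zone - 3


def dominant_zone(lons):
    """Return (zone, central meridian) of the zone covering the most points."""
    counts = Counter(zone_number(lon) for lon in lons)
    zone, _ = counts.most_common(1)[0]
    return zone, central_meridian(zone)


# Longitudes of a few hypothetical sampling points (degrees east).
print(dominant_zone([112.9, 113.1, 114.2, 111.5]))
```

Choosing one zone for the whole data set keeps all projected coordinates in a single planar frame, at the cost of slightly larger distortion for the few points that fall in a neighboring zone.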

Step 2, to improve the efficiency of processing large-scale track data, the track data is first simplified with the Douglas-Peucker algorithm, which removes redundant sampling points on straight-line segments of a track. The distance threshold parameter d of the Douglas-Peucker algorithm is 3 meters. The Douglas-Peucker algorithm simplifies the trajectory data as follows:

1) for each track, connect the first and last sampling points of the track (denoted A and B) with a straight line segment AB, which is the chord of the track;

2) among the sampling points on the track, find the point (denoted C) with the maximum perpendicular distance to the chord AB, and let Δd be that distance; if Δd is less than or equal to the preset distance threshold d (d is 3 meters), take the straight line segment AB as the approximation of the original track, that is, the simplified track; if Δd is greater than the preset distance threshold d, split the original track at C into two sub-track segments AC and CB, and apply step 1) and step 2) to each of the two sub-track segments;

3) when all sub-track segments have been processed, connect all the split points in sequence to form a polyline, and take this polyline as the simplified version of the original track;

4) process all tracks according to steps 1) to 3); the simplification process stops when all tracks have been processed.
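Steps 1) to 3) can be sketched as a short recursive function. This is a minimal illustration, assuming each sampling point is a tuple whose first two entries are the projected X and Y coordinates in meters; the names `perpendicular_distance` and `douglas_peucker` are hypothetical.

```python
import math


def perpendicular_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        # Degenerate chord (A and B coincide): fall back to point distance.
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dx * (a[1] - p[1]) - dy * (a[0] - p[0])) / length


def douglas_peucker(points, d=3.0):
    """Simplify one track; d is the distance threshold in meters."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    # Find the interior point C farthest from the chord AB.
    idx, dmax = 0, -1.0
    for i in range(1, len(points) - 1):
        dist = perpendicular_distance(points[i], a, b)
        if dist > dmax:
            idx, dmax = i, dist
    if dmax <= d:
        return [a, b]  # the chord approximates this piece of the track
    # Otherwise split at C and simplify AC and CB separately.
    left = douglas_peucker(points[: idx + 1], d)
    right = douglas_peucker(points[idx:], d)
    return left[:-1] + right  # C appears in both halves, keep it once


track = [(0, 0), (50, 1), (100, 0), (100, 50), (100, 100)]
print(douglas_peucker(track, d=3.0))
```

Step 4) then amounts to applying `douglas_peucker` to every track in the data set. For very long tracks an iterative version with an explicit stack avoids Python's recursion limit.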

Step 3, to improve the efficiency of detecting steering sampling points in large-scale track data, a parallel computing mode is adopted, that is, multi-core parallel computing is used to make full use of multi-core CPU resources. First, according to the number of CPU cores of the computer (for example, k cores), the track data is divided in order into k subsets, and each subset is assigned to a CPU core, which detects and calculates the steering sampling points in its track data. The steering sampling points are determined by calculating the steering change between two adjacent sampling points in each track, as follows:

let the simplified trajectory data set from step 2 be $S = \{T_1, T_2, \ldots, T_n\}$, where $T_j$ denotes the $j$-th vehicle track and $n$ is the total number of input tracks. A track is $T_j = \{P_1, P_2, \ldots, P_m\}$, where $P_i$ denotes the $i$-th sampling point of track $T_j$ and $m$ is the total number of sampling points in $T_j$; $P_i$ is a five-tuple $P_i = (x_i, y_i, t_i, o_i, v_i)$, where $x_i$ and $y_i$ are the X and Y coordinates of $P_i$ (unit: meter), $t_i$ is the sampling time (unit: second), $o_i$ is the vehicle heading recorded at $P_i$ (measured from north, unit: degree), and $v_i$ is the vehicle travel speed recorded at $P_i$ (unit: km/h);

1) First, the steering change of each sampling point is calculated. For a track $T_j = \{P_1, P_2, \ldots, P_m\}$ in $S$, the difference between the vehicle headings at two adjacent sampling points of $T_j$ (for example $P_i$ and $P_{i+1}$) is calculated and recorded as the steering change $\Delta\theta(P_i)$ of sampling point $P_i$; the specific calculation formula is as follows:

2) Second, the steering sampling points are identified as follows: a class label attribute (default value 0) is added to each sampling point; if the steering change $\Delta\theta(P_i)$ of sampling point $P_i$ is greater than the preset angle threshold $\omega$, the class label of $P_i$ is set to 1 (a candidate point of a road intersection area); otherwise it is set to 0 (any other point). The steering change is calculated for the sampling points of all tracks in $S$, and the sampling points with class label 1 form the candidate point set of road intersection areas.

3) Finally, all steering sampling points detected by the distributed multi-core CPU computation are combined into one set of sampling points.

Wherein, the angle threshold ω preset in step 3 is 45 degrees.
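The labeling and merging logic of step 3 can be sketched as follows. The exact formula for $\Delta\theta(P_i)$ is not reproduced in this text, so the sketch assumes the absolute heading difference folded into [0, 180] degrees, which treats a change from 350° to 10° as 20° rather than 340°. The function names are hypothetical, and the subsets are processed in a plain loop here; in practice each subset would be handed to a separate worker process.

```python
def steering_change(o_i, o_next):
    """Heading difference in degrees, folded into [0, 180] (an assumption)."""
    diff = abs(o_next - o_i) % 360.0
    return 360.0 - diff if diff > 180.0 else diff


def label_track(track, omega=45.0):
    """Class labels for one track of (x, y, t, o, v) tuples: 1 = steering point."""
    labels = [0] * len(track)  # default label 0
    for i in range(len(track) - 1):
        if steering_change(track[i][3], track[i + 1][3]) > omega:
            labels[i] = 1
    return labels


def split_into_subsets(tracks, k):
    """Divide the track list, in order, into at most k contiguous subsets."""
    size = max(1, -(-len(tracks) // k))  # ceiling division, at least 1
    return [tracks[i:i + size] for i in range(0, len(tracks), size)]


def collect_steering_points(tracks, k=4, omega=45.0):
    """Label every subset and merge the label-1 points into one set (a list)."""
    steering_points = []
    for subset in split_into_subsets(tracks, k):  # one subset per CPU core
        for track in subset:
            labels = label_track(track, omega)
            steering_points.extend(p for p, lab in zip(track, labels) if lab == 1)
    return steering_points


# One hypothetical track: headings 0, 5, 95, 100 degrees.
demo = [(0.0, 0.0, 0, 0.0, 30.0), (0.0, 50.0, 5, 5.0, 30.0),
        (10.0, 60.0, 10, 95.0, 20.0), (60.0, 62.0, 15, 100.0, 30.0)]
print(label_track(demo))
```

Because each track is labeled independently, the subsets share no state, which is what makes the per-core split straightforward.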

Step 4, taking the steering sampling point set calculated in step 3 as input, the steering sampling points are divided into different clusters according to spatial proximity with an Adaptive Statistical Clustering method based on Hotspot Detection (ASCHD). The ASCHD algorithm is implemented by the following steps:

let the steering sampling point set obtained in step 3 be $D = \{P_1, P_2, \ldots, P_K\}$, where $P_i$ denotes the $i$-th sampling point and $K$ is the total number of steering sampling points obtained in step 3;

1) A Voronoi diagram is constructed from the spatial positions of the sampling points in $D$, the area of the Voronoi cell corresponding to each sampling point is calculated, and this area is taken as the non-spatial attribute value of the sampling point. If the Voronoi cells of two sampling points share a common edge, the two sampling points are defined as a spatially neighboring point pair; otherwise they are not spatial neighbors. Then, a G index is calculated for each sampling point, and the specific calculation formula is as follows:

in the formula, $z_j$ is the attribute value (Voronoi cell area) of sampling point $P_j$; the formula also uses the mean of the Voronoi cell areas of all sampling points; $\sigma$ is the standard deviation of the Voronoi cell areas of all sampling points, $w_{i,j}$ is the spatial weight between sampling points $P_i$ and $P_j$ ($w_{i,j} = 1$ if $P_i$ and $P_j$ are a spatially neighboring point pair, otherwise $w_{i,j} = 0$), and $K$ is the total number of sampling points in the steering sampling point set;

2) After the G index value of each sampling point in $D$ has been calculated, the sampling points with a G index greater than 0 are taken as clustering seed points. Each seed point (for example $P_i$) is marked as a new cluster $C_i = \{P_i\}$, and a sampling point (for example $P_k$) is searched for that is a spatial neighbor of the cluster and is also a seed point; if the following statistic $\lambda$ increases after $C_i$ and $P_k$ are merged, they are merged, that is, $C_i = \{P_i, P_k\}$; otherwise the search continues with other neighboring seed points to judge whether $\lambda$ increases after merging. The statistic $\lambda$ is calculated as follows:

where $C_i$ denotes a set (cluster) of sampling points, $P_k$ is a sampling point in $C_i$, $z_k$ is the attribute value (Voronoi cell area) of $P_k$, the mean is taken over the Voronoi cell areas of all sampling points in $D$, $\sigma$ is the standard deviation of the Voronoi cell areas of all sampling points in $D$, and $u$ is the number of sampling points in the set (cluster) $C_i$;

3) The search and merge process is iterated until no neighboring seed point can be merged any further; the seed points that have been clustered and merged are removed from the seed point set, one seed point is selected from the remaining unmerged seed points, and the process of searching neighboring seed points and merging is repeated to generate a new cluster; the clustering process stops when all seed points have been merged or visited, and the resulting clusters of sampling points $C = \{C_1, C_2, \ldots, C_R\}$ are returned, where $C_k$ is the $k$-th cluster and $R$ is the number of clusters produced by the clustering algorithm.
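The seed-growing loop of steps 2) and 3) can be sketched independently of the two formulas, which are not reproduced in this text. In the sketch below, the seed points (G index greater than 0) and the Voronoi adjacency are assumed to be already computed, and the statistic λ is passed in as a callable. The names `grow_clusters`, `neighbors`, `lam` and `toy_lam` are hypothetical, and `toy_lam` is only a placeholder for demonstration, not the invention's statistic.

```python
import math


def grow_clusters(seeds, neighbors, lam):
    """Greedy seed growing.

    seeds     -- list of seed point ids (points whose G index is > 0)
    neighbors -- dict: point id -> set of Voronoi-adjacent point ids
    lam       -- callable taking a list of point ids and returning the statistic
    """
    unassigned = set(seeds)
    clusters = []
    for seed in seeds:
        if seed not in unassigned:
            continue  # already merged into an earlier cluster
        unassigned.discard(seed)
        cluster = [seed]
        merged = True
        while merged:  # iterate until no neighboring seed can be merged
            merged = False
            frontier = sorted({q for p in cluster
                               for q in neighbors.get(p, ())
                               if q in unassigned})
            for q in frontier:
                # Merge only if the statistic increases.
                if lam(cluster + [q]) > lam(cluster):
                    cluster.append(q)
                    unassigned.discard(q)
                    merged = True
        clusters.append(cluster)
    return clusters


def toy_lam(cluster):
    """Placeholder statistic for the demo only (NOT the patent's formula)."""
    return len(cluster) / math.sqrt(len(cluster))


adjacency = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3}}
print(grow_clusters([0, 1, 2, 3, 4], adjacency, toy_lam))
```

A seed that is rejected by one cluster stays in the unassigned set, so it can still start or join a later cluster; this is one reading of "merged or visited" and may differ from the original implementation.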

And 5, taking the clusters obtained in the step 4 as input, calculating the minimum circumcircle of each cluster by a minimum circle fitting algorithm aiming at each cluster, and returning the central position of the minimum circumcircle of each cluster and the radius value of the circumcircle. The minimum circle fitting algorithm is realized by the following steps:

Let the clusters obtained in step 4 be C = {C₁, C₂, …, C_R}, where C_k is the k-th cluster and R is the number of clusters obtained by the clustering algorithm. C_k = {P₁, P₂, …, P_M}, where P_i is the i-th sampling point in cluster C_k and M is the number of sampling points in cluster C_k. Let x_i and y_i denote the X coordinate and the Y coordinate of sampling point P_i, respectively (unit: meters).

1) The center (x_c^k, y_c^k) of the minimum circumscribed circle of cluster C_k is calculated as follows:

2) The radius r^k of the minimum circumscribed circle of cluster C_k is calculated as follows:

3) For each cluster C_k in C (1 ≤ k ≤ R), the center (x_c^k, y_c^k) and the radius r^k of the minimum circumscribed circle are calculated according to steps 1) and 2). Then (x_c^k, y_c^k) is the estimate of the center position of the k-th road intersection, and r^k is the radius of the area range of the k-th road intersection.
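As an illustration of step 5, the sketch below computes a minimum enclosing circle for one cluster of projected points. It does not reproduce the formulas of steps 1) and 2), which are not available in this text. Instead it uses a brute-force search over circles defined by two or three points, which is a swapped-in technique chosen for brevity. The function name `min_enclosing_circle` and the tolerance `eps` are assumptions made for this example.

```python
import math
from itertools import combinations


def _circle_two(a, b):
    # Circle whose diameter is the segment a-b.
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, math.dist(a, b) / 2)


def _circle_three(a, b, c):
    # Circumscribed circle of triangle a-b-c, or None if the points are collinear.
    d = 2 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if abs(d) < 1e-12:
        return None
    sa, sb, sc = (p[0] ** 2 + p[1] ** 2 for p in (a, b, c))
    ux = (sa * (b[1] - c[1]) + sb * (c[1] - a[1]) + sc * (a[1] - b[1])) / d
    uy = (sa * (c[0] - b[0]) + sb * (a[0] - c[0]) + sc * (b[0] - a[0])) / d
    return (ux, uy, math.dist((ux, uy), a))


def min_enclosing_circle(points, eps=1e-7):
    """Return (x_c, y_c, r) of the smallest circle covering all points (meters)."""
    pts = list({(float(x), float(y)) for x, y in points})
    if not pts:
        raise ValueError("cluster is empty")
    ox, oy = pts[0]
    if len(pts) == 1:
        return (ox, oy, 0.0)
    # Shift to a local origin so large projected coordinates lose less precision.
    local = [(x - ox, y - oy) for x, y in pts]
    candidates = [_circle_two(a, b) for a, b in combinations(local, 2)]
    for a, b, c in combinations(local, 3):
        circle = _circle_three(a, b, c)
        if circle is not None:
            candidates.append(circle)
    best = None
    for cand in candidates:
        if all(math.dist(cand[:2], p) <= cand[2] + eps for p in local):
            if best is None or cand[2] < best[2]:
                best = cand
    return (best[0] + ox, best[1] + oy, best[2])


# Hypothetical cluster of four sampling points.
center_x, center_y, radius = min_enclosing_circle(
    [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0), (1.0, 0.5)]
)
```

The brute-force search grows roughly with the fourth power of the cluster size, so it is only suitable for small clusters; an expected linear-time method such as Welzl's algorithm would be the usual choice for larger ones.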

Claims

1. A method for identifying the positions of road intersections in track data based on statistical clustering is characterized by comprising the following steps:

step 3, calculating, in a parallel computing mode, the steering change between the preceding and following sampling points on each vehicle track; detecting the sampling points with a large steering change by setting a steering change threshold ω for the sampling points, and marking them as candidate points of road intersection regions; removing the sampling points on the track that are not near an intersection and retaining the sampling points with a large steering change; and combining the sampling points with a large steering change detected in all track data to form a steering sampling point set, wherein the preset steering change threshold ω is 45 degrees;

the step 4 further comprises: clustering the steering sampling point set obtained in step 3 by using the adaptive statistical clustering algorithm ASCHD so as to separate the road intersections at different positions, implemented as follows,

then, after the G index value of each sampling point in D has been calculated, the sampling points whose G index is greater than 0 are detected and taken as clustering seed points; each seed point P_i is marked as a new cluster C_i = {P_i}, and a sampling point P_k that is spatially adjacent to it and is also a seed point is searched for; if the following statistic λ increases after C_i and P_k are merged, C_i and P_k are merged, i.e., C_i = {P_i, P_k}; otherwise they are not merged, and the search continues over the adjacent seed points to judge whether the statistic λ increases after merging, wherein the statistic λ is calculated as follows:

finally, the searching and merging process is iterated until no adjacent seed point that can still be merged is found; the seed points that have been clustered and merged are removed from the seed point set, one seed point is selected from the set of unmerged seed points, and the search for adjacent seed points and the clustering and merging are repeated to generate a new cluster; the clustering process stops when all seed points have been merged or visited, and the generated sampling point clusters C = {C₁, C₂, …, C_R} are returned, where C_k is the k-th cluster and R is the number of clusters obtained by the clustering algorithm;

2. The method for identifying positions of intersections in track data based on statistical clustering according to claim 1, wherein: in the step 1, the coordinate system of the track data is converted by using QGIS software, wherein, when the projection parameters of the QGIS coordinate conversion are set, the plane rectangular projection coordinate system uses the Gauss-Krüger projection with 6-degree zones as the projection model, the central meridian of the projection coordinate system is that of the 6-degree zone covering the most original track points, and the projected track data are obtained through calculation by the software.

3. The method for identifying positions of intersections in track data based on statistical clustering according to claim 1, wherein the step 2 further comprises: simplifying the track data by using the Douglas-Peucker algorithm, wherein the distance threshold parameter d of the Douglas-Peucker algorithm is 3 meters.
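For illustration only, the simplification named in claim 3 can be sketched as a standard recursive Douglas-Peucker routine with d = 3 meters. The sketch assumes that a track is a list of projected (x, y) tuples in meters and measures the distance from a point to the chord as a point-to-segment distance; the function names are hypothetical and are not taken from the patent.

```python
import math


def point_segment_distance(p, a, b):
    """Distance in meters from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    if dx == 0 and dy == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Clamp the projection parameter so the foot point stays on the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def douglas_peucker(points, d=3.0):
    """Simplify a track, keeping points farther than d from the current chord."""
    if len(points) < 3:
        return list(points)
    index, d_max = 0, 0.0
    for i in range(1, len(points) - 1):
        dist = point_segment_distance(points[i], points[0], points[-1])
        if dist > d_max:
            index, d_max = i, dist
    if d_max <= d:
        return [points[0], points[-1]]
    left = douglas_peucker(points[: index + 1], d)
    right = douglas_peucker(points[index:], d)
    return left[:-1] + right  # drop the duplicated split point


# Hypothetical track: the 1 m wobble at (10, 1) is removed, the turn is kept.
track = [(0, 0), (10, 1), (20, 0), (30, 10), (40, 0)]
simplified = douglas_peucker(track, d=3.0)
```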

4. The method for identifying positions of intersections in track data based on statistical clustering according to claim 1, wherein the step 3 further comprises: the candidate points of the intersection area are identified by calculating the steering variation of the track sampling points, the implementation mode is as follows,

for a track T_j = {P₁, P₂, …, P_m} in S, calculating the difference between the vehicle heading directions at two adjacent sampling points P_i and P_{i+1} in T_j, and recording it as the steering variation Δθ(P_i) of sampling point P_i, wherein the specific calculation formula is as follows:

adding a class label attribute to each sampling point: if the steering variation Δθ(P_i) of sampling point P_i is greater than the preset angle threshold ω, the class label of sampling point P_i is marked as 1; otherwise the class label of sampling point P_i is marked as 0; the steering variation of the sampling points of all tracks in S is calculated, and the sampling points whose class label is 1 form the candidate point set of road intersection areas.
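The labeling in claim 4 can be illustrated with the sketch below. Because the formula for Δθ(P_i) is not reproduced in this text, the sketch makes its own assumption: the heading at a point is taken from the direction of the track segments meeting at that point, and the steering variation is the absolute heading difference wrapped into the range 0 to 180 degrees. The threshold ω = 45 degrees follows claim 1; the function names are hypothetical.

```python
import math


def heading_deg(a, b):
    """Direction of travel from a to b, in degrees (assumed heading definition)."""
    return math.degrees(math.atan2(b[1] - a[1], b[0] - a[0]))


def steering_change(prev_pt, pt, next_pt):
    """Absolute heading difference at pt, wrapped into [0, 180] degrees."""
    diff = abs(heading_deg(pt, next_pt) - heading_deg(prev_pt, pt)) % 360.0
    return min(diff, 360.0 - diff)


def label_track(track, omega=45.0):
    """Return a 0/1 class label per sampling point; 1 marks a candidate point."""
    labels = [0] * len(track)
    # The first and last points have no pair of segments, so they stay 0.
    for i in range(1, len(track) - 1):
        if steering_change(track[i - 1], track[i], track[i + 1]) > omega:
            labels[i] = 1
    return labels


# Hypothetical track: straight east, then a 90-degree turn north at (20, 0).
labels = label_track([(0, 0), (10, 0), (20, 0), (20, 10), (20, 20)])
```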

5. The method for identifying positions of intersections in track data based on statistical clustering according to claim 1 or 4, wherein the step 3 further comprises: firstly, according to the number k of CPU cores of the computer, dividing the track data in sequence into k subsets, and allocating the subsets respectively to the corresponding CPU cores to detect and calculate the steering sampling points in each piece of track data.
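One possible way to realize the division in claim 5 is sketched below, assuming the tracks are held in a Python list and that contiguous, near-equal subsets are an acceptable reading of "in sequence". The helper names `split_in_sequence` and `run_parallel` are hypothetical; `ProcessPoolExecutor` and `os.cpu_count` are standard-library calls.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def split_in_sequence(tracks, k):
    """Split tracks into k contiguous subsets whose sizes differ by at most one."""
    base, extra = divmod(len(tracks), k)
    subsets, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        subsets.append(tracks[start:start + size])
        start += size
    return subsets


def run_parallel(tracks, worker, k=None):
    """Apply worker to each subset in its own process; results keep subset order."""
    k = k or os.cpu_count() or 1
    subsets = split_in_sequence(tracks, k)
    with ProcessPoolExecutor(max_workers=k) as pool:
        return list(pool.map(worker, subsets))


# Ten hypothetical track identifiers divided among three cores.
parts = split_in_sequence(list(range(10)), 3)
```

The `worker` passed to `run_parallel` must be a module-level function so that it can be sent to the worker processes, and on platforms that spawn new processes the call should sit under an `if __name__ == "__main__":` guard.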

6. The method for identifying positions of intersections in track data based on statistical clustering according to claim 1, wherein the step 5 further comprises: calculating the estimates of the intersection center position and of the area range radius by using a minimum circumscribed circle fitting algorithm, implemented as follows,

letting the clusters obtained in step 4 be C = {C₁, C₂, …, C_R}, where C_k is the k-th cluster and R is the number of clusters obtained by the clustering algorithm; C_k = {P₁, P₂, …, P_M}, where P_i is the i-th sampling point in cluster C_k and M is the number of sampling points in cluster C_k; letting x_i and y_i denote the X coordinate and the Y coordinate of sampling point P_i, respectively (unit: meters); the center (x_c^k, y_c^k) of the minimum circumscribed circle of cluster C_k is calculated as follows: