CN114353810A

CN114353810A - HMM efficient map matching method based on R tree and track segmentation

Info

Publication number: CN114353810A
Application number: CN202210023217.9A
Authority: CN
Inventors: 宋縯蛟; 芮小平
Original assignee: Hohai University HHU
Current assignee: Hohai University HHU
Priority date: 2022-01-10
Filing date: 2022-01-10
Publication date: 2022-04-15
Anticipated expiration: 2042-01-10
Also published as: CN114353810B

Abstract

The invention discloses an efficient HMM map matching method based on an R-tree and track segmentation. First, an R-tree spatial index is built for the road network. The GPS track data is then segmented according to the position change rate of the track points, and the candidate road sections of each sub-track are determined quickly using the R-tree index. Key points are selected from each sub-track to stand in for the whole sub-track when determining the road sections it belongs to, and the map matching of each sub-track is completed according to that result. The advantage of the invention is that the workload of both road search and track point traversal is reduced, which greatly improves the efficiency of the algorithm.

Description

HMM efficient map matching method based on R tree and track segmentation

Technical Field

The invention relates to the technical field of map matching, in particular to an HMM efficient map matching method based on an R tree and track segmentation.

Background

GPS track data is sequence data carrying temporal and spatial information, and has the advantages of low acquisition cost, wide coverage, and temporal characteristics. It can support research on the activity patterns of individuals at the micro level as well as research on urban spatial structure at the macro level, and it has become an important data source for urban geographic big data research^[1-4]. Because GPS data contains errors, uncalibrated track data often deviates from the actual road. In practical applications, the GPS data must first be map matched to obtain a track description combined with the road network^[5]. Map matching of GPS track data is one of the fundamental tasks that precede further mining and analysis by researchers^[6-9].

Traditional map matching algorithms match track points to the road network only according to geometric and topological relations^[10-11]. This approach considers only the spatial positions of the GPS points and the roads, and ignores information such as road connectivity and bidirectionality, so the results may contain large errors. In 2009, Newson and Krumm^[12] proposed a map matching algorithm based on the Hidden Markov Model (HMM). It takes the driving history of the vehicle into account and introduces observation probability and transition probability from the perspective of probability theory, which largely solves the insufficient accuracy of purely geometric matching. It has become the mainstream method for GPS track map matching, and much current work on map matching aims to improve the accuracy and efficiency of the HMM map matching algorithm.

There are various ways to improve the accuracy of the HMM map matching algorithm. Most of them optimize the probability calculation in the Viterbi algorithm. For example, Chen et al.^[13] used dynamic programming to turn the emission and transition probabilities of single track points into a joint probability over road sections. The traditional HMM map matching algorithm does not consider the effect of driving direction on the model, so Liu et al.^[14] and Wang et al.^[15] proposed introducing the driving direction (heading angle) into the transition probability calculation. Other scholars process the track data before or after matching. For example, Zhou et al.^[16], considering that real traffic conditions are relatively complex, select error-prone points before matching and annotate them manually, then run HMM map matching on the remaining points as usual; Wu et al.^[17] post-process the HMM map matching result with a genetic algorithm to further improve the accuracy of the matched road sections.

In practical application scenarios, especially online map applications, map matching of track data must be timely, so many scholars focus on improving the efficiency of HMM map matching. For example, Huang et al.^[18] proposed identifying frequent patterns from historical track data; Liu et al.^[14] and Xie et al.^[19] segment the track data to reduce the number of transition probability calculations. Other scholars optimize the speed of road search; for example, Wang et al.^[15] built a hash index for the road network data, which improves map matching efficiency to some extent. Another solution is to thin the track points, that is, to sample them (for example, keeping one track point every 50 m) and match only the thinned sample points. This significantly improves the matching speed, but it does not consider the correlation between the attributes of the track points (speed, direction, etc.) and the road sections on which they lie.

As described above, many scholars have improved the HMM map matching algorithm to meet the demand for high efficiency. Existing methods either simplify the traversal of the GPS track data to reduce the number of loop iterations over track points, or build an index for the road network data to speed up the search. In fact, the time complexity of a map matching algorithm comes mainly from two sources: searching the road network data and evaluating the track data point by point. Improving both together can greatly increase the efficiency of the algorithm. The invention therefore provides an efficient HMM map matching method based on an R-tree and track segmentation, which builds an R-tree index for the road network data and segments the GPS track data, so that efficiency is improved from both sides at once.

References

[1] Xi Liu, Gong Li, Gong Yongxi, et al. Revealing travel patterns and city structure with taxi trip data [J]. Journal of Transport Geography, 2015, 43: 78-90. ISSN: 0966-6923;

[2] Yang Zhou, Fang Zhixiang, Thill Jean-Claude, et al. Functionally critical locations in an urban transportation network: Identification and space–time analysis using taxi trajectories [J]. Computers, Environment and Urban Systems, 2015, 52: 34-47. ISSN: 0198-9715;

[3] Zhu D, Liu Y. An Incremental Map-Matching Method Based on Road Network Topology [J]. Geomatics and Information Science of Wuhan University, 2017, 42(01): 77-83;

[4] wutao, Tongjust, Gong Jianya road network update trajectory-Map matching method [ J ] Messaging, 2017,46(04): 507-;

[5] W.C, J.Z, W.M, et al. Map Matching Algorithm According to Pseudo-Zernike Moments [C]. 2010 Second World Congress on Software Engineering, Hubei, China, 2010;

[6] Yu Zheng. Trajectory Data Mining: An Overview [J]. ACM Trans. Intell. Syst. Technol., 2015, 6(4): 29. ISSN: 2157-6904;

[7] Lu J P, Luo Y T, Huang Z S, et al. An information fusion map matching method based on ranking learning [J]. Journal of Zhejiang University (Science Edition), 2020, 47(01): 27-35;

[8] Ming Ren, Karimi Hassan-A. A fuzzy logic map matching for wheelchair navigation [J]. GPS Solutions, 2012, 16(4): 273-282. ISSN: 1521-1886;

[9] Hu G., J. Shao, F. Liu, et al. IF-Matching: Towards Accurate Map-Matching with Information Fusion [J]. IEEE Transactions on Knowledge and Data Engineering, 2017, 29(1): 114-127. ISSN: 2375-026X;

[10] Yoonsik Bang, Kim Jiyoung, Yu Kiyun. An Improved Map-Matching Technique Based on the Fréchet Distance Approach for Pedestrian Navigation Services. 2016;

[11] Ling Yuan, Li Dan, Hu Song. A map-matching algorithm with low-frequency floating car data based on matching path [J]. EURASIP Journal on Wireless Communications and Networking, 2018, 2018(1): 146. ISSN: 1687-1499;

[12] Paul Newson, Krumm John. Hidden Markov map matching through noise and sparseness [A]//Seattle, Washington: Association for Computing Machinery, 2009: 336-343;

[13] Chen H, Xu C H, Zhang X P, et al. Hidden Markov Model and Dynamic Programming Based Map Matching Method for Mobile Trajectories Using Mobile Phone Data [J]. Geography and Geo-Information Science, 2019, 35(03): 1-8;

[14] liu 261073, Prunus mume, Xudaoyu, etc. A Map Matching Algorithm [ J ] Based on HMM Model improvement, university of Beijing bulletin (Nature science edition), 2018,54(06): 1235-;

[15] Wang X M, Chi T H, Lin H, et al. A Research of Map-Matching Method for Massive Floating Car Data [J]. Journal of Geo-Information Science, 2015, 17(10): 1143-1151;

[16] Xibo Zhou, Ding Ye, Tan Haoyu, et al. HIMM: An HMM-Based Interactive Map-Matching System [A]//Candan S, Chen L, Pedersen T B, et al. Cham: Springer International Publishing, 2017: 3-18;

[17] Wu G, Qiu Y J, Wang G R. Map Matching Algorithm Based on Hidden Markov Model and Genetic Algorithm [J]. Journal of Northeastern University (Natural Science), 2017, 38(04): 472-;

[18] Huang Y., W. Rao, Z. Zhang, et al. Frequent Pattern-Based Map-Matching on Low Sampling Rate Trajectories [A]//2018: 266-273. ISSN: 2375-0324;

[19] Yan Xie, Zhou Kai, Miao Fang, et al. High-Accuracy Off-Line Map-Matching of Trajectory Network Division Based on Weight Adaptation HMM [J]. IEEE Access, 2020, 8: 7256-7266. ISSN: 2169-3536;

[20] Wu Jun. The Beauty of Mathematics [M]. Beijing: Posts & Telecom Press, 2012: 49-58;

[21] Antonin Guttman. R-Trees: A Dynamic Index Structure for Spatial Searching [J]. SIGMOD Rec., 1984, 14(2): 47-57.

Disclosure of Invention

Traditional HMM map matching algorithms cannot process large volumes of track data efficiently. To address this problem, the invention provides an efficient HMM map matching method based on an R-tree and track segmentation. The method builds an R-tree index on the road network data and segments the GPS track data, so that efficiency is improved from both sides.

To achieve this purpose, the invention adopts the following technical scheme:

An efficient HMM map matching method based on an R-tree and track segmentation comprises the following steps:

S1: establish an R-tree for the road network data;

S2: divide the GPS track data into sub-tracks;

S3: traverse all sub-tracks; for each sub-track, divide its track points equally into four parts, keep all points in the first and fourth parts, sample one point out of every two in the middle two parts, and arrange the retained points in order to generate a sample of the whole sub-track;

S4: calculate the best matching path for this sample using the Viterbi algorithm;

S5: take the whole backtracked path as the matching path of the sub-track, and match each point in the sub-track to the nearest point on that path to complete the matching.

Further, S2 processes the GPS data point by point according to the real-time position change rate of the track data, and decides whether a point belongs to the current sub-track by comparing the position change rate between points with the average change rate. The position change rate between two points is split into a longitude change rate and a latitude change rate. Each is computed by taking the absolute value of the difference between the longitude (or latitude) of the point and that of the previous point, and dividing it by the absolute value of the longitude (or latitude) of the previous point. With $lon_i$ and $lat_i$ denoting the longitude and latitude of point i, this is expressed as:

$$r(i)_{lon} = \frac{|lon_i - lon_{i-1}|}{|lon_{i-1}|}, \qquad r(i)_{lat} = \frac{|lat_i - lat_{i-1}|}{|lat_{i-1}|} \tag{1}$$

The specific process is as follows:

S21: create a new sub-track and make it the current sub-track; starting from the first point (or from the current point), add the first two track points to it, and set the third track point as the current point;

S22: traverse the GPS track points starting from the current point i;

S23: calculate the position change rate of the current point and the average position change rate of the current sub-track. The average position change rate is formulated as:

S24: compare the position change rate of the current point with the average position change rate of the current sub-track. When point i satisfies $r(i)_{lon} \le 1.5R(n)_{lon}$ and $r(i)_{lat} \le 1.5R(n)_{lat}$, its position change rate differs little from the average of the current sub-track, so the point is considered to belong to the current sub-track and the process goes to S25; otherwise, the current sub-track is ended and the process returns to S21;

S25: add the current point to the current sub-track, take the next point as the current point, and return to S22.

Further, the specific steps of S4 are as follows:

S41: starting from the first sample point, search for candidate road sections using the R-tree index; taking into account both the distance between the track point and each road section and the difference in heading direction, calculate the probability $P(X_{1j})$ of selecting each road section by formula (3), as follows:

where $d_j$ is the distance between the track point and the j-th road section, taken as the length of the perpendicular from the track point to the road; $\theta_j$ is the angle between the heading direction of the track point and the direction of the j-th road section, taken as a positive value.

S42: starting from the second sample point, calculate the observation probability $P(X_{ij})$ of each candidate road section, up to the last sample point, by the formula

$$P(X_{ij}) = P(X_{i-1}) \cdot P(X_i \mid X_{i-1}) \cdot P(X_{ij} \mid X_i) \tag{4}$$

where $P(X_{ij})$ is the probability of selecting the current road section; $P(X_{i-1})$ is the probability of the previous road section; $P(X_i \mid X_{i-1})$ is the transition probability between the two states; and $P(X_{ij} \mid X_i)$ is the probability of selecting the current road section in the current state.

S43: find the candidate road section with the maximum observation probability in the last state, and backtrack from it to obtain the candidate road section of each state, back to the first sample point.

Compared with the prior art, the invention has the following advantages:

Its efficiency is optimized, which gives it an advantage in timeliness. The efficiency gain is pronounced when the volume of GPS track data is large, so the method suits application scenarios with large volumes of track data.

Drawings

FIG. 1 is a schematic diagram of space division for a road network according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of an R-tree established according to an embodiment of the present invention;

FIG. 3 is a schematic view of a location change rate zone according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of GPS trajectory data segment matching according to an embodiment of the present invention;

FIG. 5 is a schematic illustration of a candidate road segment according to an embodiment of the invention;

FIG. 6 is a schematic diagram of the matching effect of some trace points according to the embodiment of the present invention;

FIG. 7 is a schematic diagram of the efficiency analysis of three matching algorithms according to the embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings by way of examples.

S1: establish an R-tree for the road network data;

An R-tree is a balanced tree built on the idea of space partitioning. It divides a multidimensional space into several levels using Minimum Bounding Rectangles (MBRs), which greatly reduces the number of nodes visited during a spatial search. When an R-tree is constructed, each leaf node stores pointers to the MBRs of individual features in the space; the second-to-last level also stores pointers to MBRs, each of which contains the MBRs of one or more features; and so on, until an index over the whole space is established. In HMM-based map matching algorithms, each state may have a large number of observations, i.e. possible matching road sections. During matching, the first task when traversing each GPS track point is to screen out the candidate matching road sections, i.e. all nodes of the current state. Road networks are generally complex and contain many road sections, so this search step takes a lot of time. If an R-tree index is built before matching with each road section as the smallest feature unit, the search time for candidate road sections can be greatly shortened and the efficiency of the algorithm improved.

Fig. 1 shows how an example road network is partitioned. The top-level space is divided into R1 and R2; R3 to R6 form the second level; the rest form the third level, i.e. the MBRs of the individual road sections. The data structure of the R-tree is shown in Fig. 2.

During map matching, the large rectangle containing a GPS track point (i.e. the MBR of a non-leaf node) can be found quickly from the point's geographic coordinates, and all road sections inside that rectangle are then retrieved through the pointers stored in the R-tree and used as candidate road sections. In Fig. 1, A, B, and C are three data points to be matched, representing the three situations that arise when searching for road sections to match. Point A lies in rectangle R8, the MBR of a leaf node; its parent rectangle R3 is found through the R-tree, and the two road sections in R3 are listed as the candidate road sections of the current state. Point B lies in R3, so the road sections corresponding to R7 and R8 are listed as candidates directly. Point C lies in R1, so the road sections corresponding to R7, R8, R11, and R12 are all listed as candidate road sections.
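The three lookup cases for points A, B, and C can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: it uses a hand-built hierarchy of bounding rectangles rather than a real, self-balancing R-tree, and the names `Node`, `leaf`, `inner`, `segments_below`, and `candidate_segments`, as well as all rectangle coordinates and the grouping of R11/R12 under a second child of R1, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


@dataclass
class Node:
    mbr: Rect
    children: List["Node"] = field(default_factory=list)
    segment_id: Optional[int] = None  # set on leaf nodes only


def contains(rect: Rect, x: float, y: float) -> bool:
    min_x, min_y, max_x, max_y = rect
    return min_x <= x <= max_x and min_y <= y <= max_y


def leaf(segment_id: int, mbr: Rect) -> Node:
    return Node(mbr=mbr, segment_id=segment_id)


def inner(children: List[Node]) -> Node:
    """Build a non-leaf node whose MBR is the union of its children's MBRs."""
    mbr = (min(c.mbr[0] for c in children), min(c.mbr[1] for c in children),
           max(c.mbr[2] for c in children), max(c.mbr[3] for c in children))
    return Node(mbr=mbr, children=children)


def segments_below(node: Node) -> List[int]:
    """Collect the road-section ids of every leaf under this node."""
    if node.segment_id is not None:
        return [node.segment_id]
    found: List[int] = []
    for child in node.children:
        found.extend(segments_below(child))
    return found


def candidate_segments(node: Node, x: float, y: float) -> List[int]:
    """Return all road sections under the deepest non-leaf MBR containing (x, y)."""
    if not contains(node.mbr, x, y):
        return []
    hits = [c for c in node.children if c.children and contains(c.mbr, x, y)]
    if not hits:
        return sorted(segments_below(node))
    found: List[int] = []
    for child in hits:
        found.extend(candidate_segments(child, x, y))
    return sorted(set(found))


# Hypothetical layout loosely following Fig. 1 (coordinates are made up).
r3 = inner([leaf(7, (0, 0, 2, 1)), leaf(8, (1, 1, 3, 3))])
r_other = inner([leaf(11, (5, 0, 7, 1)), leaf(12, (6, 1, 8, 3))])
r1 = inner([r3, r_other])

print(candidate_segments(r1, 2, 2))    # point A: inside leaf R8, so R3's sections
print(candidate_segments(r1, 0.5, 2))  # point B: inside R3 but no leaf
print(candidate_segments(r1, 4, 1))    # point C: inside R1 only
```

In this sketch a point that falls in a leaf rectangle still receives every road section of the enclosing non-leaf rectangle, which mirrors the behaviour described for point A.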

S2: divide the GPS track data into sub-tracks;

Track points are acquired continuously by a GPS receiver. On the same road section, adjacent track points are strongly similar in driving direction, driving speed, and other attributes; on different road sections, the attributes of adjacent track points change noticeably. If adjacent track points with similar attributes are grouped into sub-tracks and each sub-track is thinned internally, the resulting sample can stand in for the sub-track as faithfully as possible. The candidate road sections found for the thinned sample are then used as the candidate road sections of the whole sub-track, which greatly improves search efficiency. Compared with simple thinning, this represents the whole track more accurately.

Sub-tracks are usually divided by comparing the similarity of adjacent track points, and points whose similarity exceeds a specified threshold are regarded as belonging to the same sub-track. The similarity calculation is strongly affected by the positioning error of the track points, so sub-track division should also serve a data cleansing function. The GPS data is processed point by point according to the real-time position change rate of the track data, and whether a point belongs to the current sub-track is decided by comparing the position change rate between points with the average change rate. The position change rate between two points is split into a longitude change rate and a latitude change rate. Each is computed by taking the absolute value of the difference between the longitude (latitude) of the point and that of the previous point, and dividing it by the absolute value of the longitude (latitude) of the previous point. With $lon_i$ and $lat_i$ denoting the longitude and latitude of point i, this is expressed as:

$$r(i)_{lon} = \frac{|lon_i - lon_{i-1}|}{|lon_{i-1}|}, \qquad r(i)_{lat} = \frac{|lat_i - lat_{i-1}|}{|lat_{i-1}|} \tag{1}$$

The specific flow is as follows:

S22: traverse the GPS track points starting from the current point i;

S24: compare the position change rate of the current point with the average position change rate of the current sub-track, as shown in Fig. 3. When point i satisfies $r(i)_{lon} \le 1.5R(n)_{lon}$ and $r(i)_{lat} \le 1.5R(n)_{lat}$, i.e. when point i falls in region 1, its position change rate differs little from the average of the current sub-track, so the point is considered to belong to the current sub-track and the process goes to S25; otherwise, the current sub-track is ended and the process returns to S21;
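The segmentation loop of steps S21 to S25 can be sketched as follows. This is a minimal sketch under stated assumptions rather than the inventors' code: the change rate is taken as the relative change |x_i - x_(i-1)| / |x_(i-1)| described in the text, the average R(n) is assumed to be the arithmetic mean of the change rates already inside the current sub-track (the text does not reproduce formula (2)), and the names `change_rate` and `split_sub_tracks` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude), both assumed non-zero


def change_rate(prev: Point, cur: Point) -> Tuple[float, float]:
    """Relative longitude and latitude change between two consecutive points."""
    return (abs(cur[0] - prev[0]) / abs(prev[0]),
            abs(cur[1] - prev[1]) / abs(prev[1]))


def split_sub_tracks(points: List[Point], factor: float = 1.5) -> List[List[Point]]:
    sub_tracks: List[List[Point]] = []
    i, n = 0, len(points)
    while i < n:
        # S21: start a new sub-track with (up to) two points.
        current = list(points[i:i + 2])
        rates = [change_rate(current[0], current[1])] if len(current) == 2 else []
        i += len(current)
        # S22: walk forward from the current point.
        while i < n and rates:
            # S23: rate of the current point and mean rate of the sub-track
            # (the mean is an assumption standing in for formula (2)).
            r_lon, r_lat = change_rate(points[i - 1], points[i])
            avg_lon = sum(r[0] for r in rates) / len(rates)
            avg_lat = sum(r[1] for r in rates) / len(rates)
            # S24: keep the point only if both rates stay within 1.5x the mean.
            if r_lon <= factor * avg_lon and r_lat <= factor * avg_lat:
                current.append(points[i])      # S25
                rates.append((r_lon, r_lat))
                i += 1
            else:
                break                           # back to S21 with point i
        sub_tracks.append(current)
    return sub_tracks


# Made-up track: five evenly spaced points, a jump, then four more points.
track = [(116.0 + 0.0001 * k, 40.0 + 0.0001 * k) for k in range(5)]
track += [(116.02 + 0.0001 * k, 40.02 + 0.0001 * k) for k in range(4)]
print([len(s) for s in split_sub_tracks(track)])
```

The point that breaks the threshold is not discarded in this sketch; it becomes the first point of the next sub-track, following the "from the current point" wording of S21.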

S3: traverse all sub-tracks; for each sub-track, divide its track points equally into four parts, keep all points in the first and fourth parts (i.e. keep the head and the tail of the sub-track), sample one point out of every two in the middle two parts, and arrange the retained points in order to generate a sample of the whole sub-track;
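The sampling rule of S3 can be illustrated with a short Python sketch. It is only one plausible reading: the text does not say how to handle sub-tracks whose length is not a multiple of four, so here the leftover points are assumed to fall into the middle section, and sub-tracks with fewer than four points are kept whole. The function name `sample_sub_track` is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_sub_track(points: Sequence[T]) -> List[T]:
    """Keep the first and last quarter, and every second point in between."""
    n = len(points)
    quarter = n // 4
    if quarter == 0:
        return list(points)          # too short to split into four parts
    head = points[:quarter]
    middle = points[quarter:n - quarter]
    tail = points[n - quarter:]
    return list(head) + list(middle[::2]) + list(tail)


# Indices stand in for track points.
print(sample_sub_track(list(range(10))))
print(sample_sub_track(list(range(8))))
```

Keeping the head and tail intact preserves the points closest to the sub-track boundaries, where the road section is most likely to change.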

S41: starting from the first sample point, search for candidate road sections using the R-tree index; taking into account both the distance between the track point and each road section and the difference in heading direction, calculate the probability $P(X_{1j})$ of selecting each road section by formula (3):

In formula (4) of S42, $P(X_{ij})$ is the probability of selecting the current road section; $P(X_{i-1})$ is the probability of the previous road section; $P(X_i \mid X_{i-1})$ is the transition probability between two states. It measures the cost of moving from a road section corresponding to the previous point to a road section corresponding to the current point, and can be measured by the difference between the distance between the two track points and the distance between their corresponding points on the two road sections. $P(X_{ij} \mid X_i)$ is the probability of selecting the current road section in the current state: the closer the track point is to the road section, and the smaller the difference between the heading direction and the road section direction, the higher the probability.

S43: find the candidate road section with the maximum observation probability in the last state, and backtrack from it to obtain the candidate road section of each state, back to the first sample point;
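Steps S41 to S43 amount to a standard Viterbi pass over the sample points. The Python sketch below shows the recursion of formula (4) with backtracking. It is illustrative only: the emission and transition functions are toy placeholders (two horizontal roads, a fixed penalty for switching roads), not the patent's formula (3) or its distance-based transition measure, and the names `viterbi`, `toy_emission`, `toy_transition`, and `ROAD_Y` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def viterbi(samples: Sequence[Point],
            candidates: Sequence[str],
            emission: Callable[[Point, str], float],
            transition: Callable[[str, str], float]) -> Tuple[List[str], float]:
    # S41: probability of each candidate for the first sample point.
    prob: Dict[str, float] = {c: emission(samples[0], c) for c in candidates}
    back: List[Dict[str, str]] = []

    # S42: formula (4), keeping the best predecessor for every candidate.
    for point in samples[1:]:
        new_prob: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in candidates:
            best_prev = max(candidates,
                            key=lambda prev: prob[prev] * transition(prev, cur))
            new_prob[cur] = (prob[best_prev] * transition(best_prev, cur)
                             * emission(point, cur))
            pointers[cur] = best_prev
        prob = new_prob
        back.append(pointers)

    # S43: pick the best final candidate and backtrack to the first sample.
    last = max(candidates, key=lambda c: prob[c])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, prob[last]


# Toy setup (assumption): road "A" runs along y = 0, road "B" along y = 5.
ROAD_Y = {"A": 0.0, "B": 5.0}


def toy_emission(point: Point, road: str) -> float:
    return 1.0 / (1.0 + abs(point[1] - ROAD_Y[road]))


def toy_transition(prev: str, cur: str) -> float:
    return 1.0 if prev == cur else 0.5


samples = [(0.0, 0.5), (1.0, 0.4), (2.0, 3.0), (3.0, 0.2)]
path, best = viterbi(samples, ["A", "B"], toy_emission, toy_transition)
print(path)
```

In the toy data the third sample lies nearer to road B, yet the best path stays on road A throughout, which is the smoothing effect that the transition term is meant to provide. For long samples a real implementation would likely sum log-probabilities instead of multiplying, to avoid numerical underflow.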

Fig. 4 is a schematic diagram of the map matching algorithm. a-b-c-d is a continuous track (only a few track points are drawn for ease of illustration). Two road sections lie in R13 and R14 respectively, and both are enclosed by R6. According to the sub-track division rule, four sub-tracks a, b, c, and d are obtained. For a, b, and d, the method above finds the non-leaf node rectangle R6, and the road sections inside it are listed as candidates; c does not lie within any leaf node, so the rectangles in R6 are listed as candidates directly. The Viterbi algorithm is then used to calculate the probability of each candidate, the road section with the maximum probability is found, and backtracking yields the whole path. Finally, each point in the track is matched to the nearest point on the corresponding road section.
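The final step, matching each track point to the nearest point on the matched road section, can be sketched as a point-to-polyline projection. The sketch assumes planar coordinates (for example, projected metres); applying it directly to raw longitude and latitude would only be a rough approximation. The function names `project_to_segment` and `snap_to_path` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def project_to_segment(p: Point, a: Point, b: Point) -> Point:
    """Closest point to p on the line segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return a                                  # degenerate segment
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))                     # clamp to the segment ends
    return (a[0] + t * dx, a[1] + t * dy)


def snap_to_path(p: Point, polyline: Sequence[Point]) -> Optional[Point]:
    """Closest point to p on a road polyline, or None if it has no segment."""
    best: Optional[Point] = None
    best_dist_sq = float("inf")
    for a, b in zip(polyline, polyline[1:]):
        q = project_to_segment(p, a, b)
        dist_sq = (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2
        if dist_sq < best_dist_sq:
            best, best_dist_sq = q, dist_sq
    return best


road = [(0.0, 0.0), (2.0, 0.0), (2.0, 3.0)]   # made-up L-shaped road
print(snap_to_path((2.5, 1.0), road))
```

Clamping the projection parameter keeps the matched point on the road section itself rather than on its infinite extension.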

Simulation experiment and result analysis

Data source: the road network data used in this embodiment is OpenStreetMap data for Beijing; the track data is a Beijing GPS track data set published by Microsoft in 2012.

Software and hardware environment: to verify the applicability of the method, the simulation was run on a computer with an ordinary configuration. For hardware, the computer is a Legion Y7000P 2019 with an Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz 2.59GHz and 16.0 GB of memory. For software, the programming language is Python 3.8, and the electronic map is displayed using ArcGIS.

Simulation experiment

Establishing an R tree: and establishing an R tree index for the OpenStreetMap data. Statistically, there are 1827 segments, and the R-tree thus created has 1827 leaf nodes.

Dividing sub-tracks: taking a track composed of 932 GPS points as an example, the longitude and latitude change rates are computed according to formula (1); partial results are shown in Table 1. The change rates are calculated from the second track point onward, so the change rates of the first point of the track (the first row of the table) are null.

TABLE 1 Latitude and longitude variation ratio

The average longitude and latitude change rates of the track are computed according to formula (2): the average latitude change rate is 1.59561E-06 and the average longitude change rate is 6.78495E-07. The change rate of each point is then compared with the average change rate to judge whether the point is the starting point of a sub-track. The results are shown in Table 2, where "<" indicates that the change rate of the track point is smaller than the average change rate and ">" indicates that it is larger. The first point of the track has no change rate, so its entry in the first row of the table is null, and it is treated as a non-starting point.

TABLE 2 determination of the starting point of a sub-track

After processing, 932 points of the whole track are divided into 26 sub-tracks, the longest sub-track covers 106 track points, and the shortest sub-track only has 3 track points.

Searching candidate road sections: the simulation experiment was carried out with one of the sub-tracks, whose track point coordinates and heading directions are shown in Table 3; the heading angle is measured counterclockwise from due east, so 90° is due north. First, the sub-track is sampled according to the algorithm steps, giving the sample shown in Table 4. The subsequent experiments are demonstrated with this sample.

TABLE 3 certain sub-track

TABLE 4 samples of sub-traces

Fig. 5 shows a section of the track, where the rectangle is the bounding rectangle of the sub-track, the 10 points it encloses form the sub-track, and the points in the rectangle are the sampling results shown in Table 4. The rectangles correspond to leaf nodes of the R-tree and to the parent node one level above them. Searching the R-tree for candidate road sections of the sub-track shows that the bounding rectangle of the sub-track intersects the bounding rectangles of road sections 1, 4 and 5; searching their parent node further shows that the parent node's rectangle also contains the bounding rectangles of road sections 2, 3 and 6. Therefore, road sections 1, 2, 3, 4, 5 and 6 are the candidates for the sub-track. The candidate road sections of the whole sub-track are thus determined at once, without searching point by point, which greatly improves the efficiency of the algorithm.

Calculating the probability of candidate road sections: the direction of each road section can be calculated from its data points, as shown in Table 5. First, the 6 state probabilities of the first sample point are calculated using formula (3); e.g., for candidate road section 1, $\theta_j$ is 2° and $d_j$ is 1.8 m, giving a node probability of 0.8312. The probabilities of the second state are then calculated, giving 36 node probabilities according to formula (4). Subsequent points are handled in the same way up to the last point. The node probabilities of the last state are shown in Table 6; only part is displayed because there are too many nodes. The path column in the table records the nodes passed through to reach each node, and the probability column gives the probability of reaching the last state along that path. Selecting the maximum value and backtracking shows that the candidate road section selected in every state is road section 1. Connecting the candidate paths of all states gives road section 1, so the sub-track is finally matched to road section 1.
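The road section direction and the angle $\theta_j$ used in this step can be derived from coordinates as in the following sketch. It follows the convention stated for Table 3 (angles measured counterclockwise from due east, 90° is due north) and assumes planar coordinates; with raw longitude and latitude the east-west component would need scaling by the cosine of the latitude. The function names `heading_deg` and `direction_difference` are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x east, y north) in a planar coordinate system


def heading_deg(start: Point, end: Point) -> float:
    """Direction from start to end in degrees, counterclockwise from due east."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0


def direction_difference(a_deg: float, b_deg: float) -> float:
    """Positive included angle between two directions, in the range 0..180."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


# Made-up values: a road section heading due north and a vehicle heading of 92 degrees.
road_direction = heading_deg((0.0, 0.0), (0.0, 1.0))
print(road_direction)
print(direction_difference(92.0, road_direction))
```

Taking the smaller of the two arcs keeps the difference positive and handles wrap-around, so headings of 350° and 10° differ by 20° rather than 340°.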

TABLE 5 candidate Link Direction

TABLE 6 Tail State probability computation

Simulation results and analysis

Fig. 6 shows the matching effect of the improved algorithm on two sampled road sections in Beijing, where the small points are the matched track points.

As can be seen from fig. 6, even in a relatively complex road network, such as a campus, the track points are substantially all matched to the roads, which shows that the accuracy of the improved algorithm can meet the requirements of practical applications. The improved method provided by the invention aims to improve the map matching efficiency, and the efficiency analysis is carried out below.

To verify the efficiency of the algorithm, 1000, 5000, 10000, 50000, 100000, 500000 and 1000000 consecutive track points were taken from actual GPS track data. Simulation experiments were run with the traditional HMM map matching algorithm, the HMM map matching algorithm improved with an R-tree, and the HMM map matching algorithm using both an R-tree and track segmentation. The results are shown in Fig. 7, where the bar chart shows the overall matching efficiency for the different data volumes and the line chart shows the single-point matching efficiency.

As can be seen from Fig. 7, the single-point matching time gradually decreases as the data volume increases. The main reason is that the compile time of the program accounts for a large share of the total running time when the data volume is small, whereas with a large data volume this overhead is averaged over all points and becomes almost negligible. When the data volume is small, the three algorithms differ little in efficiency. Once the data volume exceeds 10000, using the R-tree alone improves efficiency to some extent, while the HMM map matching algorithm using both the R-tree and track segmentation shows a very clear advantage: at data volumes of 50000, 100000, 500000 and 1000000, the running time of the algorithm with both improvements is 34%, 46%, 61% and 69% lower, respectively, than that of the traditional HMM map matching algorithm. The two improvements to the HMM therefore significantly improve map matching efficiency, and the advantage grows with the data volume. Data used in the intelligent transportation field is often on the order of millions of points or more, so the improved HMM map matching algorithm provided by the invention is more efficient and practical in application scenarios with large volumes of track data.

Conclusion: GPS track data is an important data source in the intelligent transportation field, and map matching is a key link in applying track data. Improving the efficiency of the map matching algorithm is important in the face of massive track data. On the basis of the traditional HMM map matching algorithm, the invention combines an R-tree spatial index with a track segmentation method to provide a fast map matching method. It optimizes efficiency in both road network data processing and track data processing, which gives the algorithm an advantage in timeliness. Verification shows that the efficiency of the improved algorithm is significantly higher when the GPS track data volume is large, so the method is suitable for application scenarios with large volumes of track data.

It will be appreciated by those of ordinary skill in the art that the examples described herein are intended to assist the reader in understanding the manner in which the invention is practiced, and it is to be understood that the scope of the invention is not limited to such specifically recited statements and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.

Claims

1. An HMM efficient map matching method based on an R tree and track segmentation is characterized by comprising the following steps:

S1: establishing an R tree for the road network data;

S2: dividing the GPS track data into sub-tracks;

S4: calculating the optimal matching path of the track sample by using the Viterbi algorithm:
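As an illustration of the Viterbi step named above, the following is a minimal generic sketch, not the claimed implementation. The function `viterbi`, its callback parameters, the segment labels `"a"` and `"b"`, and all probability values are assumptions introduced here; the emission and transition formulas of the invention are not reproduced in this text.

```python
def viterbi(candidates, emission, transition):
    """Return the most probable sequence of road segments.

    candidates: list over time steps; each entry lists candidate segment ids.
    emission(t, seg): probability of observation t given segment seg.
    transition(prev_seg, seg): probability of moving from prev_seg to seg.
    All three are hypothetical inputs for this sketch.
    """
    # best[seg] = (probability of the best path ending in seg, that path)
    best = {s: (emission(0, s), [s]) for s in candidates[0]}
    for t in range(1, len(candidates)):
        new_best = {}
        for s in candidates[t]:
            # Pick the predecessor that maximizes the path probability.
            prob, path = max(
                ((p * transition(prev, s), pth) for prev, (p, pth) in best.items()),
                key=lambda item: item[0],
            )
            new_best[s] = (prob * emission(t, s), path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


# Toy example with made-up probabilities for two candidate segments.
emis = {0: {"a": 0.9, "b": 0.1}, 1: {"a": 0.6, "b": 0.4}, 2: {"a": 0.2, "b": 0.8}}
best_path = viterbi(
    [["a", "b"]] * 3,
    emission=lambda t, s: emis[t][s],
    transition=lambda prev, s: 0.7 if prev == s else 0.3,
)
```

In a real matcher the candidate lists would come from the R-tree query for each key point, and the products would usually be replaced by sums of log probabilities to avoid underflow on long tracks.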

2. The HMM efficient map matching method based on R-tree and trajectory segmentation as claimed in claim 1, wherein:

in S2, the GPS data are processed point by point according to the real-time position change rate of the track data, and whether a point belongs to the current sub-track is judged from the magnitude relationship between the position change rate between points and the average change rate; the position change rate between two points is divided into a longitude average change rate and a latitude average change rate; each is calculated as the absolute value of the difference between the longitude (or latitude) of the current point and that of the previous point, divided by the longitude (or latitude) of the previous point, expressed as follows:

the specific process is as follows:

S22: traversing the GPS track points starting from the current point i;

S23: calculating the position change rate of the current point and the average position change rate of the current sub-track; the average position change rate is formulated as:
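The per-point rate described verbally in claim 2 (the absolute coordinate difference between consecutive points divided by the previous point's coordinate) can be sketched as follows. This is an illustration under stated assumptions: the function names, the use of a plain arithmetic mean as the sub-track average, and the sample coordinates are all hypothetical, because the formulas of the claim are not reproduced in this text. The decision rule that compares a point's rate with the average is deliberately left out for the same reason.

```python
from statistics import mean


def position_change_rates(points):
    """Return (lon_rate, lat_rate) for each consecutive pair of points.

    points: list of (longitude, latitude) tuples with non-zero coordinates.
    Each rate is |current - previous| / previous, following the verbal
    description in claim 2.
    """
    rates = []
    for (lon0, lat0), (lon1, lat1) in zip(points, points[1:]):
        rates.append((abs(lon1 - lon0) / lon0, abs(lat1 - lat0) / lat0))
    return rates


def average_rates(rates):
    """Arithmetic mean of the longitude and latitude rates (an assumption)."""
    return mean(r[0] for r in rates), mean(r[1] for r in rates)


# Made-up coordinates, chosen only to keep the arithmetic easy to follow.
track = [(100.0, 50.0), (101.0, 50.5), (101.0, 50.5)]
rates = position_change_rates(track)
avg_lon_rate, avg_lat_rate = average_rates(rates)
```

For real GPS data the rates are very small because consecutive coordinates differ only in the trailing decimal places, so any threshold comparison would need to be tuned on actual tracks.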

3. The HMM efficient map matching method based on R-tree and trajectory segmentation as claimed in claim 1, wherein:

the specific steps of S4 are as follows:

wherein $d_j$ is the distance between the track point and the j-th road section, taken as the length of the perpendicular from the track point to the road; $\theta_j$ is the included angle between the forward direction of the track point and the j-th road section, taken as a positive value;

$P(X_{ij})$ is the probability of selecting the current road section; $P(X_{i-1})$ is the probability of the previous road section; $P(X_i \mid X_{i-1})$ is the transition probability between the two states; $P(X_{ij} \mid X_i)$ is the probability of selecting the current road section given the current state;