CN114036345A - Method and device for processing track data and storage medium

Info

Publication number: CN114036345A
Application number: CN202111361080.XA
Authority: CN
Inventors: 殷冰涛; 韩淳; 刘熙
Original assignee: Xinghuan Zhongzhi Technology Beijing Co ltd
Current assignee: Xinghuan Zhongzhi Technology Beijing Co ltd
Priority date: 2021-11-17
Filing date: 2021-11-17
Publication date: 2022-02-11

Abstract

The embodiment of the invention discloses a method, equipment and a storage medium for processing track data, wherein the method comprises the following steps: determining a first key point in the track data set, and calculating the distance between the first key point and each other track data in the track data set; dividing the track data set into at least one track data subset according to the distance between the first key point and each other track data in the track data set; and recursively executing and determining a second key point in each track data subset, and dividing the corresponding track data subset according to the second key point until obtaining the track data subset which can not be segmented any more, and obtaining a target key point tree corresponding to the track data set. According to the technical scheme of the embodiment of the invention, the first key point is determined in the track data set, and the track data set is divided into a plurality of track data subsets, so that the structure of the constructed key point tree can be optimized, and the retrieval performance of the key point tree can be improved.

Description

Method and device for processing track data and storage medium

Technical Field

The embodiment of the invention relates to the technical field of data processing, in particular to a method and equipment for processing track data and a storage medium.

Background

Trajectory data refers to a set of points through which an entity passes in space. By performing similarity analysis on massive trajectory data, the future trajectory of an entity can be predicted, which has broad application prospects in fields such as route recommendation and driving assistance.

At present, in existing trajectory data similarity analysis methods, a large amount of trajectory data is usually stored in the form of a binary tree, and each piece of trajectory data corresponds to one tree node in the binary tree; when trajectory data is searched, the binary tree is traversed using a nearest neighbor algorithm to acquire a plurality of stored trajectory data matching the retrieval trajectory. However, in the prior art, when the volume of trajectory data is large, the generated binary tree is very deep, which reduces the retrieval efficiency of the trajectory data; meanwhile, the tolerance distance of the conventional nearest neighbor algorithm converges slowly, which further reduces the retrieval efficiency of the trajectory data.

Disclosure of Invention

Embodiments of the present invention provide a method, an apparatus, and a storage medium for processing trajectory data, which can optimize a structure of a generated key point tree and improve retrieval performance of the key point tree.

In a first aspect, an embodiment of the present invention provides a method for processing trajectory data, including:

determining a first key point in the track data set, and calculating the distance between the first key point and each other track data in the track data set;

dividing the track data set into at least one track data subset according to the distance between the first key point and each other track data in the track data set;

recursively executing to determine a second key point in each track data subset, and dividing the corresponding track data subset according to the second key point until obtaining a track data subset which can not be segmented any more, and obtaining a target key point tree corresponding to the track data set;

the target key point tree comprises a root node and at least one sub node; the root node corresponds to the first key point, and each sub-node corresponds to each second key point.

In a second aspect, an embodiment of the present invention further provides a computer device, including a processor and a memory, where the memory is used to store instructions that, when executed, cause the processor to:

In a third aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the storage medium, and when the computer program is executed by a processor, the computer program implements the method for processing trajectory data according to any embodiment of the present invention.

According to the technical scheme provided by the embodiment of the invention, a first key point is determined in a track data set, and the distance between the first key point and each other track data in the track data set is calculated; dividing the track data set into a plurality of track data subsets according to the distance between the first key point and each other track data in the track data set; recursively executing to determine a second key point in each track data subset, and dividing the corresponding track data subset according to the second key point until obtaining the track data subset which can not be segmented any more, and obtaining a target key point tree corresponding to the track data set; by determining the first key point in the track data set and dividing the track data set into a plurality of track data subsets, the structure of the constructed key point tree can be optimized, the depth of the key point tree is reduced, and the retrieval performance of the key point tree can be improved.

Drawings

FIG. 1A is a flow chart of a method for processing trajectory data according to an embodiment of the invention;

FIG. 1B is a diagram illustrating a prior art structure of a priority tree;

FIG. 1C is a schematic diagram of prior art partitioning of a trace data set in a data space;

FIG. 2A is a flow chart of a method for processing trajectory data according to another embodiment of the present invention;

FIG. 2B is a diagram illustrating a structure of a target keypoint tree according to another embodiment of the present invention;

FIG. 3A is a flow chart of a method for processing trajectory data according to another embodiment of the present invention;

FIG. 3B is a diagram of a target keypoint tree search algorithm in another embodiment of the invention;

FIG. 3C is a schematic diagram of a distance tolerance according to another embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a track data processing device according to another embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a computer device in another embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. A process may be terminated when its operations are completed, but may have additional steps not included in the figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc.

The term "trajectory data set" as used herein may be a data set composed of a plurality of trajectory data.

The term "first keypoint" as used herein may be one of the trajectory data determined in the trajectory data set, and may be a first priority point in the priority point tree.

The term "distance" as used herein may be the distance between two trajectory data in a metric space, and may be, for example, a Hausdorff distance; wherein, the smaller the distance is, the higher the similarity of the two track data is; the larger the distance, the lower the similarity between the two pieces of trajectory data.

The term "subset of trajectory data" as used herein may be a plurality of sets of trajectory data divided by the set of trajectory data after the first keypoint is removed.

The term "second key point" used herein may be one trajectory data respectively determined in each trajectory data subset, and may be a priority point corresponding to each child node in the priority point tree.

The term "target keypoint tree" as used herein may be a tree-like data structure generated from a trajectory data set, and may be a priority tree comprising multiple branches; the target key point tree comprises a root node and a plurality of sub nodes, wherein the root node corresponds to the first key point, and each sub node corresponds to each second key point.

Fig. 1A is a flowchart of a method for processing trajectory data according to an embodiment of the present invention, where the embodiment of the present invention is applicable to a case where a trajectory data set is stored in a key point tree form; the method may be performed by a processor in a computer device and may generally be integrated in a computer device. As shown in fig. 1A, the method specifically includes the following steps:

S110, determining a first key point in the track data set, and calculating the distance between the first key point and each other track data in the track data set.

The trajectory data set includes at least one piece of trajectory data, and the trajectory data may be the movement trajectory of an entity, for example, the driving trajectory of a vehicle or the activity trajectory of a person.

It will be appreciated that the trajectory data may be abstracted as one point in the metric space, in which case the similarity between different trajectories may be expressed as the distance between two points in the metric space. In this embodiment, one piece of trajectory data may be randomly selected as the first keypoint in the trajectory data set, and distances between the current first keypoint and other pieces of trajectory data in the trajectory data set may be calculated, respectively.

It is worth noting that, when the key point tree corresponding to the track data set is constructed, only one similarity measurement function is needed to establish the key point tree; the key point tree is a general-purpose index, and efficient indexing of data of any type can be realized without regard to the data type of the indexed data. It should be noted that the key point tree cannot be constructed with an arbitrary similarity metric function; the similarity metric function used in the key point tree needs to satisfy the following three criteria at the same time: 1. the similarity distance is non-negative; 2. the commutative law (symmetry) is satisfied; 3. the triangle inequality is satisfied.

In the present embodiment, the similarity metric function may be a Hausdorff distance (Hausdorff distance); the Hausdorff distance is used for representing the distance between two subsets in a measurement space and is commonly used for calculating the similarity of the tracks; it should be noted that the smaller the hausdorff distance between two pieces of trajectory data is, the higher the similarity between the two pieces of trajectory data is; the larger the hausdorff distance between two pieces of trajectory data is, the lower the similarity between the two pieces of trajectory data is.
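As an illustration of the similarity metric described above, the following is a minimal sketch of a discrete Hausdorff distance between two trajectories. It assumes that each trajectory is a list of 2D coordinate tuples and that the point-to-point distance is Euclidean; the function name `hausdorff` and the sample trajectories are hypothetical and are not taken from the embodiment.

```python
import math

def hausdorff(a, b):
    """Symmetric discrete Hausdorff distance between two point sequences."""
    def directed(p, q):
        # For each point in p, find the distance to its nearest point in q,
        # then keep the worst case over all points in p.
        return max(min(math.dist(x, y) for y in q) for x in p)
    return max(directed(a, b), directed(b, a))

# Hypothetical trajectories (assumed 2D coordinates).
t1 = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
t2 = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
t3 = [(0.0, 0.0), (1.0, 0.0), (2.0, 3.0)]

print(hausdorff(t1, t2))  # 1.0: t2 is t1 shifted by one unit
print(hausdorff(t1, t3))  # 3.0: the last point of t3 is far from t1
```

In this sketch the smaller value for `t1` and `t2` corresponds to the higher similarity, and the three sample values are consistent with the non-negativity, symmetry and triangle inequality criteria listed earlier.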

It is worth noting that the selection of the first key point strongly affects the performance of the generated key point tree; the boundary value differences of the sub-trees divided according to the first key point should be as uniform as possible, so as to increase the probability of successful pruning during data retrieval and improve the retrieval performance of the key point tree. In this embodiment, a plurality of trajectory data may be randomly selected in advance from the trajectory data set as candidate key points, scores corresponding to the candidate key points are calculated according to a preset evaluation rule, and the candidate key point with the highest score is finally selected as the first key point.

S120, dividing the track data set into at least one track data subset according to the distance between the first key point and each other track data in the track data set.

It is noted that, in the prior art, as shown in fig. 1B, a track data set is usually split two ways repeatedly to finally obtain a corresponding key point tree (i.e., a priority point tree), which is a balanced binary tree structure. In the priority point tree, each non-leaf node includes a VP identification (ID) for identifying a priority point (vantage point, VP), a splitting median mu, and two pointers pointing to the left and right subtrees, respectively. Through the balanced binary tree structure, the priority point tree in effect performs repeated spherical bisection on the track data set; in the entire data space, the trajectory data set is divided into a number of spherical subspaces stacked on top of each other, centered on different priority points, as shown in fig. 1C. However, when there is a large amount of track data, the finally generated priority point tree will be very deep, which seriously affects the retrieval performance of the track data.

In order to solve the above problem, in this embodiment, a multi-way structure is adopted to replace the two-way structure of the existing priority point tree, and a priority point tree with a multi-way structure is finally obtained by repeatedly dividing the track data set, so that the depth of the priority point tree corresponding to a large amount of track data can be greatly reduced, and the retrieval efficiency of the priority point tree is improved.

In this embodiment, after the distances between the first key point and each of the other trajectory data are obtained, the maximum distance value may be determined in each distance; and equally dividing the maximum distance value according to the number of the track data subsets needing to be divided so as to determine a plurality of cutting values, and dividing the track data set into a plurality of track data subsets according to each cutting value. For example, if the maximum distance is 100 and the number of the trajectory data subsets to be divided is 4, the corresponding cut values may be 25, 50, and 75; specifically, track data with a distance of 25 or less is added to one track data subset, track data with a distance of more than 25 and 50 or less is added to one track data subset, track data with a distance of more than 50 and 75 or less is added to one track data subset, and track data with a distance of more than 75 and 100 or less is added to one track data subset, so as to finally obtain four track data subsets corresponding to the track data set.
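The equal-width division in the example above can be sketched as follows. This is a minimal illustration, assuming the distances to the first key point have already been computed; the function names `cut_values` and `split_by_cuts` are hypothetical.

```python
def cut_values(max_distance, k):
    """Return the k-1 cut values that split [0, max_distance] into k equal-width bands."""
    return [max_distance * i / k for i in range(1, k)]

def split_by_cuts(distances, k):
    """Assign each item index to one of k subsets by its distance to the key point."""
    cuts = cut_values(max(distances), k)
    subsets = [[] for _ in range(k)]
    for idx, d in enumerate(distances):
        # The band index is the number of cut values strictly below d, so a
        # distance equal to a cut value stays in the lower band ("25 or less").
        band = sum(d > c for c in cuts)
        subsets[band].append(idx)
    return subsets

print(cut_values(100, 4))  # [25.0, 50.0, 75.0]

# Hypothetical distances between the first key point and seven other trajectories.
distances = [10, 25, 26, 50, 75, 76, 100]
print(split_by_cuts(distances, 4))  # [[0, 1], [2, 3], [4], [5, 6]]
```

With this scheme the subsets can differ in size, or even be empty, when the distances are unevenly distributed.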

Optionally, the other trajectory data in the trajectory data set may be sorted in ascending order according to the distance from the first key point, and the sorted trajectory data may be divided equally according to the total number of the other trajectory data, so as to obtain a plurality of trajectory data subsets corresponding to the trajectory data set; for example, if the number of the other trajectory data is 100, and the number of the trajectory data subsets to be divided is 4, the first 25 of the other trajectory data sorted in ascending order are taken as one trajectory data subset, the 26th to 50th as one trajectory data subset, the 51st to 75th as one trajectory data subset, and the 76th to 100th as one trajectory data subset. It can be understood that, when the total number of other trajectory data cannot be divided evenly, it is only necessary to ensure that the numbers of trajectory data in the trajectory data subsets differ by no more than one.
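The equal-count alternative can be sketched in the same way. The sketch below assumes precomputed distances and uses the hypothetical function name `split_equal_count`; when the total cannot be divided evenly, the first subsets receive one extra item so that subset sizes differ by at most one.

```python
def split_equal_count(distances, k):
    """Split item indices, sorted by ascending distance, into k near-equal subsets."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    base, extra = divmod(len(order), k)
    subsets, start = [], 0
    for j in range(k):
        # The first `extra` subsets take one additional item.
        size = base + (1 if j < extra else 0)
        subsets.append(order[start:start + size])
        start += size
    return subsets

# Hypothetical distances for five trajectories, split into two subsets.
print(split_equal_count([5.0, 1.0, 4.0, 2.0, 3.0], 2))  # [[1, 3, 4], [2, 0]]
```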

In an optional implementation manner of this embodiment, dividing the trajectory data set into at least one trajectory data subset according to a distance between the first key point and each other trajectory data in the trajectory data set may include:

according to the distance between the first key point and each other track data in the track data set, sequencing each other track data in an ascending order; determining at least one track data set cutting value according to the maximum distance value between the first key point and each other track data in the track data set; and dividing other track data sorted in an ascending order into at least one track data subset according to the cutting value of each track data set.

In this embodiment, the other trajectory data may be sorted in an ascending order according to the distance between the other trajectory data and the first key point, and the split values of the multiple trajectory data sets may be determined according to the maximum distance value and the number of trajectory data subsets to be split; and finally, according to the obtained segmentation values of the plurality of track data sets, directly segmenting each other track data after ascending sequencing to obtain a plurality of track data subsets.

In this embodiment, each piece of track data is sorted in advance according to the distance between each piece of other track data and the first key point, so that after the track data set segmentation value is determined, the sorted track data can be directly segmented to obtain a plurality of corresponding track data subsets, the segmentation efficiency of the track data set can be improved, and the construction efficiency of the target key point tree can be improved.

S130, recursively executing to determine second key points in each track data subset, and dividing the corresponding track data subset according to the second key points until obtaining the track data subset which can not be divided any more, and obtaining a target key point tree corresponding to the track data set.

The target key point tree comprises a root node and at least one sub node; the root node corresponds to the first key point, and each child node corresponds to each second key point.

In this embodiment, after the initial partitioning of the track data set is completed, recursive partitioning may be performed on each track data subset; specifically, one piece of track data may be determined in each track data subset as a corresponding second key point, the distance between the other track data in each track data subset and the corresponding second key point is calculated, and each track data subset is divided into a plurality of track data subsets again according to the distance; further, the operations are re-executed in each of the re-divided trajectory data subsets until the obtained trajectory data subset cannot be divided (only one trajectory data is included), and the complete segmentation of the trajectory data set is completed.

It should be noted that, the construction of the target key point tree is performed while the track data set is divided; specifically, the determined first key point is used as a root node of the target key point tree, and non-leaf nodes of the target key point tree at all depths are determined according to the confirmation sequence of the second key point; and finally, determining the acquired track data subset which can not be segmented any more as a leaf node, and completing the construction of the target key point tree.

In this embodiment, after a trajectory data set including a plurality of trajectory data is acquired, the trajectory data set is mapped into a tree-shaped data structure having multiple branches, so that a structure for generating a key point tree can be optimized, a depth for generating the key point tree is reduced, and retrieval performance of the key point tree is improved.
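The recursive construction described for S130 can be sketched as follows. This is an illustrative sketch under simplifying assumptions, not the embodiment's implementation: the key point of each subset is simply its first item (rather than a randomly chosen or scored one), the equal-count division is used, and trajectories are replaced by plain numbers with an absolute-difference distance. The names `Node`, `build` and `depth` are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    key_point: object                              # trajectory stored at this node
    branches: list = field(default_factory=list)   # (lower, upper, child Node) tuples

def build(items, dist, k=4):
    """Recursively build a multi-way key point tree (illustrative sketch)."""
    if not items:
        return None
    key, rest = items[0], items[1:]    # assumption: the first item is the key point
    node = Node(key)
    if not rest:
        return node                    # a single trajectory cannot be split further
    ranked = sorted(((dist(key, x), x) for x in rest), key=lambda p: p[0])
    base, extra = divmod(len(ranked), k)
    start = 0
    for j in range(k):
        size = base + (1 if j < extra else 0)
        chunk = ranked[start:start + size]
        start += size
        if chunk:
            # Record the boundary values (min and max distance) of this branch.
            lower, upper = chunk[0][0], chunk[-1][0]
            node.branches.append((lower, upper, build([x for _, x in chunk], dist, k)))
    return node

def depth(node):
    """Number of levels in the tree rooted at node."""
    return 1 + max((depth(child) for _, _, child in node.branches), default=0)

points = list(range(21))               # stand-in for 21 trajectories
tree = build(points, lambda a, b: abs(a - b), k=4)
print(len(tree.branches), depth(tree))  # 4 3
```

For the same 21 items, building with `k=2` in this sketch yields a deeper tree, which is the effect the multi-way structure is intended to avoid.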

Fig. 2A is a flowchart of a method for processing trajectory data according to another embodiment of the present invention, which specifically introduces determining a first keypoint in a trajectory data set based on the above technical solution, as shown in fig. 2A, the method includes:

S210, performing random sampling in the track data set to obtain a first preset number of candidate key points and a second preset number of reference points corresponding to each candidate key point, and calculating the distance between each candidate key point and each corresponding reference point.

The first preset number is the number of preset collection candidate key points; and correspondingly, the second preset number is the number of the corresponding reference points of the collected candidate key points which are preset.

In this embodiment, after the trajectory data set is obtained, random sampling is performed in the trajectory data set to obtain a first preset number of trajectory data as candidate key points; then, for each candidate key point, random sampling is performed in the trajectory data set again to obtain a second preset number of reference points corresponding to that candidate key point. It will be appreciated that all random sampling of the trajectory data is sampling without replacement, i.e., once a certain piece of trajectory data has been sampled, it is no longer available in the trajectory data set for that sampling. After a plurality of candidate key points and a plurality of reference points corresponding to each candidate key point are obtained through sampling, the distance between each candidate key point and each corresponding reference point is calculated.

S220, dividing the reference points corresponding to each candidate key point into at least one reference point set according to the distance between each candidate key point and each corresponding reference point, and respectively calculating the boundary value difference of each reference point set corresponding to each candidate key point.

The boundary values comprise the maximum distance value and the minimum distance value between the candidate key point and the reference points in the corresponding reference point set.

In this embodiment, after the distances between each candidate keypoint and each corresponding reference point are obtained through calculation, for each candidate keypoint, the maximum distance value between each candidate keypoint and each corresponding reference point may be determined, and a plurality of division values may be determined according to the maximum distance value and the number of reference point sets to be divided; and then dividing each reference point according to the division value and the distance between each reference point and the corresponding candidate key point to acquire a plurality of reference point sets respectively corresponding to each candidate key point.

Further, in each reference point set, the maximum distance value and the minimum distance value between each reference point and the corresponding candidate key point are counted as the boundary value corresponding to the reference point set, and the difference value between the boundary values corresponding to each reference point set, that is, the difference value between the maximum distance value and the minimum distance value, is calculated.

S230, determining the score of each candidate key point according to the boundary value differences of the reference point sets corresponding to each candidate key point; and determining the candidate key point with the highest score as the first key point.

In this embodiment, for each candidate keypoint, a sum of boundary value differences corresponding to each reference point set may be calculated, and the sum is used as a score of the corresponding candidate keypoint; or, the variance of the boundary value differences of the reference point sets corresponding to each candidate keypoint may be calculated, and the variance is used as the score of the corresponding candidate keypoint. Further, after the score of each candidate keypoint is determined, the highest score may be identified among the scores, and the candidate keypoint corresponding to the highest score may be determined as the first keypoint.

It should be noted that, in the construction process of the key point tree, the selection of the first key point is crucial; according to the first key point, the difference value between the boundary values of each segmented track data subset should be as large as possible so as to improve the pruning success probability when the key point tree is retrieved; secondly, the boundary value difference values of different track data subsets should be as uniform as possible, so as to ensure that the pruning success probabilities aiming at different search target points are the same, and avoid performance jump.

In this embodiment, a plurality of candidate keypoints are pre-selected by a random sampling method, and a candidate keypoint with the highest score is selected from the candidate keypoints as a final first keypoint by a preset evaluation rule, so that the rationality of selecting the first keypoint can be ensured; the condition that effective pruning cannot be carried out due to the fact that the retrieval interval corresponding to the search target point simultaneously comprises a plurality of subtrees caused by random selection of the first key point can be avoided, the problem that retrieval performance is reduced due to the fact that the plurality of subtrees need to be retrieved can be avoided, and retrieval performance of generating the key point tree can be improved.

In an optional implementation manner of this embodiment, determining the score of each candidate keypoint according to a boundary value difference of each candidate keypoint with respect to each reference point set may include: respectively calculating the sum of the boundary value difference values of the candidate key points corresponding to the reference point sets and the variance of the boundary value difference values of the candidate key points corresponding to the reference point sets; and determining the scores of the candidate key points according to the sum and the variance.

For each candidate key point, the ratio of the sum and the variance of the boundary value difference corresponding to each reference point set may be used as a score corresponding to each candidate key point; alternatively, the product of the sum of the boundary value differences and the variance of each set of reference points may be used as the score corresponding to each candidate keypoint.

In another optional implementation manner of this embodiment, determining the score of each candidate keypoint according to the sum and the variance may include:

calculating the score p of each candidate key point according to the formula p = SUM / ln(e + VAR); where SUM represents the sum of the boundary value differences of the reference point sets corresponding to the candidate key point, VAR represents the variance of those boundary value differences, ln represents the natural logarithm function, and e represents the base of the natural logarithm.
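A minimal sketch of the candidate scoring of S210 to S230 with the score p = SUM / ln(e + VAR) is given below. Several details are assumptions for illustration only: the reference points are divided with equal-width cut values, the variance is the population variance, empty reference point sets are ignored, and numbers with an absolute-difference distance stand in for trajectories. The names `score` and `pick_first_key_point` and their parameters are hypothetical.

```python
import math
import random
from statistics import pvariance

def score(candidate, references, dist, k=4):
    """Score a candidate key point as SUM / ln(e + VAR) of boundary value differences."""
    distances = sorted(dist(candidate, r) for r in references)
    cuts = [distances[-1] * i / k for i in range(1, k)]
    groups = [[] for _ in range(k)]
    for d in distances:
        groups[sum(d > c for c in cuts)].append(d)
    # Boundary value difference of a reference point set: max distance minus min distance.
    spreads = [max(g) - min(g) for g in groups if g]
    total = sum(spreads)
    var = pvariance(spreads)           # assumption: population variance
    return total / math.log(math.e + var)

def pick_first_key_point(data, dist, n_candidates=3, n_refs=8, k=4, seed=0):
    """Sample candidates and references without replacement; return the best candidate."""
    rng = random.Random(seed)
    best, best_score = None, -math.inf
    for i in rng.sample(range(len(data)), n_candidates):
        pool = [j for j in range(len(data)) if j != i]
        refs = [data[j] for j in rng.sample(pool, min(n_refs, len(pool)))]
        s = score(data[i], refs, dist, k)
        if s > best_score:
            best, best_score = data[i], s
    return best

abs_dist = lambda a, b: abs(a - b)
print(score(0, [1, 2, 3, 4, 5, 6, 7, 8], abs_dist))  # approximately 4.0
print(pick_first_key_point(list(range(50)), abs_dist) in range(50))  # True
```

In the first call every reference point set has a boundary value difference of 1, so SUM is 4 and VAR is 0, and the score reduces to 4 / ln(e). A larger variance increases the denominator and therefore lowers the score, which favors candidates whose subsets have uniform boundary value differences.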

S240, calculating the distance between the first key point and each other track data in the track data set.

S250, dividing the track data set into at least one track data subset according to the distance between the first key point and each other track data in the track data set.

The number of track data subsets to be divided can be adjusted adaptively according to the total number of track data in the track data set; for example, when the total number of track data in the track data set is large, the number of divided track data subsets may be increased appropriately; when the total number of track data in the track data set is small, the number of divided track data subsets may be reduced appropriately.

In a specific example, when the number of the subsets of the divided trajectory data is 4, as shown in fig. 2B, the correspondingly generated target keypoint tree includes four branches; where R represents the first keypoint and C1, C2, C3, and C4 represent each non-leaf node of depth 2, respectively. Optionally, each non-leaf node may be further divided into four branches; in each non-leaf node, the VP ID of the current non-leaf node may be recorded, and the four branches of the partition are d1, d2, d3 and d4, respectively; meanwhile, the boundary values (the maximum value upper and the minimum value lower of the distance from the current non-leaf node) corresponding to each branch and the child node child having an inheritance relationship with the current non-leaf node can be recorded.

S260, recursively executing to determine second key points in each track data subset, and dividing the corresponding track data subset according to the second key points until track data subsets that cannot be further divided are obtained, so as to obtain a target key point tree corresponding to the track data set.

According to the technical scheme provided by the embodiment of the invention, the random sampling is carried out in the track data set to obtain a first preset number of candidate key points and a second preset number of reference points corresponding to the candidate key points respectively, and the distances between the candidate key points and the corresponding reference points are calculated respectively; dividing each reference point corresponding to each candidate key point into a plurality of reference point sets according to the distance between each candidate key point and each corresponding reference point, and respectively calculating the boundary value difference value of each reference point set corresponding to each candidate key point; finally, determining the grade of each candidate key point according to the boundary value difference value of each candidate key point corresponding to each reference point set; determining a first key point with the highest score in each candidate key point according to the score of each candidate key point; by acquiring a certain number of candidate key points in advance and acquiring the candidate key point with the highest score as the first key point according to the preset evaluation rule, the retrieval performance of generating the target key point tree can be improved, and the retrieval efficiency of the target key point tree is improved.

Fig. 3A is a flowchart of a processing method of trajectory data according to another embodiment of the present invention, which specifically introduces searching trajectory data matching a search trajectory in a key point tree according to a search request of the search trajectory based on the above technical solution, and as shown in fig. 3A, the method includes:

S310, determining a first key point in the track data set, and calculating the distance between the first key point and each other track data in the track data set.

S320, dividing the track data set into at least one track data subset according to the distance between the first key point and each other track data in the track data set.

S330, recursively executing and determining second key points in each track data subset, dividing the corresponding track data subset according to the second key points until the track data subset which can not be divided is obtained, and obtaining a target key point tree corresponding to the track data set.

And S340, when a retrieval request of the retrieval track is received, determining a third preset number of sub-nodes matched with the retrieval track in the target key point tree according to the preset tolerance distance.

The retrieval track may be input by a user, and is the track for which similar track data needs to be found among the stored track data. The tolerance distance refers to the allowable error distance between the retrieval track and a matched sub-node; specifically, if the distance between a sub-node in the target key point tree and the retrieval track is less than or equal to the tolerance distance, the sub-node is considered to match the retrieval track. Correspondingly, the preset tolerance distance is a tolerance distance set in advance; for example, the preset tolerance distance may be 0, in which case only a sub-node whose distance from the retrieval track is 0 is determined to match the retrieval track.

It should be noted that, when searching for a retrieval track, the retrieval track may be abstracted into a target point. First, the distance between the target point and the first key point of the current target key point tree is calculated, and it is determined which subtree's boundary value range this distance falls within, so as to determine the subtree corresponding to the target point. In the subsequent search of the target key point tree, only this subtree needs to be searched and the other subtrees can be skipped, thereby pruning the target key point tree.

In this embodiment, a pre-search may be performed once to find a third preset number of sub-nodes matching the retrieval trajectory in the target keypoint tree according to the preset tolerance distance set to 0. Since the preset tolerance distance is 0, only one sub-tree needs to be searched, the path length of the search is equal to the depth of the target key point tree, and since the depth of the target key point tree generated in the embodiment is smaller, the time consumption of the current search is smaller. And the third preset number is the preset number of the sub-nodes which need to be searched and are matched with the retrieval track.

In this embodiment, when searching for a sub-node matching the retrieval track, if the distance between the sub-node and the retrieval track is less than or equal to the preset tolerance distance, the sub-node may be considered to match the retrieval track.

It can be understood that, when the preset tolerance distance is not 0, the search distance corresponding to the target point is a range of values; taking a tolerance distance u and a distance d between the target point and the first key point as an example, the search distance range corresponding to the target point is [d - u, d + u]. In this case, the search distance range of the target point may overlap the boundary ranges of several subtrees, and all of those subtrees need to be searched.

In a specific example, the target keypoint tree is a balanced binary tree structure, and a corresponding keypoint tree search algorithm is shown in fig. 3B. The preset tolerance distance is u, and the segmentation median corresponding to the first key point is mu; firstly, calculating the distance between a target point and a first key point; based on the triangle inequality, if the distance is not less than mu + u, it can be determined that a point with a distance less than u from the target point cannot exist in the left sub-tree, and then only the right sub-tree needs to be searched; if the distance is less than or equal to mu-u, only the left sub-tree needs to be searched; if mu-u < distance < mu + u, pruning cannot be performed, and the left and right subtrees need to be searched simultaneously.
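As a minimal illustrative sketch of the pruning rule above (not the implementation of the embodiment), the following Python code builds a small binary key point tree and performs a range search with tolerance u. The names Node, build and range_search, the choice of the first element as key point, and the use of plain numbers with an absolute-difference distance as stand-ins for tracks are assumptions made for illustration only. The boundary case distance = mu - u is searched on both sides, so that matching with "less than or equal to u" remains safe.

```python
from dataclasses import dataclass
from statistics import median
from typing import Callable, List, Optional

Dist = Callable[[float, float], float]


@dataclass
class Node:
    point: float                    # key point stored at this node
    mu: float = 0.0                 # segmentation median for this key point
    left: Optional["Node"] = None   # items with distance to key point < mu
    right: Optional["Node"] = None  # items with distance to key point >= mu


def build(points: List[float], dist: Dist) -> Optional[Node]:
    """Recursively split on the median distance to the first item (assumed key point)."""
    if not points:
        return None
    vp, rest = points[0], points[1:]
    if not rest:
        return Node(vp)
    mu = median(dist(vp, p) for p in rest)
    left = [p for p in rest if dist(vp, p) < mu]
    right = [p for p in rest if dist(vp, p) >= mu]
    return Node(vp, mu, build(left, dist), build(right, dist))


def range_search(node: Optional[Node], target: float, u: float,
                 dist: Dist, out: List[float]) -> None:
    """Collect every stored item whose distance to target is <= u."""
    if node is None:
        return
    d = dist(target, node.point)
    if d <= u:
        out.append(node.point)
    # Triangle inequality: if d >= mu + u, nothing in the left subtree can match.
    if d < node.mu + u:
        range_search(node.left, target, u, dist, out)
    # If d < mu - u, nothing in the right subtree can match.
    if d >= node.mu - u:
        range_search(node.right, target, u, dist, out)


if __name__ == "__main__":
    data = [5.0, 1.0, 9.0, 3.0, 7.0, 2.0, 8.0]
    metric = lambda a, b: abs(a - b)
    found: List[float] = []
    range_search(build(data, metric), 4.0, 1.5, metric, found)
    print(sorted(found))
```

With a real track distance, only the metric function would change, provided that it satisfies the triangle inequality on which the pruning relies.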

It should be noted that the smaller the tolerance distance, the higher the probability of successful pruning, and the better the retrieval performance. For example, as shown in fig. 3C, a certain data space is divided into three subspaces S1, S2 and S3 based on the first key point VP (that is, the track data set is divided into three track data subsets), the target point falls in the subspace S2, and T is the tolerance distance. In scenario (1), the circular range centered on the target point with radius T covers S1, S2 and S3, so pruning cannot be performed and all three subspaces must be traversed. In scenario (2), the circular range centered on the target point with radius T covers only S2, so S1 and S3 can be excluded and pruning succeeds.
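The two scenarios can be sketched in Python as an interval-overlap test between the search distance range [d - T, d + T] and the boundary value range of each subspace. The function name subspaces_to_search and the numeric boundary values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def subspaces_to_search(bounds: List[Tuple[float, float]],
                        d: float, t: float) -> List[int]:
    """Return indices of subspaces whose (min, max) distance range
    overlaps the search range [d - t, d + t]."""
    lo, hi = d - t, d + t
    return [i for i, (b_min, b_max) in enumerate(bounds)
            if b_min <= hi and lo <= b_max]


if __name__ == "__main__":
    # Assumed boundary values (min, max distance to VP) for S1, S2, S3.
    bounds = [(0.0, 2.0), (2.5, 5.0), (5.5, 9.0)]
    print(subspaces_to_search(bounds, d=3.5, t=4.0))  # large tolerance
    print(subspaces_to_search(bounds, d=3.5, t=0.5))  # small tolerance
```

With these assumed values, the large tolerance selects all three subspaces (no pruning), while the small tolerance selects only the second one.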

However, when the tolerance distance is too small, excessive pruning is likely to occur, so that the retrieval result is inaccurate or the matching track data cannot be retrieved at all; therefore, a suitable tolerance distance needs to be selected according to actual task requirements.

S350, updating the preset tolerance distance using the maximum distance between the retrieval track and the matched sub-nodes; and returning to execute the operation of determining a third preset number of sub-nodes matching the retrieval track in the target key point tree according to the preset tolerance distance until the search of the target key point tree is completed, so as to acquire a third preset number of target sub-nodes matching the retrieval track.

In this embodiment, after a plurality of sub-nodes matched with the retrieval track are obtained, the maximum distance between the retrieval track and each matched sub-node is determined, and the preset tolerance distance is replaced by the maximum distance, so as to update the preset tolerance distance. Further, after the updating of the preset tolerance distance is completed, the target key point tree may be searched again according to the current preset tolerance distance, so as to obtain a third preset number of sub-nodes matched with the retrieval track again, and the obtained sub-nodes are used as target sub-nodes.

It should be noted that, when the target key point tree is searched according to the updated preset tolerance distance, if a third preset number of matching sub-nodes has already been obtained before the search of the whole target key point tree is completed, the preset tolerance distance is updated again using the maximum distance between these matching sub-nodes and the retrieval track, and the target key point tree is searched again according to the updated preset tolerance distance; this process is repeated until the search of the target key point tree is completed and a third preset number of target sub-nodes matching the retrieval track are acquired.

In this embodiment, by pre-searching according to the initial preset tolerance distance and updating the initial preset tolerance distance according to the search result, the convergence speed of the tolerance distance can be increased, and the retrieval efficiency of matching the retrieved track with the stored track data can be increased.
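The effect of repeatedly replacing the tolerance distance with the maximum matched distance can be illustrated with the common shrinking-radius k-nearest-neighbour search on a binary key point tree. This is a swapped-in, simplified technique and not the exact pre-search procedure of the embodiment: the tolerance starts as unbounded rather than 0, and shrinks to the largest distance among the k best matches found so far. The names Node, build and knn, the first-element key point choice, and the numeric stand-ins for tracks are assumptions made for illustration.

```python
import heapq
from dataclasses import dataclass
from statistics import median
from typing import Callable, List, Optional, Tuple

Dist = Callable[[float, float], float]


@dataclass
class Node:
    point: float
    mu: float = 0.0
    left: Optional["Node"] = None   # distance to key point < mu
    right: Optional["Node"] = None  # distance to key point >= mu


def build(points: List[float], dist: Dist) -> Optional[Node]:
    if not points:
        return None
    vp, rest = points[0], points[1:]
    if not rest:
        return Node(vp)
    mu = median(dist(vp, p) for p in rest)
    return Node(vp, mu,
                build([p for p in rest if dist(vp, p) < mu], dist),
                build([p for p in rest if dist(vp, p) >= mu], dist))


def knn(root: Optional[Node], target: float, k: int,
        dist: Dist) -> List[Tuple[float, float]]:
    """Return up to k (distance, item) pairs closest to target, nearest first."""
    heap: List[Tuple[float, float]] = []  # max-heap via negated distances

    def tolerance() -> float:
        # Current tolerance: largest matched distance once k matches exist.
        return -heap[0][0] if len(heap) == k else float("inf")

    def visit(node: Optional[Node]) -> None:
        if node is None:
            return
        d = dist(target, node.point)
        if len(heap) < k:
            heapq.heappush(heap, (-d, node.point))
        elif d < tolerance():
            heapq.heapreplace(heap, (-d, node.point))  # tolerance shrinks here
        if d < node.mu:                      # target lies on the left side
            visit(node.left)
            if d >= node.mu - tolerance():   # right side may still hold matches
                visit(node.right)
        else:
            visit(node.right)
            if d < node.mu + tolerance():    # left side may still hold matches
                visit(node.left)

    visit(root)
    return sorted((-neg_d, p) for neg_d, p in heap)


if __name__ == "__main__":
    data = [5.0, 1.0, 9.0, 3.0, 7.0, 2.0, 8.0]
    metric = lambda a, b: abs(a - b)
    print(knn(build(data, metric), 4.2, 3, metric))
```

Because the tolerance is re-read before each pruning decision, every improvement in the k-th best distance immediately narrows the remaining search, which is the convergence behaviour described above.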

In an optional implementation manner of this embodiment, determining, according to a preset tolerance distance, a third preset number of sub-nodes matched with the retrieval trajectory in the target keypoint tree may include:

when it is determined, according to the preset tolerance distance, that the number of sub-nodes in the target key point tree matching the retrieval track is smaller than the third preset number, searching for sub-nodes matching the retrieval track in the leaf nodes adjacent to the matched sub-nodes until a third preset number of sub-nodes matching the retrieval track are obtained.

It can be understood that, when the third preset number is greater than the depth of the target key point tree, a third preset number of sub-nodes matching the retrieval track may not be obtained through a single search; for example, if the preset tolerance distance is 0, the number of matched sub-nodes that can be obtained is at most the depth of the target key point tree. In this case, a backtracking algorithm can be used to backtrack along the search path, so that matching searches are performed in turn on the leaf nodes adjacent to the search path until a third preset number of sub-nodes matching the retrieval track are obtained.

In this embodiment, when a sufficient number of matching sub-nodes cannot be obtained through one pre-search, the adjacent leaf nodes are searched through a backtracking algorithm, so that a third preset number of matching sub-nodes can be obtained and the preset tolerance distance can always be updated; the retrieval track search method of this embodiment can therefore be applied to target key point trees of any depth.

Optionally, before searching the retrieval track, the number of nodes of the target keypoint tree may be predetermined, and when the number of nodes is small, that is, when the stored track data is small, the target keypoint tree may be directly traversed, and the distance between the retrieval track and each child node is calculated to determine a third preset number of child nodes closest to the retrieval track.
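For the small-data case, the direct traversal can be sketched in Python with a full scan. The function name nearest_by_full_scan and the numeric stand-ins for tracks are hypothetical; the threshold below which a full scan is preferred is not specified in this embodiment and would be chosen by the implementer.

```python
import heapq
from typing import Callable, List


def nearest_by_full_scan(items: List[float], target: float, k: int,
                         dist: Callable[[float, float], float]) -> List[float]:
    """Compute the distance from target to every stored item and
    return the k closest items, nearest first."""
    return heapq.nsmallest(k, items, key=lambda p: dist(target, p))


if __name__ == "__main__":
    stored = [5.0, 1.0, 9.0, 3.0]
    print(nearest_by_full_scan(stored, 4.2, 2, lambda a, b: abs(a - b)))
```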

According to the technical scheme provided by the embodiment of the invention, after the target key point tree corresponding to the track data set is obtained, when a retrieval request for a retrieval track is received, a third preset number of sub-nodes matching the retrieval track are determined in the target key point tree according to a preset tolerance distance; the preset tolerance distance is then updated using the maximum distance between the retrieval track and the matched sub-nodes; and the operation of determining a third preset number of sub-nodes matching the retrieval track in the target key point tree according to the preset tolerance distance is executed again until the search of the target key point tree is completed and a third preset number of target sub-nodes matching the retrieval track are acquired. Since the initial preset tolerance distance is updated according to the pre-search result, the convergence speed of the tolerance distance is improved, and the search efficiency of the target key point tree is improved.

Fig. 4 is a schematic structural diagram of a track data processing apparatus according to another embodiment of the present invention. As shown in fig. 4, the apparatus includes: a distance calculation module 410, a trajectory data set partitioning module 420, and a target keypoint tree acquisition module 430, wherein:

a distance calculating module 410, configured to determine a first key point in the track data set, and calculate a distance between the first key point and each other track data in the track data set;

a track data set dividing module 420, configured to divide the track data set into at least one track data subset according to a distance between the first key point and each other track data in the track data set;

a target keypoint tree obtaining module 430, configured to recursively determine a second keypoint in each trajectory data subset, and perform a dividing operation on the corresponding trajectory data subset according to the second keypoint until a trajectory data subset that cannot be further divided is obtained, so as to obtain a target keypoint tree corresponding to the trajectory data set.

Optionally, on the basis of the foregoing technical solution, the distance calculating module 410 includes:

the candidate key point acquisition unit is used for randomly sampling in the track data set, acquiring a first preset number of candidate key points and a second preset number of reference points corresponding to the candidate key points respectively, and calculating the distance between each candidate key point and each corresponding reference point respectively;

a boundary value difference calculation unit, configured to divide each reference point corresponding to each candidate key point into at least one reference point set according to the distance between each candidate key point and each corresponding reference point, and calculate a boundary value difference between each candidate key point and each reference point set; the boundary value comprises the maximum distance value and the minimum distance value between the candidate key point and the reference points in the corresponding reference point set;

the first key point determining unit is used for determining the score of each candidate key point according to the boundary value difference value of each candidate key point corresponding to each reference point set; and determining the first key point with the highest score in the candidate key points according to the scores of the candidate key points.

Optionally, on the basis of the above technical solution, the first keypoint determining unit is specifically configured to calculate a sum of boundary value differences between each candidate keypoint and each reference point set, and a variance of the boundary value differences between each candidate keypoint and each reference point set, respectively; and determining the scores of the candidate key points according to the sum and the variance.

Optionally, on the basis of the above technical solution, the first keypoint determining unit is specifically configured to: calculate a score p of each candidate key point, where p = SUM / ln(e + VAR); SUM represents the sum of the boundary value differences of the candidate key point over its reference point sets, VAR represents the variance of those boundary value differences, ln represents the natural logarithm function, and e represents the base of the natural logarithm.
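The score formula can be sketched in Python as follows. The function name keypoint_score is hypothetical, the boundary value differences are assumed to have been computed already and are passed in as a list, and the use of the population variance is an assumption, since the text does not state which variance estimator is intended.

```python
import math
from statistics import pvariance
from typing import Sequence


def keypoint_score(boundary_diffs: Sequence[float]) -> float:
    """Score p = SUM / ln(e + VAR) for one candidate key point.

    boundary_diffs holds one boundary value difference per reference point set.
    """
    total = sum(boundary_diffs)
    var = pvariance(boundary_diffs)  # assumption: population variance
    return total / math.log(math.e + var)


if __name__ == "__main__":
    # Two candidates with the same sum; the more uniform one scores higher.
    print(keypoint_score([2.0, 2.0, 2.0]))
    print(keypoint_score([1.0, 1.0, 4.0]))
```

Since ln(e + VAR) equals 1 when the variance is 0 and grows with the variance, the formula favours candidates whose boundary value differences are both large in total and evenly spread across the reference point sets.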

Optionally, on the basis of the foregoing technical solution, the trajectory data set partitioning module 420 includes:

the ascending sequencing unit is used for ascending sequencing the other track data according to the distance between the first key point and the other track data in the track data set;

the track data set segmentation value determining unit is used for determining at least one track data set segmentation value according to the maximum distance value between the first key point and each other track data in the track data set;

and the track data subset dividing unit is used for dividing other track data which are sorted in an ascending order into at least one track data subset according to the cutting value of each track data set.
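The three units above (ascending sort, determination of segmentation values, subset division) can be sketched in Python as follows. How the segmentation values are derived from the maximum distance is not detailed in this passage; the sketch assumes equal-width cut values max_distance * i / n_parts, and the function name split_by_cut_values, the parameter n_parts and the numeric stand-ins for tracks are hypothetical.

```python
from bisect import bisect_right
from typing import Callable, List, Tuple


def split_by_cut_values(key_point: float, others: List[float],
                        dist: Callable[[float, float], float],
                        n_parts: int) -> Tuple[List[float], List[List[float]]]:
    """Sort items by distance to the key point, derive cut values from the
    maximum distance, and split the sorted items at those cut values."""
    ranked = sorted(others, key=lambda p: dist(key_point, p))  # ascending order
    dists = [dist(key_point, p) for p in ranked]
    max_d = dists[-1] if dists else 0.0
    # Assumption: equal-width cut values between 0 and the maximum distance.
    cuts = [max_d * i / n_parts for i in range(1, n_parts)]
    subsets, start = [], 0
    for c in cuts:
        end = bisect_right(dists, c, lo=start)  # items with distance <= c
        subsets.append(ranked[start:end])
        start = end
    subsets.append(ranked[start:])
    return cuts, [s for s in subsets if s]      # drop empty subsets


if __name__ == "__main__":
    cuts, parts = split_by_cut_values(
        0.0, [9.0, 1.0, 4.0, 2.0, 6.0, 3.0], lambda a, b: abs(a - b), 3)
    print(cuts, parts)
```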

Optionally, on the basis of the above technical solution, the processing apparatus of trajectory data further includes:

the sub-node determining module is used for determining a third preset number of sub-nodes matched with the retrieval track in the target key point tree according to the preset tolerance distance when the retrieval request of the retrieval track is received;

the target sub-node acquisition module is used for updating the preset tolerance distance using the maximum distance between the retrieval track and the matched sub-nodes; and returning to execute the operation of determining a third preset number of sub-nodes matching the retrieval track in the target key point tree according to the preset tolerance distance until the search of the target key point tree is completed, so as to acquire a third preset number of target sub-nodes matching the retrieval track.

Optionally, on the basis of the above technical scheme, the sub-node determining module is specifically configured to, when it is determined that the number of sub-nodes matched with the retrieval trajectory is smaller than a third preset number in the target keypoint tree according to a preset tolerance distance, search, in adjacent leaf nodes of each sub-node matched with the retrieval trajectory, sub-nodes matched with the retrieval trajectory until a third preset number of sub-nodes matched with the retrieval trajectory are obtained.

The device can execute the processing method of the track data provided by all the embodiments of the invention, and has the corresponding functional modules and beneficial effects for executing the method. For technical details which are not described in detail in the embodiments of the present invention, reference may be made to the methods provided in all the aforementioned embodiments of the present invention.

Fig. 5 is a schematic structural diagram of a computer apparatus according to another embodiment of the present invention, as shown in fig. 5, the computer apparatus includes a processor 510, a memory 520, an input device 530, and an output device 540; the number of the processors 510 in the computer device may be one or more, and one processor 510 is taken as an example in fig. 5; the processor 510, the memory 520, the input device 530 and the output device 540 in the computer apparatus may be connected by a bus or other means, and the connection by the bus is exemplified in fig. 5.

The memory 520 is a computer-readable storage medium, and can be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to a processing method of trajectory data in any embodiment of the present invention (for example, the distance calculating module 410, the trajectory data set partitioning module 420, and the target keypoint tree obtaining module 430 in a processing apparatus of trajectory data). The processor 510 executes various functional applications and data processing of the computer device by executing software programs, instructions and modules stored in the memory 520, that is, implements the method for processing trajectory data described above.

The memory 520 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created according to the use of the terminal, and the like. Further, the memory 520 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, memory 520 may further include memory located remotely from processor 510, which may be connected to a computer device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 530 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the computer apparatus, and may include a keyboard and a mouse, etc. The output device 540 may include a display device such as a display screen.

Embodiments of the present invention further provide a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the method for processing trajectory data according to any embodiment of the present invention.

From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.

It should be noted that, in the embodiment of the track data processing apparatus, the units and modules included in the track data processing apparatus are only divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.

It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims

1. A method for processing trajectory data, comprising:

2. The method of claim 1, wherein determining a first keypoint in a trajectory dataset comprises:

randomly sampling in the track data set to obtain a first preset number of candidate key points and a second preset number of reference points corresponding to the candidate key points respectively, and calculating the distance between each candidate key point and each corresponding reference point respectively;

dividing each reference point corresponding to each candidate key point into at least one reference point set according to the distance between each candidate key point and each corresponding reference point, and respectively calculating the boundary value difference value of each reference point set corresponding to each candidate key point; the boundary value comprises the maximum distance value and the minimum distance value between the candidate key point and the reference points in the corresponding reference point set;

determining the score of each candidate key point according to the boundary value difference value of each candidate key point corresponding to each reference point set; and determining the first key point with the highest score in the candidate key points according to the scores of the candidate key points.

3. The method of claim 2, wherein determining the score for each candidate keypoint based on the boundary value difference for each set of reference points for each candidate keypoint comprises:

respectively calculating the sum of the boundary value difference values of the candidate key points corresponding to the reference point sets and the variance of the boundary value difference values of the candidate key points corresponding to the reference point sets;

and determining the scores of the candidate key points according to the sum and the variance.

4. The method of claim 3, wherein determining a score for each candidate keypoint based on the sum and variance comprises:

5. The method of claim 1, wherein dividing the trajectory data set into at least one subset of trajectory data based on a distance of the first keypoint from each other trajectory data in the trajectory data set comprises:

according to the distance between the first key point and each other track data in the track data set, sequencing each other track data in an ascending order;

determining at least one track data set cutting value according to the maximum distance value between the first key point and each other track data in the track data set;

and dividing other track data sorted in an ascending order into at least one track data subset according to the cutting value of each track data set.

6. The method of claim 1, further comprising:

when a retrieval request of a retrieval track is received, determining a third preset number of sub-nodes matched with the retrieval track in the target key point tree according to a preset tolerance distance;

updating the preset tolerance distance using the maximum distance between the retrieval track and the matched sub-nodes; and returning to execute the operation of determining a third preset number of sub-nodes matching the retrieval track in the target key point tree according to the preset tolerance distance until the search of the target key point tree is completed, so as to acquire a third preset number of target sub-nodes matching the retrieval track.

7. The method of claim 6, wherein determining a third preset number of sub-nodes matching the retrieval track in the target keypoint tree according to a preset tolerance distance comprises:

8. A computer device comprising a processor and a memory, the memory to store instructions that, when executed, cause the processor to:

9. The computer device of claim 8, wherein the processor is configured to determine the first keypoint in the trajectory data set by:

10. The computer device of claim 9, wherein the processor is configured to determine the score for each candidate keypoint based on the difference between the boundary values of the candidate keypoint and the reference point set by:

11. The computer device of claim 10, wherein the processor is configured to determine the score for each candidate keypoint based on the sum and variance by:

12. The computer device of claim 8, wherein the processor is configured to divide the trajectory data set into at least one subset of trajectory data based on a distance of the first keypoint from each other trajectory data in the trajectory data set by:

13. The computer device of claim 8, wherein the processor is further configured to:

14. The computer device of claim 13, wherein the processor is configured to determine a third preset number of sub-nodes in the target keypoint tree for which the search trajectory matches according to a preset tolerance distance by:

15. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out a method of processing trajectory data according to any one of claims 1 to 7.