CN110598917B - Destination prediction method, system and storage medium based on path track

Info

Publication number: CN110598917B
Application number: CN201910788582.7A
Authority: CN
Inventors: 余明辉; 王昌栋; 詹增荣
Original assignee: Guangzhou Panyu Polytechnic
Current assignee: Guangzhou Panyu Polytechnic
Priority date: 2019-08-23
Filing date: 2019-08-23
Publication date: 2020-11-24
Anticipated expiration: 2039-08-23
Also published as: CN110598917A

Abstract

The invention discloses a destination prediction method, system and storage medium based on a path trajectory. The method comprises the following steps: converting the path data in an order to be predicted into numerical form according to a preset conversion rule to obtain corresponding path trajectory representation data; inputting the path trajectory representation data into a locality-sensitive hashing (LSH) model and computing a corresponding matching result with the LSH algorithm; and clustering the matching results with the K-Means algorithm according to the number of destinations, calculating the path similarity of the matching results in each cluster according to a similarity measurement formula, and selecting the path with the highest similarity in each cluster as the multi-destination prediction result. The method matches a set of approximately similar paths through the hash model and applies different procedures to the different problems of predicting a single destination and screening multiple destinations, which improves destination prediction based on path trajectories and provides a reference for applications in which personal information about users or drivers is unavailable.

Description

Destination prediction method, system and storage medium based on path track

Technical Field

The invention relates to the technical field of intelligent traffic information processing, in particular to a destination prediction method and system based on a path track and a storage medium.

Background

With the development of GPS and 4G networks, modern mobile devices such as smartphones generally have built-in GPS receivers and navigation systems that can locate users with high accuracy. These devices generate large amounts of location data that can be used for a variety of Location-Based Services (LBS), including route planning, real-time road-condition feedback, recommendations for dining, shopping or tourist attractions, and location-based social network analysis. LBS applications greatly facilitate daily life, and among the most popular are taxi-hailing applications. These platforms collect a large amount of order and trajectory information every day. The collected data provide unprecedented opportunities to mine the behavior characteristics of drivers and users, and real-time intelligent decision systems such as order scheduling, taxi demand prediction and route planning can be built on them for different applications.

As the demand for location-based applications grows, more and more research addresses predicting the destination of a trip in progress from the driver's travel trajectory. The value of destination prediction for ad placement is apparent. When a user is riding in a taxi, the LBS provider can collect location information from the GPS device of the user's phone or of the taxi, predict the most likely destination, and recommend advertisements for restaurants or malls near that destination; if the user is determined to be a traveler, it can instead recommend sightseeing spots. Destination prediction can also assist in route determination, crowd anomaly detection, and the like. In a navigation system, predicting a person's destination can help determine whether the person is deviating from the expected route. As a further potential application, by predicting the places where most people will go in a given time period, an administrator can estimate the crowd size at those places and take preventive measures; in extreme cases, once a route has been acquired, investigators of government agencies can anticipate a suspect's destination and arrange countermeasures in advance.

Because destination prediction matters in the applications above, it has been studied extensively, and one research direction is destination prediction based on trajectory data. Most existing methods are based on various (hidden) Markov chain models. A typical approach divides the area evenly into grid cells, or the roads into segments, uses the cells or segments as the states of a Markov process, and trains the state transition probability matrix of the Markov chain on historical trajectories. However, a (first-order) Markov chain model implicitly assumes that the vehicle travels in a memoryless random manner, which clearly contradicts practical experience: real trajectories are not completely random.

In addition, in the course of research and practice on the prior art, the inventors found that destination prediction often requires more than a single value; in applications such as advertising, several different destinations may be needed. Most existing methods use probabilistic reasoning to calculate the probability of each destination and return the k destinations with the highest probability. When predicting the destination of an ongoing trip, the conditional probabilities of reaching the candidate places are calculated in turn, and the positions corresponding to the top k values (that is, the k most likely places) are returned as the prediction result. This approach does no further processing of the calculated probabilities. The predicted probabilities of the various destinations may be small and close to one another, and the k most probable results returned may lie very close together geographically. In that case the k results are of limited use in real applications, because other places that are geographically distant from these k positions but have similar probability are ignored.

Disclosure of Invention

The technical problem to be solved by the embodiments of the present invention is to provide a destination prediction method, a destination prediction system and a storage medium based on a path trajectory, which can adopt different prediction methods for predicting a unique destination and screening multiple destinations, thereby improving the prediction accuracy.

To solve the above problem, an embodiment of the present invention provides a destination prediction method based on a path trajectory, including at least the following steps:

performing numerical value conversion processing on path data in an order to be predicted according to a preset conversion rule to obtain corresponding path track representation data;

inputting the corresponding path track representation data into a locality-sensitive hash model, and calculating according to an LSH algorithm to obtain a corresponding matching result;

and clustering the corresponding matching results according to the number of destinations by adopting a K-Means algorithm, calculating the path similarity of the matching results in each cluster according to a similarity measurement formula, and selecting the path with the highest similarity in each cluster as a multi-destination prediction result.

Further, the method for predicting a destination based on a path trajectory further includes:

screening sample orders in a preset time range before and after the departure time of the order to be predicted;

clustering the sample orders according to the starting point positions by adopting a K-Means algorithm, and recording corresponding clustering centers;

finding the nearest cluster center according to the starting point position of the order to be predicted, and determining the cluster to which the order belongs;

screening the corresponding matching results, and reserving sample orders which are clustered in the same cluster with the order to be predicted;

and calculating the path similarity of the matching result according to a similarity measurement formula, and selecting the path with the highest similarity in the same cluster as a single destination prediction result.

Further, the numerical value conversion processing specifically includes:

dividing the map into a plurality of grids according to the longitude and latitude lines, and labeling the grids according to the sequence of the longitude and latitude;

and performing numerical value conversion on the two-dimensional point sequence in the path track data, reading line by line in a line scanning mode, and splicing and converting the two-dimensional point sequence into a one-dimensional array.

Further, the preset conversion rule specifically includes: keeping the order in which each coordinate point appears in the original sequence of the path track data; replacing each coordinate point in the original sequence with its grid label; and, if consecutive coordinate points fall in the same grid cell, retaining only one of them.

Further, the locality-sensitive hash model includes an Offline phase and an Online phase, where the Offline phase specifically includes: applying a MinHash transformation to sample trajectories of any length through h designed hash functions to obtain h-dimensional vectors; dividing the h-dimensional data evenly into b Bands by the LSH algorithm, wherein each Band contains the results of r hash functions; for each Band, storing the orders with the same r hash values into the same Bucket, wherein the index value of the Bucket is the corresponding r hash values; and storing the calculation result of the LSH stage for matching the order to be predicted in the subsequent Online stage;

the Online stage specifically comprises the following steps: for an order to be predicted, in the Query process, traversing each Band, finding a corresponding Bucket according to the hash value of the order track in the Band, and taking all sample orders stored in the Buckets as the matching result of the order to be predicted in the hash model.

Further, the similarity measurement formula is specifically SIM(S1, S2) = COMm(S1, S2) / len(S1),

where S1 is the path of the order to be predicted, S2 is a matched sample path, COMm(S1, S2) is the maximum non-contiguous common length of path S2 relative to path S1 (the length of their longest common subsequence), and len(S1) is the length of path S1.

One embodiment of the present invention provides a destination prediction system based on a path trajectory, including:

the path preprocessing module is used for performing numerical conversion processing on path data in the order to be predicted according to a preset conversion rule to obtain corresponding path track representation data;

the locality-sensitive hash model module is used for inputting the corresponding path track representation data into a locality-sensitive hash model and calculating according to an LSH algorithm to obtain a corresponding matching result;

and the multi-destination prediction module is used for clustering the corresponding matching results according to the number of destinations by adopting a K-Means algorithm, calculating the path similarity of the matching results in each cluster according to a similarity measurement formula, and selecting the path with the highest similarity in each cluster as a multi-destination prediction result.

Further, the destination prediction system based on the path trajectory further includes:

the sample screening module is used for screening sample orders in a preset time range before and after the departure time of the order to be predicted in the OffLine stage and the OnLine stage; screening the corresponding matching results, and reserving sample orders which are clustered in the same cluster with the order to be predicted;

the single-destination prediction module is used for clustering the sample orders according to the starting point positions by adopting a K-Means algorithm and recording corresponding clustering centers; finding the nearest cluster center according to the starting point position of the order to be predicted, and determining the cluster to which the order belongs; and calculating the path similarity of the matching result according to a similarity measurement formula, and selecting the path with the highest similarity in the same cluster as a single destination prediction result.

Further, the locality-sensitive hash model module comprises an Offline unit and an Online unit, wherein,

the Offline unit is used for applying a MinHash transformation to sample trajectories of any length through h designed hash functions to obtain h-dimensional vectors; dividing the h-dimensional data evenly into b Bands by the LSH algorithm, wherein each Band contains the results of r hash functions; for each Band, storing the orders with the same r hash values into the same Bucket, wherein the index value of the Bucket is the corresponding r hash values; and storing the calculation result of the LSH stage for matching the order to be predicted in the subsequent Online stage;

and the Online unit is used for traversing each Band in the Query process of the order to be predicted, finding out a corresponding Bucket according to the hash value of the order track in the Band, and taking all sample orders stored in the Buckets as the matching result of the order to be predicted in the hash model.

An embodiment of the present invention further provides a computer-readable storage medium, which includes a stored computer program, wherein when the computer program runs, the apparatus on which the computer-readable storage medium is located is controlled to execute the method for predicting a destination based on a path trajectory as described above.

The embodiment of the invention has the following beneficial effects:

the embodiment of the invention provides a destination prediction method, system and storage medium based on a path trajectory, wherein the method comprises the following steps: converting the path data in an order to be predicted into numerical form according to a preset conversion rule to obtain corresponding path trajectory representation data; inputting the path trajectory representation data into a locality-sensitive hash model and computing a corresponding matching result with the LSH algorithm; and clustering the matching results with the K-Means algorithm according to the number of destinations, calculating the path similarity of the matching results in each cluster according to a similarity measurement formula, and selecting the path with the highest similarity in each cluster as the multi-destination prediction result. Through a MinHash-based locality-sensitive hashing algorithm, the method reduces the dimensionality of the trajectories and matches a set of approximately similar paths; it applies different procedures to the different problems of predicting a single destination and screening multiple destinations, and designs a multi-metric evaluation to verify the effect of the different procedures, which improves destination prediction based on path trajectories. The method uses no personal attribute information and therefore provides a reference for applications in which personal information about users or drivers is unavailable.

Drawings

Fig. 1 is a schematic flowchart of a destination prediction method based on a path trajectory according to an embodiment of the present invention;

fig. 2 is a schematic structural diagram of a destination prediction system based on a path trajectory according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

First, an application scenario that the present invention can provide, such as destination prediction based on path trajectory data, will be described.

The first embodiment of the present invention:

Please refer to fig. 1.

As shown in fig. 1, the method for predicting a destination based on a path trajectory according to this embodiment at least includes the following steps:

s101, performing numerical conversion processing on path data in an order to be predicted according to a preset conversion rule to obtain corresponding path track representation data;

Specifically, for step S101, every point of a path trajectory expressed in WGS-84 coordinates is two-dimensional. To feed the path trajectory into the hash model, the two-dimensional point sequence is converted into a one-dimensional array: the points are read row by row, in a line-scanning manner, and concatenated. Once this path representation is defined, the input format of the model, namely a one-dimensional array, is fixed.

S102, inputting the corresponding path track representation data into a locality-sensitive hash model, and calculating according to an LSH algorithm to obtain a corresponding matching result;

Specifically, for step S102, the whole model can be divided into an OffLine phase, which processes the training data in advance, and an OnLine phase, which handles the order to be predicted. The hash model is used in both phases. In the OffLine phase, all sample orders are partitioned into buckets by the hash model, which is referred to as the "LSH process"; in the OnLine phase, the hash model uses the buckets into which an order to be predicted falls to retrieve the sample orders that have a high probability of following a similar path, which is referred to as the "Query process". There are usually several such orders, but they make up only a very small fraction of all sample orders.

S103, clustering the corresponding matching results according to the number of destinations by adopting a K-Means algorithm, calculating the path similarity of the matching results in each cluster according to a similarity measurement formula, and selecting the path with the highest similarity in each cluster as a multi-destination prediction result.

Specifically, for step S103, the multi-destination prediction problem calls for paths that are highly similar to the order to be predicted while their end points lie far apart, so a new selection scheme is designed: the matching results of the hash sub-model (a set of sample orders) are clustered with K-Means according to the number of destinations, and the path with the highest similarity in each cluster is taken as the result.

In a preferred embodiment, the method for predicting a destination based on a path trajectory further includes:

In particular, for the single-destination prediction problem, the main concern is the deviation of the predicted destination from the actual destination. The Query result of the hash model is further screened by adding time or geographical-distance factors, and an improved similarity measure is then applied to the screened samples to reduce the prediction error. In the OffLine stage, each sample order is given a class label to assist screening; in the OnLine stage, the screening process and the similarity comparison are carried out for the order to be predicted.

It should be noted that screening is essentially a set of conditions, so its form is variable, and different methods or means suit different problems. Screening can be applied in both the OffLine stage and the OnLine stage, and the screening conditions mainly consider the influence of the time attribute and the effect of pre-clustering.

In a preferred embodiment, the numerical value conversion process specifically includes:

In a preferred embodiment, the preset conversion rule specifically includes: keeping the order in which each coordinate point appears in the original sequence of the path track data; replacing each coordinate point in the original sequence with its grid label; and, if consecutive coordinate points fall in the same grid cell, retaining only one of them.

Specifically, assume that the earth's surface is divided into grid cells by lines of longitude and latitude at a fixed angular spacing, and that the cells are labeled in order of longitude and latitude. A path S = {a, b, c, d, e, f, g, h, i, j, k} is converted numerically, using the grid labels and the preset rules, into a one-dimensional array. The preset rules are: the order in which the points appear in the original sequence is preserved; each (coordinate) point is replaced by its grid label; and if consecutive points fall in the same grid cell, that is, the same value repeats several times in a row, only one value is retained.
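The conversion rules above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function names, the `cells_per_degree` parameter and the row-major labeling (latitude row first, then longitude column) are assumptions, since the source only states that cells are labeled in order of longitude and latitude.

```python
import math
from itertools import groupby

def grid_label(lat, lon, cells_per_degree=100):
    """Row-major integer label of the grid cell containing a WGS-84 point.

    cells_per_degree is a hypothetical resolution parameter (100 -> 0.01 degree cells).
    """
    n_rows = 180 * cells_per_degree
    n_cols = 360 * cells_per_degree
    # Clamp so that lat = 90 or lon = 180 stays inside the last row/column.
    row = min(math.floor((lat + 90.0) * cells_per_degree), n_rows - 1)
    col = min(math.floor((lon + 180.0) * cells_per_degree), n_cols - 1)
    return row * n_cols + col

def to_grid_sequence(points, cells_per_degree=100):
    """Convert (lat, lon) points to a one-dimensional list of grid labels.

    Point order is preserved, and runs of consecutive identical labels
    collapse to a single value; a cell revisited later is kept.
    """
    labels = (grid_label(lat, lon, cells_per_degree) for lat, lon in points)
    return [label for label, _ in groupby(labels)]

# Toy path with 1-degree cells so the labels are easy to follow.
path = [(23.1, 113.2), (23.4, 113.9), (23.6, 114.1),
        (24.2, 114.3), (24.4, 114.8), (23.3, 114.5)]
print(to_grid_sequence(path, cells_per_degree=1))
```

Only consecutive repeats are removed, so a trajectory that leaves a cell and later returns to it still records the return visit.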

In a preferred embodiment, the locality sensitive hash model includes an Offline phase and an Online phase, wherein,

the Offline stage specifically includes: applying a MinHash transformation to sample trajectories of any length through h designed hash functions to obtain h-dimensional vectors; dividing the h-dimensional data evenly into b Bands by the LSH algorithm, wherein each Band contains the results of r hash functions; for each Band, storing the orders with the same r hash values into the same Bucket, wherein the index value of the Bucket is the corresponding r hash values; and storing the calculation result of the LSH stage for matching the order to be predicted in the subsequent Online stage;

Specifically, the locality sensitive hash model is divided into an Offline phase and an Online phase.

In the Offline stage, sample trajectories of any length are converted into h-dimensional vectors through h designed hash functions (the MinHash process). Next, the LSH algorithm divides the h-dimensional data evenly into b Bands, each Band containing the results of r hash functions, so that h = b × r. For each Band, the orders with the same r hash values are stored in the same Bucket, whose index value is the corresponding r hash values. The result of the LSH stage is stored so that the order to be predicted can be matched against it in the subsequent Online stage.
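The Offline stage can be sketched as follows. The source does not specify the h hash functions, so this sketch assumes the common universal-hashing form (mul * x + add) mod p; `make_hash_params`, `minhash`, `build_index` and the constant `PRIME` are hypothetical names, and paths are assumed to be non-empty lists of non-negative integer grid labels smaller than `PRIME`.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # 2**31 - 1; assumed to exceed every grid label

def make_hash_params(h, seed=0):
    """Draw h (mul, add) pairs, one per hash function (assumed hash family)."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(h)]

def minhash(path, params):
    """h-dimensional MinHash signature of the set of grid labels in a path."""
    cells = set(path)  # MinHash works on the set, so order is not used
    return [min((mul * x + add) % PRIME for x in cells) for mul, add in params]

def build_index(samples, params, b, r):
    """Split each signature into b bands of r values and bucket the order ids.

    samples: dict mapping order id -> path (list of grid labels).
    Returns one dict per band: tuple of r hash values -> list of order ids.
    """
    if len(params) != b * r:
        raise ValueError("h must equal b * r")
    index = [defaultdict(list) for _ in range(b)]
    for order_id, path in samples.items():
        signature = minhash(path, params)
        for band in range(b):
            key = tuple(signature[band * r:(band + 1) * r])
            index[band][key].append(order_id)
    return index

params = make_hash_params(h=8)
samples = {"a": [1, 2, 3], "b": [1, 2, 3], "c": [900, 901]}
index = build_index(samples, params, b=4, r=2)
```

Because the bucket key is the tuple of r hash values itself, two orders share a bucket in a band only when all r values in that band agree.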

In the Online stage, for an order to be predicted, the locality-sensitive hash model traverses each Band during the Query process and finds the corresponding Bucket according to the hash values of the order's trajectory in that Band; note that each Band corresponds to at most one Bucket. All sample orders stored in these Buckets together form the matching result of the order to be predicted in the hash sub-model. It should be noted that, in this embodiment, the LSH model uses a similarity threshold of 50% and 256 hash functions.
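A minimal sketch of the Query process follows, assuming the index is a list with one dictionary per Band that maps a tuple of r hash values to the order ids in that Bucket; the function names are hypothetical. The second function is the standard banding estimate of how likely two paths with Jaccard similarity s are to share at least one Bucket. The source states the 50% threshold and the 256 hash functions but not the values of b and r, so none are assumed here.

```python
def query(signature, index, r):
    """Union of the sample orders found in the one matching bucket per band."""
    matches = set()
    for band, buckets in enumerate(index):
        key = tuple(signature[band * r:(band + 1) * r])
        # At most one bucket per band can match; a missing key adds nothing.
        matches.update(buckets.get(key, ()))
    return matches

def candidate_probability(s, b, r):
    """Probability that two paths with Jaccard similarity s collide in some band."""
    return 1.0 - (1.0 - s ** r) ** b

# Hand-built index with b = 2 bands and r = 2 hash values per band.
toy_index = [
    {(1, 2): ["a", "b"], (9, 9): ["c"]},
    {(3, 4): ["a"], (5, 6): ["b", "d"]},
]
print(sorted(query([1, 2, 5, 6], toy_index, r=2)))
```

For a fixed h = b × r, raising r makes each band stricter while raising b gives more chances to collide, which is how the effective similarity threshold is tuned.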

In a preferred embodiment, the similarity measurement formula is SIM(S1, S2) = COMm(S1, S2) / len(S1).

Specifically, in this embodiment, two ways of calculating the similarity are adopted:

(1) Jaccard similarity, which is the similarity used by MinHash and LSH when screening similar paths. This measure has two significant disadvantages. First, it depends on the lengths of both paths, because all elements contained in the two dimension-reduced paths enter the calculation. Second, the hashing process discards the temporal order of the original path, which hinders accurate destination prediction. There are two remedies for the second problem: first, screening paths by clustering their starting points; second, defining a custom measure that takes order into account, which also overcomes the first problem.

(2) The longest non-contiguous similarity SIMm. If a path is regarded as a character string, the task becomes finding the maximum non-contiguous common length of two strings (the length of their longest common subsequence), which is usually solved by dynamic programming. Denote the maximum non-contiguous common length of path S2 relative to S1 by COMm(S1, S2). The goal is to find paths similar to the path S1, but the Jaccard similarity takes the lengths of both paths into account, so if S1 is short and the matched path is long, the result is dominated by the long path. To make the measure more reasonable, the similarity calculation is modified to respect path order and to emphasize how similar the other path is relative to S1. The improved similarity measurement formula is as follows:

SIM(S1, S2) = COMm(S1, S2) / len(S1)

here, len (S1) is the length of the path S1.
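The two measures can be compared with a short sketch. It interprets COMm as the length of the longest common subsequence, following the string analogy and the dynamic-programming remark above; the function names are hypothetical and paths are assumed to be lists of grid labels.

```python
def jaccard(p1, p2):
    """Jaccard similarity of the sets of grid labels in two paths."""
    a, b = set(p1), set(p2)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def lcs_length(s1, s2):
    """Length of the longest common (not necessarily contiguous) subsequence."""
    prev = [0] * (len(s2) + 1)
    for x in s1:
        curr = [0]
        for j, y in enumerate(s2, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def sim(s1, s2):
    """SIM(S1, S2) = COMm(S1, S2) / len(S1), normalised by the query path only."""
    if not s1:
        raise ValueError("the path to be predicted must not be empty")
    return lcs_length(s1, s2) / len(s1)

query_path = [1, 2, 3]
print(jaccard(query_path, [3, 2, 1]), sim(query_path, [3, 2, 1]))
print(jaccard(query_path, [1, 9, 2, 8, 3, 7]), sim(query_path, [1, 9, 2, 8, 3, 7]))
```

The reversed path scores 1.0 under Jaccard but only 1/3 under SIM, which reflects the ordering problem, while the longer path that contains the query in order scores 0.5 under Jaccard but 1.0 under SIM, which reflects the length problem.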

The embodiment provides a destination prediction method based on a path trajectory, which comprises the following steps: converting the path data in an order to be predicted into numerical form according to a preset conversion rule to obtain corresponding path trajectory representation data; inputting the path trajectory representation data into a locality-sensitive hash model and computing a corresponding matching result with the LSH algorithm; and clustering the matching results with the K-Means algorithm according to the number of destinations, calculating the path similarity of the matching results in each cluster according to a similarity measurement formula, and selecting the path with the highest similarity in each cluster as the multi-destination prediction result. Through a MinHash-based locality-sensitive hashing algorithm, the method reduces the dimensionality of the trajectories and matches a set of approximately similar paths; it applies different procedures to the different problems of predicting a single destination and screening multiple destinations, and designs a multi-metric evaluation to verify the effect of the different procedures, which improves destination prediction based on path trajectories. The method uses no personal attribute information and therefore provides a reference for applications in which personal information about users or drivers is unavailable.

Second embodiment of the invention

Please refer to fig. 2.

As shown in fig. 2, an embodiment of the present invention further provides a destination prediction system based on a path trajectory, including:

the path preprocessing module 100 is configured to perform numerical conversion processing on path data in an order to be predicted according to a preset conversion rule to obtain corresponding path trajectory representation data;

Specifically, for the path preprocessing module 100, every point of a path trajectory expressed in WGS-84 coordinates is two-dimensional. To feed the path trajectory into the hash model, the two-dimensional point sequence is converted into a one-dimensional array: the points are read row by row, in a line-scanning manner, and concatenated. Once this path representation is defined, the input format of the model, namely a one-dimensional array, is fixed.

The locality sensitive hash model module 200 is configured to input the corresponding path trajectory representation data to a locality sensitive hash model, and obtain a corresponding matching result according to calculation of an LSH algorithm;

Specifically, for the locality-sensitive hash model module 200, the whole model can be divided into an OffLine phase, which processes the training data in advance, and an OnLine phase, which handles the order to be predicted. The hash model is used in both phases. In the OffLine phase, all sample orders are partitioned into buckets by the hash model, which is referred to as the "LSH process"; in the OnLine phase, the hash model uses the buckets into which an order to be predicted falls to retrieve the sample orders that have a high probability of following a similar path, which is referred to as the "Query process". There are usually several such orders, but they make up only a very small fraction of all sample orders.

And the multi-destination prediction module 300 is configured to cluster the corresponding matching results according to the number of destinations by using a K-Means algorithm, calculate the path similarity of the matching results in each cluster according to a similarity measurement formula, and select a path with the highest similarity in each cluster as a multi-destination prediction result.

Specifically, for the multi-destination prediction module 300, the multi-destination prediction problem calls for paths that are highly similar to the order to be predicted while their end points lie far apart, so a new selection scheme is designed: the matching results of the hash sub-model (a set of sample orders) are clustered with K-Means according to the number of destinations, and the path with the highest similarity in each cluster is taken as the result.

For the similarity calculation, this embodiment adopts the same two similarity measures as above, Jaccard similarity and the longest non-contiguous similarity SIMm; the latter is computed as:

SIM(S1, S2) = COMm(S1, S2) / len(S1)

here, len (S1) is the length of the path S1.

In a preferred embodiment, the system for predicting a destination based on a path trajectory further includes:

the sample screening module 400 is used for screening sample orders in a preset time range before and after the departure time of the order to be predicted in the OffLine stage and the Online stage; screening the corresponding matching results, and reserving sample orders which are clustered in the same cluster with the order to be predicted;

specifically, the sample screening module 400 includes sample screening for hash results and cluster selection for hash model matching results.

Sample screening of the hash results is applied to the single-destination prediction problem. The sample order set obtained from LSH is highly similar to the order to be predicted when time factors are ignored. Instead of ranking these sample orders directly by the similarity SIMm, some of them are removed first. Two angles for screening sample orders are provided here, together with the way each is used in the present technique: 1. Time. The departure time of each order to be predicted is available, so sample orders whose departure times fall within a certain range before and after it can be selected. 2. Space. All sample orders are clustered with K-Means according to their starting positions, and the corresponding cluster centers are recorded. The cluster center nearest to the starting position of the order to be predicted is found, which determines the cluster to which the order belongs, and the LSH result is then further screened so that only sample orders in the same cluster as the order to be predicted are kept.
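The two screening angles can be sketched as a single filter. This is an illustration under stated assumptions: the cluster centers are taken as already computed by K-Means in the OffLine stage, the field names `start` and `depart` and the parameter `window_s` are hypothetical, departure times are plain seconds, and distance is plain Euclidean distance on the coordinates rather than a geodesic distance.

```python
import math

def nearest_center(point, centers):
    """Index of the cluster center closest to a starting position."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def screen(candidates, query_order, centers, window_s):
    """Keep LSH matches that pass both the time and the space condition.

    Time: departure within window_s seconds of the order to be predicted.
    Space: starting point in the same start-position cluster as that order.
    """
    query_cluster = nearest_center(query_order["start"], centers)
    return [
        c for c in candidates
        if abs(c["depart"] - query_order["depart"]) <= window_s
        and nearest_center(c["start"], centers) == query_cluster
    ]

centers = [(0.0, 0.0), (10.0, 10.0)]  # assumed output of the OffLine K-Means
query_order = {"start": (0.2, 0.1), "depart": 1000}
lsh_matches = [
    {"id": "a", "start": (0.1, 0.1), "depart": 1100},   # passes both conditions
    {"id": "b", "start": (9.9, 10.0), "depart": 1050},  # different start cluster
    {"id": "c", "start": (0.0, 0.3), "depart": 9000},   # outside the time window
]
kept = screen(lsh_matches, query_order, centers, window_s=600)
print([c["id"] for c in kept])
```

The orders that survive this filter are the ones then ranked by the similarity measure to select the single predicted destination.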

Cluster selection on the matching results of the hash model is applied to the multi-destination prediction problem. Ranking orders directly by the similarity SIMm is very likely to return several results that lie close to one another, which is of little use in practice. K-Means is used here as well, but the objects being clustered are all the results obtained from LSH and the number of clusters equals the number of required destinations; finally, the destination with the highest SIMm in each cluster is selected as the multi-destination prediction result.
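A minimal sketch of this cluster selection follows. It assumes the LSH results are clustered by their destination coordinates, which the source does not state explicitly, and that each candidate already carries its SIMm score; the small pure-Python K-Means and the names `kmeans` and `pick_destinations` are illustrative, and a library implementation could be substituted.

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # requires k <= len(points)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def pick_destinations(candidates, k):
    """candidates: list of (destination_xy, sim_score) pairs from the LSH matches.

    Clusters the destinations into k groups and returns the destination
    with the highest score in each non-empty group.
    """
    labels = kmeans([dest for dest, _ in candidates], k)
    best = {}
    for (dest, score), label in zip(candidates, labels):
        if label not in best or score > best[label][1]:
            best[label] = (dest, score)
    return [dest for dest, _ in best.values()]

candidates = [((0.0, 0.0), 0.9), ((0.1, 0.0), 0.8), ((0.0, 0.1), 0.7),
              ((10.0, 10.0), 0.4), ((10.1, 10.0), 0.6)]
print(sorted(pick_destinations(candidates, k=2)))
```

Taking the two highest scores directly would return (0.0, 0.0) and (0.1, 0.0), which are practically the same place; selecting one result per cluster returns one destination from each region instead.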

The single-destination prediction module 500 is used for clustering the sample orders according to the starting point positions by adopting a K-Means algorithm and recording corresponding clustering centers; finding the nearest cluster center according to the starting point position of the order to be predicted, and determining the cluster to which the order belongs; and calculating the path similarity of the matching result according to a similarity measurement formula, and selecting the path with the highest similarity in the same cluster as a single destination prediction result.

Specifically, for the single-destination prediction module 500, the main concern in the single-destination prediction problem is the deviation between the predicted destination and the actual destination. The query results of the hash sub-model are therefore further screened by time or geographical distance, so that the prediction error is reduced both by screening the samples and by using an improved similarity measure. In the Offline stage, each sample order is marked with a class label that assists screening; in the Online stage, the screening process and the similarity comparison are completed for the order to be predicted. The specific steps are as follows: screening sample orders in a preset time range before and after the departure time of the order to be predicted; clustering the sample orders according to the starting point positions by adopting a K-Means algorithm, and recording the corresponding cluster centers; finding the nearest cluster center according to the starting point position of the order to be predicted, and determining the cluster to which the order belongs; screening the corresponding matching results, and retaining the sample orders that fall in the same cluster as the order to be predicted; and calculating the path similarity of the matching results according to a similarity measurement formula, and selecting the path with the highest similarity in the same cluster as the single-destination prediction result.
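
The final step, ranking the screened sample orders by similarity and taking the best one, can be sketched as below. The sketch assumes the form SIM(S1, S2) = COMm(S1, S2) / len(S1) given in the claims, but the definition of COMm is not reproduced in this section, so `common_count` is only a hypothetical stand-in that counts the grid labels of S1 that also occur in S2. The function names and the (path, destination) pair layout are likewise assumptions.

```python
def common_count(s1, s2):
    """Hypothetical stand-in for COMm: number of grid labels in s1
    that also occur somewhere in s2. The actual COMm may differ."""
    seen = set(s2)
    return sum(1 for label in s1 if label in seen)


def sim(s1, s2):
    """SIM(S1, S2) = COMm(S1, S2) / len(S1); defined as 0.0 for an empty S1."""
    return common_count(s1, s2) / len(s1) if s1 else 0.0


def predict_single_destination(query_path, screened):
    """screened: list of (sample_path, destination) pairs that survived
    the time-based and space-based screening.
    Returns the destination of the most similar sample path, or None."""
    if not screened:
        return None
    _, destination = max(screened, key=lambda item: sim(query_path, item[0]))
    return destination
```

Note that the measure is normalized by the length of the first argument, so it is not symmetric; in this sketch the path of the order to be predicted is always passed as S1.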

In a preferred embodiment, the locally sensitive hash model module 200 includes an Offline unit and an Online unit, wherein,

The destination prediction system based on a path trajectory provided by this embodiment operates as follows: performing numerical value conversion processing on the path data in an order to be predicted according to a preset conversion rule to obtain corresponding path track representation data; inputting the corresponding path track representation data into a locality-sensitive hash model, and calculating according to an LSH algorithm to obtain a corresponding matching result; and clustering the corresponding matching results according to the number of destinations by adopting a K-Means algorithm, calculating the path similarity of the matching results in each cluster according to a similarity measurement formula, and selecting the path with the highest similarity in each cluster as a multi-destination prediction result. With this method, the path tracks can be reduced in dimension and matched against sets of approximate paths through a MinHash-based locality-sensitive hashing algorithm; different methods are adopted for different problems, such as predicting a unique destination and screening multiple destinations; a multi-parameter evaluation is designed to verify the effect of the different methods; and the destination prediction capability based on path tracks is improved. Because the method does not use any personal attribute information, it also provides a reference for applications in which personal information of users or drivers is unavailable.
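
As an illustration of how hashing can match a path against a set of approximate paths, the following is a generic MinHash sketch with banding. It assumes that the minimum-hash algorithm referred to above is MinHash, and it is not the configuration used in this embodiment: the class name `MinHashIndex`, the number of hash functions, the number of bands, and the linear hash family are all assumptions. Paths are assumed to be non-empty lists of non-negative integer grid labels smaller than `PRIME`.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # Mersenne prime 2**31 - 1, modulus of the hash family


def make_hashes(n, seed=0):
    """n random linear hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(n)]


def signature(labels, hashes):
    """MinHash signature: for each hash function, the minimum hash value
    over the set of grid labels (the order of labels is ignored)."""
    items = set(labels)
    return [min((a * x + b) % PRIME for x in items) for a, b in hashes]


class MinHashIndex:
    def __init__(self, n_hashes=32, bands=8, seed=0):
        if n_hashes % bands != 0:
            raise ValueError("n_hashes must be divisible by bands")
        self.hashes = make_hashes(n_hashes, seed)
        self.rows = n_hashes // bands
        self.buckets = defaultdict(set)

    def _keys(self, labels):
        # Split the signature into bands; each band is one bucket key.
        sig = signature(labels, self.hashes)
        for start in range(0, len(sig), self.rows):
            yield (start, tuple(sig[start:start + self.rows]))

    def add(self, order_id, labels):
        """Offline stage: index one sample order."""
        for key in self._keys(labels):
            self.buckets[key].add(order_id)

    def query(self, labels):
        """Online stage: ids of sample orders sharing at least one band."""
        found = set()
        for key in self._keys(labels):
            found |= self.buckets.get(key, set())
        return found
```

The signature has a fixed length regardless of how long the path is, which is the dimension reduction; more bands with fewer rows each make the query return more, and less similar, candidates.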

Another embodiment of the present invention further provides a computer-readable storage medium, which includes a stored computer program, wherein when the computer program runs, the apparatus on which the computer-readable storage medium is located is controlled to execute the method for predicting a destination based on a path trajectory as described above.

In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the modules may be a logical division, and in actual implementation, there may be another division, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

In addition, functional modules in the embodiments of the present invention may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.

The foregoing is directed to the preferred embodiment of the present invention, and it is understood that various changes and modifications may be made by one skilled in the art without departing from the spirit of the invention, and it is intended that such changes and modifications be considered as within the scope of the invention.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

Claims

1. A destination prediction method based on a path track is characterized by at least comprising the following steps:

clustering the corresponding matching results according to the number of destinations by adopting a K-Means algorithm, calculating the path similarity of the matching results in each cluster according to a similarity measurement formula, and selecting the path with the highest similarity in each cluster as a multi-destination prediction result;

2. The method for predicting a destination based on a path trajectory according to claim 1, wherein the numerical conversion process specifically includes:

3. The method for predicting a destination based on a path trajectory according to claim 1, wherein the preset conversion rule specifically comprises: keeping the appearance sequence of each coordinate point in the original sequence in the path track data; replacing each coordinate point in the original sequence by using the grid label; if continuous coordinate points appear in the grid, only one coordinate point is reserved.

4. The method according to claim 1, wherein the locality sensitive hash model comprises an Offline phase and an Online phase, wherein,

5. The method of claim 1, wherein the similarity measurement formula is:

SIM(S1, S2) = COMm(S1, S2) / len(S1)

6. A destination prediction system based on a path trajectory, comprising:

the multi-destination prediction module is used for clustering the corresponding matching results according to the number of destinations by adopting a K-Means algorithm, calculating the path similarity of the matching results in each cluster according to a similarity measurement formula, and selecting the path with the highest similarity in each cluster as a multi-destination prediction result;

7. The path-trajectory-based destination prediction system of claim 6, wherein the locally sensitive hash model module comprises an Offline unit and an Online unit, wherein,

8. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform the method for path-trajectory-based destination prediction according to any one of claims 1 to 5.