CN108829694A - Optimization method for the G-tree in flexible aggregate nearest neighbor queries on road networks - Google Patents

Publication number
CN108829694A
Authority
CN
China
Prior art keywords
point
tree
road network
node
distance
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810342316.7A
Other languages
Chinese (zh)
Inventor
姚斌
过敏意
陈中普
沈耀
陈�全
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN201810342316.7A priority Critical patent/CN108829694A/en
Publication of CN108829694A publication Critical patent/CN108829694A/en
Pending legal-status Critical Current

Abstract

The invention discloses an optimization method for the G-tree in flexible aggregate nearest neighbor (FANN) queries on road networks, comprising the following steps. Step one: build a G-tree index. Step two: define and initialize. Step three: if the queue is empty, terminate; otherwise dequeue an element x and go to step four. Step four: if x is a leaf node, then for every point v inside x compute g_φ(v, Q) using the optimized method (which comprises: initialization; checking whether the size of D is less than φM and whether the queue is empty; dequeuing <dis, e> and checking whether e is a point on the road network), update the final result, and return to step three once the traversal is complete; otherwise go to step five. Step five: traverse the child nodes c of x, compute the minimum possible distance from every point in Q to c, and take the max or the sum of the φM smallest of these distances, denoted τ. Step six: if τ is less than r*, enqueue the child nodes of c and return to step three; if τ is greater than or equal to r*, terminate. The invention effectively improves the efficiency of g_φ, reducing cost and thereby increasing query speed.

Description

Optimization method for the G-tree in flexible aggregate nearest neighbor queries on road networks
Technical field
The invention belongs to the field of computing and relates to query methods for spatial databases, in particular to an optimization method for the G-tree in flexible aggregate nearest neighbor queries on road networks.
Background art
The aggregate nearest neighbor (ANN) query is a classical query in spatial databases with a wide range of applications, such as location-based services. Given a set of query points Q, ANN finds a point in the data point set V whose aggregate distance to all points in Q is minimal. The aggregate function is usually max or sum. The ANN problem has been studied in Euclidean space [see D. Papadias, Q. Shen, Y. Tao, and K. Mouratidis, “Group nearest neighbor queries,” in Data Engineering, 2004. Proceedings. 20th International Conference on. IEEE, 2004, pp. 301-312.] and on road networks [see D. Papadias, Q. Shen, Y. Tao, and K. Mouratidis, “Group nearest neighbor queries,” in Data Engineering, 2004. Proceedings. 20th International Conference on. IEEE, 2004, pp. 301-312.].
In many cases it is more meaningful to consider only part of the query points in Q. Consider the example in Fig. 1: the data point set is V = {v1, v2, …, v8, v9} (circles) and the query point set is Q = {q1, q2, q3, q4} (triangles). Note that v3 and q3, and v5 and q4, each share the same position; q1 lies on (v2, v3) and q2 lies on (v3, v6). Suppose V is the set of candidate positions for building a dock, Q is a set of small cargo distribution centers, and each distribution center can store 1 ton of cargo per day. We now look for a candidate point in V that collects the cargo of all of Q while minimizing the aggregate distance. The result of max-ANN is then v2, with distance 16; the result of sum-ANN is also v2, with distance 52. Because v2 is, relatively speaking, the "center" of Q, this result is intuitive. Suppose, however, that the dock needs only 2 tons of cargo per day, so that only 50% of the distribution centers need to be considered rather than all query points in Q. More precisely, a more general query lets the user specify a parameter φ; the goal is to find a point in V whose aggregate distance to some φ|Q| points of Q is minimal. We call this the flexible aggregate nearest neighbor (FANN) query. If we set φ = 50%, the result of max-FANN is v3, with distance 2; the result of sum-FANN is also v3, with distance 4.
The present invention studies the FANN problem on road networks. The FANN query was first proposed in Euclidean space [see Y. Li, F. Li, K. Yi, B. Yao, and M. Wang, “Flexible aggregate similarity search,” in Proceedings of the 2011 ACM SIGMOD international conference on management of data. ACM, 2011, pp. 1009–1020.]. Compared with Euclidean space, many operations on road networks are more complex. For example, the shortest distance between two points can be determined in constant time in Euclidean space, whereas on a road network this operation depends on a shortest-path algorithm. To obtain a more efficient FANN algorithm on road networks, the topology of the road network must be exploited to prune impossible candidate points.
To our knowledge, there is currently no other research work on FANN on road networks. Our study of FANN is not a simple extension of ANN on road networks [see D. Papadias, Q. Shen, Y. Tao, and K. Mouratidis, “Group nearest neighbor queries,” in Data Engineering, 2004. Proceedings. 20th International Conference on. IEEE, 2004, pp. 301-312.]. The IER algorithm in [D. Papadias, Q. Shen, Y. Tao, and K. Mouratidis, “Group nearest neighbor queries,” in Data Engineering, 2004. Proceedings. 20th International Conference on. IEEE, 2004, pp. 301-312.] relies on the R-tree, but the R-tree does not perform well on road networks. [D. Yan, Z. Zhao, and W. Ng, “Efficient algorithms for finding optimal meeting point on road networks,” Proceedings of the VLDB Endowment, vol. 4, no. 11, 2011.] uses convex hulls to prune impossible points, but its scalability is poor. [M. Safar, “Group k-nearest neighbors queries in spatial network databases,” Journal of geographical systems, vol. 10, no. 4, pp. 407–416, 2008.] and [L. Zhu, Y. Jing, W. Sun, D. Mao, and P. Liu, “Voronoi-based aggregate nearest neighbor query processing in road networks,” in Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM, 2010, pp. 518-521.] partition the road network using Voronoi diagrams, but these often produce unbalanced partitions and are therefore inefficient. In addition, the newly introduced parameter φ makes the result of FANN harder to find: any φM points of Q may form the target subset, and the number of such subsets can reach C(M, φM).
Therefore, a method that can solve the FANN problem on road networks needs to be developed.
Summary of the invention
The technical problem to be solved by the present invention is to provide an optimization method for the G-tree in flexible aggregate nearest neighbor queries on road networks. The method significantly reduces the number of calls to g_φ, reducing cost and thereby increasing query speed.
To solve the above technical problem, the present invention adopts the following technical solution:
The present invention provides an optimization method for the G-tree in flexible aggregate nearest neighbor queries on road networks, which specifically optimizes the computation of g_φ and comprises the following steps:
The first step establishes G tree index to entire road network;
Second step, definition and initialization:
It defines road network G=(V, E, W), wherein V indicates vertex, and E indicates side, and W indicates the weight on side, δ (vi, vj) indicate vi To vjRoad network distance;Q is query set (query objects), size M;FANN is query-defined to be:One FANN is looked into Inquiry is a five-tupleReturn to a tripleSo that:
Wherein p*It is to make flexible polymer apart from the smallest point in V,It is the optimal elastic subset of Q, r*It is exactly at this time Flexible polymer distance;
DefinitionFor flexible polymer function, the subset Q that it receives point p, a V that one belongs to V is used as input, Return to onePair as a result, meet:
WhereinBe a subset of Q and
Initialization:By r*It is initialized as infinity;A priority query is constructed, the root node of G tree is joined the team;
Third step judges whether queue is empty;If queue is sky, terminate;Otherwise go out team and obtain x;
4th step judges whether x is leaf node;If x is leaf node, for v all inside x, calculateFinal result is updated if necessary, and third step is returned to after traversal;Otherwise, into the 5th step;The calculatingUsing the optimization method included the following steps:
1) initializing variable is sky apart from list D;Safeguard a minimum priority query, store q to G tree node or The distance of road network point will by distance-taxis<The root node of 0, G tree>It joins the team;Calculate the inquiry point list about Q;
2) if the size of D is less thanAnd queue is not sky, is entered step 3);Otherwise calculate D maximum value max or Person and sum, as r*
3) go out team to obtain<dis,e>If e is the point on road network, dis is put into D, returns step 2);Otherwise, e is Point on G tree traverses the point v in the inquiry point list of e, calculates the distance of p to v, and v is joined the team, and returns step 2);
5th step, traverses the child node c of x, calculate all the points in Q to c minimum potential range, before obtainingMost narrow spacing From maximum value max or and sum, be denoted as τ;
6th step, judges whether τ is greater than or equal to r*;If τ is less than r*, the child nodes of c are joined the team, third step is returned to; If τ is greater than or equal to r*, then terminate.
As a preferred technical solution of the present invention, in step one, building the G-tree index over the entire road network is specifically: first partition the original graph into mutually disjoint subgraphs, then partition each subgraph in the same way, recursively, until the number of data points in a subgraph is less than a set threshold; then compute the distance matrix of the boundary points of each G-tree subgraph.
As a preferred technical solution of the present invention, the distance matrix is constructed using the following method of computing δ on the G-tree:
Given road network points u and v, let the leaf nodes containing them be C_u and C_v respectively.
When C_u = C_v, first run a local Dijkstra's algorithm inside that leaf node. If no boundary point is encountered during the run, the local Dijkstra's algorithm is considered sufficient. Otherwise, stop Dijkstra's algorithm and compute δ(u, v) with the following formula:
δ(u, v) = min { δ(u, b_1) + δ(b_1, b_2) + δ(v, b_2) | b_1, b_2 ∈ B_c }
where B_c is the boundary point set of C_u or C_v.
When C_u ≠ C_v, any path from u to v must pass through the boundary points of the leaf nodes containing them. Let C_A be the lowest common ancestor of C_u and C_v; the shortest path from u to v must then go bottom-up from C_u to C_A and then top-down from C_A to C_v, which is formulated as:
δ(u, v) = min(δ(u, b_1) + δ(b_1, b_2) + … + δ(b_{m-1}, b_m) + … + δ(b_n, v))
where b_1, b_2, …, b_n are boundary points of the nodes C_u, …, C_v respectively.
As a preferred technical solution of the present invention, δ on the G-tree is computed by dynamic programming: the overall goal δ(u, v) is decomposed into a series of sub-goals, and by storing intermediate results the value of δ(u, v) is obtained in linear time.
As a preferred technical solution of the present invention, in step 1) of step four, the initialization computes the query point list for Q, that is, for each point q in Q, which G-tree nodes contain it, and for each G-tree node, which of its child nodes contain points of Q.
As a preferred technical solution of the present invention, in step five, the minimum possible distance from the points in Q to c is the minimum possible distance from a road network point to a G-tree node.
As a preferred technical solution of the present invention, in step five, τ is a dynamic threshold that serves as a lower bound on r*, i.e., the aggregate distance of any p cannot be smaller than τ; therefore, if τ is greater than or equal to r*, terminate.
As a preferred technical solution of the present invention, step five includes the computation of θ(u, v), as follows: θ(u, v) is used for pruning. As a lower bound on the distance, θ(u, v) can be taken directly as the Euclidean distance. The triangle inequality can also be used: let w be a third point; then δ(u, v) ≥ δ(w, u) − δ(w, v) and δ(u, v) ≥ δ(w, v) − δ(w, u) both hold, so θ(u, v) = max{|δ(w, u) − δ(w, v)|, d_ε(u, v)}.
As a preferred technical solution of the present invention, in step five, to make the bound θ(u, v) tighter, a number of landmarks are set in advance; each landmark is used in turn as the third point in the triangle inequality, and the largest resulting value is taken.
As a preferred technical solution of the present invention, in step two, each element stored in the priority queue is a pair <c, d>, where c is a G-tree node and d is computed as follows: compute the minimum possible distance from every point in Q to c and take the max or the sum of the φM smallest of these distances as d, which is the τ of step five; the queue is ordered by d.
Compared with the prior art, the invention has the following advantages:
1. Top-down traversal of the G-tree allows the entire road network to be accessed quickly.
2. The distance matrices stored in the G-tree allow the distance from a G-tree node to a road network point to be computed in linear time. This significantly reduces the number of calls to g_φ, reducing cost and thereby increasing query speed.
3. With the G-tree, g_φ is turned into a kNN problem; because shortest paths are convenient to compute on the G-tree, the G-tree optimization method is highly efficient.
Brief description of the drawings
The present invention is further explained below with reference to the drawings and embodiments.
Fig. 1 is a schematic example of FANN.
Fig. 2 shows the construction of an example G-tree.
Fig. 3 is a schematic diagram of the stored distance matrices.
Fig. 4 is a flowchart of the optimization method for the G-tree in flexible aggregate nearest neighbor queries on road networks of the present invention.
Fig. 5 is a flowchart of the optimized g_φ algorithm of the present invention.
Fig. 6 is a schematic diagram of path extension in the computation of δ on the G-tree in the present invention.
Fig. 7 compares the efficiency of the G-max algorithm of the present invention and the Baseline algorithm as A varies.
Fig. 8 compares the efficiency of the G-sum algorithm of the present invention and the Baseline algorithm as A varies.
Fig. 9 compares the efficiency of the G-max algorithm of the present invention and the Baseline algorithm as M varies.
Fig. 10 compares the efficiency of the G-sum algorithm of the present invention and the Baseline algorithm as M varies.
Fig. 11 compares the efficiency of the G-max algorithm of the present invention and the Baseline algorithm as φ varies.
Fig. 12 compares the efficiency of the G-sum algorithm of the present invention and the Baseline algorithm as φ varies.
Detailed description of the embodiments
The present invention is now described in further detail with reference to the drawings. The drawings are simplified schematic diagrams that illustrate only the basic structure of the invention, and therefore show only the components relevant to the invention.
1. Problem definition
A road network can be represented as an undirected weighted graph G = (V, E, W), where V is the set of vertices, E is the set of edges, and W is a mapping from E to the positive reals giving the edge weights. Let δ be the distance function defined on G; δ(v_i, v_j) denotes the road network distance from v_i to v_j. Note that the weight of an edge need not equal the Euclidean distance between its endpoints; it may, for example, be the time needed to traverse the edge. Obviously, if edge weights are proportional to Euclidean distance, the conversion is straightforward. We use a method (normalization) similar to that of [M. L. Yiu, N. Mamoulis, and D. Papadias, “Aggregate nearest neighbor queries in road networks,” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 820-833, 2005.] to handle arbitrary weights. First, we compute a scale factor:
max { d_ε(v_i, v_j) / W(v_i, v_j) | (v_i, v_j) ∈ E }
where d_ε(v_i, v_j) denotes the Euclidean distance from v_i to v_j. We then multiply all weights by this scale factor. In this way, the Euclidean distance remains a lower bound on the road network distance.
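The normalization above can be illustrated with a minimal sketch. It assumes the scale factor is the largest ratio of Euclidean length to edge weight over all edges; the function name `normalize_weights`, the dictionary layouts and the three-vertex network are hypothetical and serve only as an illustration.

```python
import math

def normalize_weights(coords, edges):
    """Scale every edge weight by max(Euclidean length / weight).

    coords: {vertex: (x, y)}, edges: {(u, v): weight}; both layouts are
    assumptions made for this sketch.
    """
    factor = max(math.dist(coords[u], coords[v]) / w
                 for (u, v), w in edges.items())
    scaled = {edge: w * factor for edge, w in edges.items()}
    return factor, scaled

# Hypothetical three-vertex network.
coords = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (3.0, 0.0)}
edges = {("a", "b"): 2.0, ("b", "c"): 8.0, ("a", "c"): 3.0}
factor, scaled = normalize_weights(coords, edges)
print(factor, scaled)
```

After scaling, every edge is at least as long as the straight line between its endpoints, so the same holds for every path and the Euclidean distance can be used as a lower bound for pruning.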
We use Q to denote the set of query points (query objects), of size M, and φ to denote the flexibility parameter, where |V| = N, |Q| = M, and φ lies in (0, 1). For ease of description, we assume that all query points lie on vertices of the graph, i.e., Q ⊆ V. Let g denote an aggregate function defined on a point p and a point set P; in the present invention it can be the maximum (max) or the sum (sum):
g(p, P) = max { δ(p, v_i) | 1 ≤ i ≤ k } or g(p, P) = δ(p, v_1) + … + δ(p, v_k)
where |P| = k and v_i belongs to P.
We can then define the flexible aggregate function g_φ. It takes a point p belonging to V and a subset Q of V as input and returns a pair (Q_φ, r_p) satisfying:
r_p = g(p, Q_φ) = min { g(p, Q') | Q' ⊆ Q, |Q'| = φM }
where Q_φ is a subset of Q and |Q_φ| = φM.
Our goal is to find a point p* in V such that r_p is minimal. Formally, a FANN query is defined as follows: a FANN query is a five-tuple that returns a triple (p*, Q*_φ, r*) such that:
r* = min { r_p | p ∈ V }, attained at p = p*, with (Q*_φ, r*) = g_φ(p*, Q)
where p* is the point in V with the smallest flexible aggregate distance, Q*_φ is the optimal flexible subset of Q, and r* is the corresponding flexible aggregate distance.
Given G, Q and the parameter φ, the goal is to find a point in V whose aggregate distance (usually sum or max) to some φM points of Q is minimal.
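The definitions can be made concrete with a small sketch that evaluates g_φ and FANN directly from a table of already known road distances. Rounding φM up with `ceil` is an assumption of this sketch, and the function names and data layout are hypothetical. Because max and sum are both monotone, taking the nearest query points yields the minimizing subset.

```python
import math

def aggregate(dists, kind):
    """g: max or sum of a list of distances."""
    return max(dists) if kind == "max" else sum(dists)

def flexible_aggregate(dist_to_q, phi, kind="max"):
    """g_phi: keep the ceil(phi * M) query points nearest to p.

    dist_to_q: {query point: road distance from p} (assumed precomputed).
    """
    k = math.ceil(phi * len(dist_to_q))
    nearest = sorted(dist_to_q.items(), key=lambda kv: kv[1])[:k]
    subset = [q for q, _ in nearest]
    return subset, aggregate([d for _, d in nearest], kind)

def fann(table, phi, kind="max"):
    """Return (p*, Q*_phi, r*) for table = {p: {q: distance}}."""
    best = None
    for p, dist_to_q in table.items():
        subset, r = flexible_aggregate(dist_to_q, phi, kind)
        if best is None or r < best[2]:
            best = (p, subset, r)
    return best
```

This version is only a reference point: it needs the distance from every candidate to every query point, which is what the methods below try to avoid.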
2. Brute-force method
We first discuss the implementation of g_φ. Given p and Q, there are at most C(M, φM) choices for Q_φ. However, it is not necessary to consider every possibility. Recall the execution of Dijkstra's algorithm (proposed by the Dutch computer scientist Edsger Dijkstra in 1959; it computes shortest paths from one vertex to every other vertex, solving the shortest path problem in directed graphs, and is characterized by expanding outward layer by layer from the source until the destination is reached): at each expansion step it selects the unvisited point nearest to the source and updates the distances of that point's neighbors from the source. This process can also be applied to g_φ. Let p be the source and run Dijkstra's algorithm until φM points of Q have been marked as visited. These marked points are then Q_φ, and their aggregate distance is r_p. It is not hard to see that g_φ can also be regarded as a kNN query on p and Q, where k = φM.
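The Dijkstra-based evaluation of g_φ described above might look as follows. This is a sketch under assumptions: the adjacency-list layout, the name `g_phi_dijkstra` and rounding φM up are all hypothetical choices, and a binary heap is used rather than a Fibonacci heap.

```python
import heapq
import math

def g_phi_dijkstra(adj, p, Q, phi, kind="max"):
    """Run Dijkstra from p and stop once ceil(phi * |Q|) query points are settled.

    adj: {u: [(v, weight), ...]} for an undirected graph (assumed layout).
    """
    k = math.ceil(phi * len(Q))
    targets = set(Q)
    dist = {p: 0.0}
    done = set()
    found = []                      # (query point, distance), nearest first
    heap = [(0.0, p)]
    while heap and len(found) < k:
        d, u = heapq.heappop(heap)
        if u in done:
            continue                # stale heap entry
        done.add(u)
        if u in targets:
            found.append((u, d))
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    if len(found) < k:              # fewer than k query points reachable
        return [q for q, _ in found], math.inf
    ds = [d for _, d in found]
    return [q for q, _ in found], (max(ds) if kind == "max" else sum(ds))
```

The search stops as soon as the k-th query point is settled, so the rest of the network is never expanded.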
According to the above definition of FANN, we can design a brute-force solution: run the g_φ algorithm for every p in V, maintaining the smallest r_p seen so far. Note that this r_p, or the Euclidean distance used as a lower bound, can be used to end an iteration early. A similar strategy is effective for any algorithm that uses g_φ.
We now consider the time complexity of the brute-force method. g_φ has the same complexity as Dijkstra's algorithm, namely O(|E| + N lg N) (assuming the min-heap used is a Fibonacci heap), where |E| is the number of edges in the road network. The total time is therefore O(N|E| + N² lg N).
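A compact sketch of the brute-force method with the early-stop idea follows. It assumes non-negative weights and uses the best r_p found so far as the stopping bound; the helper names `knn_distances` and `brute_force_fann` are hypothetical. Once a settled distance reaches the bound before k query points are found, the candidate cannot improve the result for either max or sum, so its iteration is abandoned.

```python
import heapq
import math

def knn_distances(adj, p, targets, k, bound=math.inf):
    """Distances from p to its k nearest targets, or None if p cannot beat bound."""
    dist, done, out = {p: 0.0}, set(), []
    heap = [(0.0, p)]
    while heap and len(out) < k:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        if d >= bound:              # k-th nearest target is at least this far
            return None
        done.add(u)
        if u in targets:
            out.append(d)
        for v, w in adj[u]:
            if d + w < dist.get(v, math.inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return out if len(out) == k else None

def brute_force_fann(adj, Q, phi, kind="max"):
    """Try every vertex of adj as candidate p and keep the smallest r_p."""
    k = math.ceil(phi * len(Q))     # rounding up is an assumption
    best_p, best_r = None, math.inf
    for p in adj:
        ds = knn_distances(adj, p, set(Q), k, bound=best_r)
        if ds is None:
            continue
        r = max(ds) if kind == "max" else sum(ds)
        if r < best_r:
            best_p, best_r = p, r
    return best_p, best_r
```

Even with early stopping, one Dijkstra run is still started per candidate, which motivates the pruning of whole groups of candidates discussed next.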
Intuitively, there are two ways to optimize the brute-force algorithm: first, prune the points in V as far as possible (i.e., reduce the number of calls to g_φ); second, improve the efficiency of g_φ. The remainder of this description focuses on these two approaches.
3. G-tree-based algorithm
We implement the FANN algorithm using the G-tree index structure, which can simultaneously 1) prune the points in V as far as possible and 2) improve the efficiency of g_φ.
Construction of the G-tree: given a subgraph of the graph G = (V, E, W), we first divide the points of the subgraph into internal points and boundary points according to the positions of their adjacent points. For an internal point, all of its adjacent points lie in the same subgraph as the point itself. For a boundary point, at least one adjacent point lies outside the subgraph to which it belongs. The G-tree is a balanced tree. Each non-leaf node has B (≥ 2) child nodes, and each leaf node contains at most T elements. The G-tree can be constructed recursively. The graph is partitioned as follows: first, a coarser-grained graph is obtained by deleting some edges or points; then the graph is divided into small parts; finally, the result is mapped back to the original graph. Fig. 2 shows an example G-tree.
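The distinction between internal and boundary points can be sketched as below, assuming a partition of the vertices is already given. The function name `split_vertices` and the input layout are hypothetical, and the graph partitioning itself is not shown.

```python
def split_vertices(adj, part):
    """Classify vertices as internal or boundary for a given partition.

    adj: {u: iterable of neighbours}; part: {u: subgraph id} (assumed inputs).
    """
    internal, boundary = {}, {}
    for u, nbrs in adj.items():
        # A boundary point has at least one neighbour in another subgraph.
        crosses = any(part[v] != part[u] for v in nbrs)
        bucket = boundary if crosses else internal
        bucket.setdefault(part[u], set()).add(u)
    return internal, boundary
```

For a path 1-2-3-4 split into {1, 2} and {3, 4}, vertices 2 and 3 come out as boundary points and 1 and 4 as internal points.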
Storage of the G-tree: the storage model is the key to the G-tree. To reduce space overhead, point data is stored only in leaf nodes. Each node is identified by an ID and stores the IDs of all its boundary points, the ID of its parent node and the IDs of its child nodes. Note that the IDs of road network points and the IDs of tree nodes are not in the same domain. We precompute and store some distances within the same level. Specifically, a non-leaf node maintains the pairwise distances between the boundary points of its children; a leaf node maintains the distances between its boundary points and the road network points it contains.
Taking Fig. 2 as an example, the boundary points of G1 are {v3, v4} and the boundary points of G2 are {v6, v7}. G0 therefore maintains the distances among {v3, v4, v6, v7}. The boundary point of G3 is {v3}, so the distances among {v1, v2, v3} need to be maintained. We represent the precomputed distances with matrices (the upper half of each matrix is omitted because δ(v_i, v_j) = δ(v_j, v_i)), as shown in Fig. 3.
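One way to store only the lower half of such a symmetric matrix is sketched here. The class name `TriangularDistances` and the flat-list layout are assumptions, not the storage format of the patent.

```python
class TriangularDistances:
    """Symmetric distance matrix that keeps only the lower triangle."""

    def __init__(self, points):
        self.index = {p: i for i, p in enumerate(points)}
        n = len(points)
        self.data = [0.0] * (n * (n + 1) // 2)

    def _pos(self, u, v):
        i, j = self.index[u], self.index[v]
        if i < j:                   # delta(u, v) == delta(v, u)
            i, j = j, i
        return i * (i + 1) // 2 + j

    def set(self, u, v, d):
        self.data[self._pos(u, v)] = d

    def get(self, u, v):
        return self.data[self._pos(u, v)]

# Matrix kept by G0 in the example above; the value 5.0 is made up.
m = TriangularDistances(["v3", "v4", "v6", "v7"])
m.set("v3", "v6", 5.0)
print(m.get("v6", "v3"))
```

Four boundary points need 10 stored entries instead of 16, and a lookup in either argument order hits the same cell.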
FANN search on the G-tree: given two points u and v in V, we use θ(u, v) to denote the minimum possible distance between u and v. Similarly, given a node C of the G-tree, let B_C be its set of boundary points and define θ(u, C) as the minimum possible distance from u to C:
θ(u, C) = min { θ(u, b) | b ∈ B_C }
The FANN algorithm on the G-tree is described in Algorithm 1 below. We traverse the G-tree from the top down, starting by putting the root node of the G-tree into the queue. When a leaf node is reached, all road network points it contains are processed and the result is updated (lines 7-9). When a non-leaf node is reached, if its lower bound is greater than the current best value, all road network points inside it and the other G-tree nodes in the queue can be pruned (the algorithm terminates); otherwise its child nodes are put into the queue (lines 10-16).
Input:
Output:
Algorithm 1: max-FANN algorithm on the G-tree
As shown in Fig. 4, the optimization method for the G-tree in flexible aggregate nearest neighbor queries on road networks of the present invention comprises the following steps:
Step one: build a G-tree index over the entire road network (graph). Specifically, first partition the original graph into mutually disjoint subgraphs, then partition each subgraph in the same way, recursively, until the number of data points in a subgraph is less than a set threshold. Then compute the distance matrix of the boundary points of each G-tree subgraph (constructing the distance matrices uses the algorithm for δ on the G-tree described below).
Step two: initialization. Initialize r* to infinity; construct a priority queue and enqueue the root node of the G-tree. Each element stored in the priority queue is a pair <c, d>, where c is a G-tree node and d is computed as follows: compute the minimum possible distance from every point in Q to c and take the max of the φM smallest of these distances as d, which is the τ of step five; the queue is ordered by d.
Step three: check whether the queue is empty. If it is empty, terminate the algorithm; otherwise dequeue an element x.
Step four: check whether x is a leaf node. If x is a leaf node, compute g_φ(v, Q) for every point v inside x (i.e., the implementation of g_φ on the G-tree described below, using the optimized g_φ algorithm, see Algorithm 2 below), update the final result if necessary, and return to step three once the traversal is complete; otherwise go to step five.
Step five: traverse the child nodes c of x; compute the minimum possible distance from every point in Q to c (i.e., the computation of θ(u, v) described below) and take the max of the φM smallest of these distances, denoted τ. The minimum possible distance from the points in Q to c is the minimum possible distance from a road network point to a G-tree node. τ is a dynamic threshold that serves as a lower bound on r*, i.e., the aggregate distance of any p cannot be smaller than τ; therefore, if τ is greater than or equal to r*, terminate.
Step six: check whether τ is greater than or equal to r*. If τ is less than r*, enqueue the child nodes of c and return to step three. If τ is greater than or equal to r*, terminate the algorithm.
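The six steps can be sketched as a best-first search. This is one reading of the steps rather than a definitive implementation: each child c is enqueued keyed by its own bound τ, and the search stops when the smallest bound in the queue is no better than r*. To stay self-contained, the sketch uses a hypothetical one-dimensional road in which a vertex is its coordinate, tree nodes are dictionaries with an interval, and g_φ is evaluated by sorting distances instead of Algorithm 2.

```python
import heapq
import math
from itertools import count

def fann_best_first(root, Q, phi, dist, node_bound, kind="max"):
    """Best-first FANN search over a tree of nested regions."""
    k = math.ceil(phi * len(Q))                 # rounding up is an assumption
    agg = max if kind == "max" else sum

    def tau(node):                              # lower bound for all p in node
        return agg(sorted(node_bound(q, node) for q in Q)[:k])

    def g_phi(p):                               # stand-in for Algorithm 2
        return agg(sorted(dist(p, q) for q in Q)[:k])

    best_p, best_r = None, math.inf             # r* starts at infinity
    tie = count()                               # avoids comparing dict nodes
    heap = [(tau(root), next(tie), root)]
    while heap:
        d, _, x = heapq.heappop(heap)
        if d >= best_r:                         # nothing left can improve r*
            break
        if "points" in x:                       # leaf: evaluate its points
            for v in x["points"]:
                r = g_phi(v)
                if r < best_r:
                    best_p, best_r = v, r
        else:                                   # inner node: bound its children
            for c in x["children"]:
                t = tau(c)
                if t < best_r:
                    heapq.heappush(heap, (t, next(tie), c))
    return best_p, best_r

def interval_bound(q, node):
    """Smallest possible distance from coordinate q to the node's interval."""
    return max(0, node["lo"] - q, q - node["hi"])

leaves = [
    {"lo": 0, "hi": 2, "points": [0, 1, 2]},
    {"lo": 5, "hi": 6, "points": [5, 6]},
    {"lo": 9, "hi": 10, "points": [9, 10]},
]
root = {"lo": 0, "hi": 10, "children": leaves}
Q = [1, 5, 6, 10]
result = fann_best_first(root, Q, 0.5, lambda p, q: abs(p - q), interval_bound)
print(result)
```

In this toy run only the middle leaf is evaluated: after it yields r* = 1, the remaining leaves have bound 3 and the search stops without calling g_φ on their points.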
Implementation of g_φ on the G-tree: the computation of g_φ can be regarded as a kNN query on p and Q, in which query points closer to p are considered first. Because the mapping from road network point IDs to tree node IDs is stored, it is easy to determine which nodes store a particular road network point. For example, in Fig. 2 we can determine that the nodes containing v1 are {G6, G2, G0}. Therefore, given Q, we can determine 1) for a leaf node, the query points of Q that it contains; and 2) for a non-leaf node, the child nodes that contain query points of Q. We call this the query point list. For example, in Fig. 2, assume Q = {v1, v4, v5, v8, v9}; the query point list of G3 is {v1}, the query point list of G4 is {v4, v5}, and the query point list of G1 is {G3, G4}.
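Building the query point lists can be sketched as a walk from each query point's leaf up to the root. The parent map, the function name `build_query_lists` and the placement of v8 and v9 in leaf G6 are assumptions made for illustration; the lists for G3, G4 and G1 follow the example in the text.

```python
def build_query_lists(parent, leaf_of, Q):
    """Map each tree node to the query points or child nodes it must visit.

    parent: {node: parent node or None}; leaf_of: {vertex: leaf node}.
    """
    lists = {}
    for q in Q:
        child = leaf_of[q]
        lists.setdefault(child, []).append(q)   # leaf lists hold query points
        node = parent[child]
        while node is not None:
            entries = lists.setdefault(node, [])
            if child not in entries:
                entries.append(child)           # inner lists hold child nodes
            child, node = node, parent[node]
    return lists

# Hypothetical tree shape loosely following Fig. 2.
parent = {"G0": None, "G1": "G0", "G2": "G0",
          "G3": "G1", "G4": "G1", "G5": "G2", "G6": "G2"}
leaf_of = {"v1": "G3", "v4": "G4", "v5": "G4", "v8": "G6", "v9": "G6"}
lists = build_query_lists(parent, leaf_of, ["v1", "v4", "v5", "v8", "v9"])
print(lists)
```

Nodes that contain no query point, such as G5 here, never appear in the result, so a search guided by these lists skips them entirely.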
Given a road network point q and a tree node C, we define σ(q, C) as the distance from q to C.
Many paths are clearly shared when computing σ(q, C), which can be exploited to make the computation more efficient.
The optimized implementation of g_φ is described in Algorithm 2. We maintain a priority queue that stores <dis, obj> pairs ordered by dis, where obj is a point in Q or a node of the G-tree and dis is the distance from q to obj. First, we enqueue <0, root> (line 3); we then iteratively take objects from the queue (line 5). If the dequeued object is a node of the G-tree, we put all elements of that node's query point list into the queue. If it is a road network point, it must belong to Q, and it is added to the final result (line 7). In addition, the dequeued element necessarily has a smaller value (distance) than every element remaining in the queue, from which the correctness of the algorithm is easily verified.
Input:
Output: r_p
Algorithm 2: Optimized g_φ
As shown in Fig. 5, the optimized implementation of g_φ of the present invention (Algorithm 2) comprises the following steps:
1) Initialization. The distance list D is empty. Maintain a min-priority queue that stores the distances from q to G-tree nodes or road network points, and enqueue <0, root node of the G-tree>. Compute the query point list for Q (that is, for each point q in Q, which G-tree nodes contain it, and for each G-tree node, which of its child nodes contain points of Q).
2) If the size of D is less than φM and the queue is not empty, go to step 3); otherwise compute the max or the sum of D as r_p.
3) Dequeue <dis, e>. If e is a point on the road network (and therefore also a point in Q), put dis into D and return to step 2). Otherwise e is a node of the G-tree: traverse the entries v in the query point list of e, compute the distance from q to v, enqueue v, and return to step 2).
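Steps 1) to 3) can be sketched with a single heap that mixes tree nodes and query points. The distance callables, the name `g_phi_on_tree` and the one-dimensional toy data are assumptions; in the method itself these distances would come from the stored G-tree matrices. The sketch relies on the node distance never exceeding the distance to any point inside that node.

```python
import heapq
import math

def g_phi_on_tree(p, root, lists, k, dist, node_dist, kind="max"):
    """Collect the k nearest query points of p by expanding query point lists."""
    D = []
    heap = [(0.0, 0, root)]     # (distance, 0 = tree node / 1 = point, object)
    while len(D) < k and heap:
        d, is_point, e = heapq.heappop(heap)
        if is_point:            # a dequeued road network point belongs to Q
            D.append(d)
            continue
        for entry in lists.get(e, []):
            if entry in lists:  # entry is a child node holding query points
                heapq.heappush(heap, (node_dist(p, entry), 0, entry))
            else:               # entry is a query point
                heapq.heappush(heap, (dist(p, entry), 1, entry))
    if len(D) < k:
        return math.inf
    return max(D) if kind == "max" else sum(D)

# Hypothetical 1-D layout: vertex coordinates and node intervals.
pos = {"v1": 0, "v4": 4, "v5": 5, "v8": 8, "v9": 9}
span = {"G0": (0, 9), "G1": (0, 5), "G2": (6, 9),
        "G3": (0, 2), "G4": (3, 5), "G6": (8, 9)}
lists = {"G0": ["G1", "G2"], "G1": ["G3", "G4"], "G2": ["G6"],
         "G3": ["v1"], "G4": ["v4", "v5"], "G6": ["v8", "v9"]}

def dist(p, v):
    return abs(pos[p] - pos[v])

def node_dist(p, node):
    lo, hi = span[node]
    return max(0, lo - pos[p], pos[p] - hi)

r_p = g_phi_on_tree("v5", "G0", lists, 3, dist, node_dist)
print(r_p)
```

Because entries leave the heap in non-decreasing order of distance, the first k points dequeued are the k nearest query points, which is the correctness argument given for Algorithm 2.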
Advantages: top-down traversal of the G-tree allows the entire road network to be accessed quickly; the distance matrices stored in the G-tree allow the distance from a G-tree node to a road network point to be computed in linear time. This significantly reduces the number of calls to g_φ. With the G-tree, g_φ is turned into a kNN problem. Likewise, because shortest paths are convenient to compute on the G-tree, the above algorithm is highly efficient.
Computation of δ on the G-tree: this method is applied in the G-tree index construction step, i.e., step one above. Given road network points u and v, let the leaf nodes containing them be C_u and C_v respectively. When C_u = C_v, we first run a local Dijkstra's algorithm inside that leaf node. If no boundary point is encountered during the run, the local Dijkstra's algorithm is considered sufficient. Otherwise, we stop Dijkstra's algorithm and compute δ(u, v) with the following formula:
δ(u, v) = min { δ(u, b_1) + δ(b_1, b_2) + δ(v, b_2) | b_1, b_2 ∈ B_c }
where B_c is the boundary point set of C_u or C_v.
When C_u ≠ C_v, any path from u to v must pass through the boundary points of the leaf nodes containing them. Let C_A be the lowest common ancestor of C_u and C_v. The shortest path from u to v must then go bottom-up from C_u to C_A and then top-down from C_A to C_v. For example, in Fig. 2 the shortest path from v1 to v6 must pass through boundary points of G3, G1, G2 and G5; the process is shown in Fig. 6. This can be expressed by the formula:
δ(u, v) = min(δ(u, b_1) + δ(b_1, b_2) + … + δ(b_{m-1}, b_m) + … + δ(b_n, v))
where b_1, b_2, …, b_n are boundary points of the nodes C_u, …, C_v respectively. This can be solved by dynamic programming. Specifically, the overall goal δ(u, v) is decomposed into a series of sub-goals, and by storing intermediate results the value of δ(u, v) is obtained in linear time.
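The dynamic programming step can be sketched as a pass over the boundary sets along the tree path, keeping for each boundary point the best distance from u found so far. The function name `assemble_distance`, the lookup `d` and the numbers in the table are hypothetical; in the method the lookups would be served by the stored distance matrices.

```python
def assemble_distance(u, v, layers, d):
    """Shortest u-v distance through successive boundary sets.

    layers: boundary point sets of the nodes on the path C_u ... C_v.
    d(a, b): precomputed distance between points of adjacent layers.
    """
    cost = {u: 0.0}                         # best known distance from u
    for layer in list(layers) + [[v]]:
        # Sub-goal: best distance from u to every point of the next layer.
        cost = {b: min(c + d(a, b) for a, c in cost.items()) for b in layer}
    return cost[v]

# Made-up distances between consecutive layers.
table = {("u", "b1"): 2.0, ("u", "b2"): 1.0,
         ("b1", "b3"): 1.0, ("b2", "b3"): 5.0, ("b3", "v"): 4.0}
total = assemble_distance("u", "v", [["b1", "b2"], ["b3"]],
                          lambda a, b: table[(a, b)])
print(total)
```

Each layer is processed once, so with boundary sets of bounded size the work grows linearly with the number of nodes on the path.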
Computation of θ(u, v): this method is applied in step five above. As seen in the algorithm above, θ(u, v) is used for pruning. As a lower bound on the distance, θ(u, v) can of course be taken directly as the Euclidean distance. In many cases, however, this lower bound is not tight enough. We can use the triangle inequality: let w be a third point; then δ(u, v) ≥ δ(w, u) − δ(w, v) and δ(u, v) ≥ δ(w, v) − δ(w, u) both hold. Therefore θ(u, v) = max{|δ(w, u) − δ(w, v)|, d_ε(u, v)}. To make the bound θ(u, v) tighter still, we set a number of landmarks at random in advance, use each landmark in turn as the "third point" in the triangle inequality, and take the largest resulting value. The number of landmarks is generally small enough that the preprocessing cost is also small.
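The landmark bound can be sketched as follows, assuming the road distances from each landmark to every vertex were precomputed and the weights were normalized so that the Euclidean distance is a valid lower bound. The name `theta`, the data layout and all numbers are hypothetical.

```python
import math

def theta(u, v, coords, landmark_dist):
    """Lower bound on delta(u, v) from the Euclidean distance and landmarks.

    landmark_dist: {landmark: {vertex: road distance}}, assumed precomputed.
    """
    bound = math.dist(coords[u], coords[v])
    for table in landmark_dist.values():
        # Triangle inequality with the landmark as the third point w.
        bound = max(bound, abs(table[u] - table[v]))
    return bound

coords = {"u": (0.0, 0.0), "v": (3.0, 4.0)}
landmark_dist = {"w": {"u": 2.0, "v": 10.0}, "z": {"u": 4.0, "v": 6.0}}
print(theta("u", "v", coords, landmark_dist))
```

Here the Euclidean distance gives 5 while landmark w gives 8, so the landmark tightens the bound without any shortest-path computation at query time.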
4 Experiments
4.1 Setup
We implemented the above algorithms in standard C++ and ran the experiments on a Linux machine with a 64-bit Intel Xeon 3.30GHz CPU and 16GB RAM. We use an LRU cache of size 1M. All road network data come from the real world, as shown in the table:
For the FANN problem on road networks, many factors affect the cost. In our experiments we focus on the three most important ones:
● A, the coverage ratio of Q
● M, the size of Q
● the flexibility parameter
We vary these three variables one at a time; when one is varied, the other two are held fixed. The default values of A, M and the flexibility parameter are 0.6, 60 and 0.6 respectively. Owing to space limits, we show results only for the SF dataset by default, except when measuring scalability, where the SF, LKS, CTR and USA datasets are shown. In addition, Section 5.4 studies the (relatively) optimal G tree parameters on the different datasets; by default we choose B=6, T=200 (SF), T=300 (LKS), T=400 (CTR) and T=500 (USA).
4.2 Efficiency
First, we compare the efficiency of the inventive algorithm with that of the Baseline algorithm. The Baseline algorithm traverses the road network points in V and then performs the computation based on Dijkstra's algorithm.
Varying A: we vary A over 0.1, 0.2, 0.4, 0.6, 0.8 and 1.0. The results are shown in Fig. 7 and Fig. 8.
Varying M: we vary M over 10, 20, 30, 40, 60, 80 and 100. The results are shown in Fig. 9 and Fig. 10.
Varying the flexibility parameter: we vary it over 0.1, 0.2, 0.4, 0.6, 0.8 and 1.0. The results are shown in Fig. 11 and Fig. 12.
Taking the above preferred embodiments of the present invention as guidance, those skilled in the relevant art can, from the description above, make various changes and modifications without departing from the technical idea of the invention. The technical scope of the invention is not limited to the content of the description and must be determined according to the scope of the claims.

Claims (10)

1. An optimization method for the G tree flexible aggregate nearest neighbor (FANN) query on a road network, characterized by comprising the following steps:
Step 1: build a G tree index over the entire road network;
Step 2: definition and initialization:
Define the road network G = (V, E, W), where V denotes the vertices, E the edges and W the edge weights, and δ(vi, vj) denotes the road network distance from vi to vj; Q is the query set (query objects), of size M; a FANN query is defined as follows: a FANN query is a five-tuple and returns a triple such that:
where p* is the point in V that minimizes the flexible aggregate distance, the returned subset is the optimal flexible subset of Q, and r* is the corresponding flexible aggregate distance;
Define the flexible aggregate function: it takes as input a point p belonging to V and a subset Q of V, and returns a pair as its result, satisfying:
where the returned set is a subset of Q and
Initialization: initialize r* to infinity; construct a priority queue and enqueue the root node of the G tree;
Step 3: check whether the queue is empty; if the queue is empty, terminate; otherwise dequeue to obtain x;
Step 4: check whether x is a leaf node; if x is a leaf node, then for every point v inside x, perform the computation, update the final result if necessary, and return to Step 3 after the traversal; otherwise, go to Step 5; the computation uses an optimization method comprising the following steps:
1) Initialize the distance list D to empty; maintain a min-priority queue that stores the distances from q to G tree nodes or road network points, ordered by distance, and enqueue <0, root node of the G tree>; compute the query point list for Q;
2) If the size of D is less than the required size and the queue is not empty, go to step 3); otherwise compute the maximum (max) or the sum (sum) of D as r*;
3) Dequeue to obtain <dis, e>; if e is a point on the road network, put dis into D and return to step 2); otherwise e is a node of the G tree: traverse the points v in the query point list of e, compute the distance from p to v, enqueue v, and return to step 2);
Step 5: traverse the child nodes c of x, compute the minimum possible distance from every point in Q to c, and take the maximum (max) or the sum (sum) of the required number of smallest such distances, denoted τ;
Step 6: check whether τ is greater than or equal to r*; if τ is less than r*, enqueue the child nodes of c and return to Step 3; if τ is greater than or equal to r*, terminate.
2. The method of claim 1, characterized in that, in Step 1, building the G tree index over the entire road network specifically comprises: first partitioning the original graph into mutually disjoint subgraphs, then partitioning each subgraph in the same way, recursively, until the number of data points inside each subgraph is less than a set threshold; and computing the distance matrix of the boundary points of each G tree subgraph.
3. The method of claim 2, characterized in that the distance matrix is constructed using the following method for realizing δ on the G tree:
Given road network points u and v, assume the leaf nodes containing them are Cu and Cv respectively;
When Cu = Cv, first run a local Dijkstra algorithm inside that leaf node; if the algorithm does not encounter any boundary point during its execution, the local Dijkstra algorithm is considered efficient enough; otherwise, stop the Dijkstra algorithm and compute δ(u, v) using the following formula:
δ(u, v) = min { δ(u, b1) + δ(b1, b2) + δ(v, b2) | b1, b2 ∈ Bc }
where Bc is the boundary point set of Cu or Cv;
When Cu ≠ Cv, any path from u to v must pass through the boundary points of the leaf nodes containing them; let CA be the lowest common ancestor of Cu and Cv; the shortest path from u to v must then go bottom-up from Cu to CA and then top-down from CA to Cv, which is expressed by the formula:
δ(u, v) = min ( δ(u, b1) + δ(b1, b2) + … + δ(bm-1, bm) + … + δ(bn, v) )
where b1, b2, …, bn are boundary points of Cu, …, Cv respectively.
4. The method of claim 3, characterized in that the method for realizing δ on the G tree is solved by dynamic programming: the overall goal δ(u, v) is decomposed into a series of sub-goals, and by storing the intermediate results the value of δ(u, v) is obtained in linear time.
5. The method of claim 1, characterized in that, in step 1) of Step 4, computing the query point list for Q in the initialization means determining, for each point q in Q, which G tree nodes contain it, and, for each G tree node, which of its child nodes contain Q.
6. The method of claim 1, characterized in that, in Step 5, the minimum possible distance from all points in Q to c is the minimum possible distance from a G tree node to a road network point.
7. The method of claim 1, characterized in that, in Step 5, τ is a dynamic threshold serving as a lower bound of r*, i.e. the aggregate distance of any p must be greater than τ, so that if τ is greater than or equal to r*, the method terminates.
8. The method of claim 1, characterized in that Step 5 includes a method for realizing θ(u, v), specifically: θ(u, v) is used for pruning; θ(u, v) serves as a lower bound on the distance and is taken directly as the Euclidean distance; using the triangle inequality: let w be a third point, then δ(u, v) ≥ δ(w, u) − δ(w, v) and δ(u, v) ≥ δ(w, v) − δ(w, u) both hold, and therefore θ(u, v) = max { |δ(w, u) − δ(w, v)|, dε(u, v) }.
9. The method of claim 8, characterized in that, in Step 5, in the method for realizing θ(u, v), in order to tighten the bound θ(u, v) further, some landmarks are placed in advance, each landmark is used in turn as the third point, and the largest value obtained from the triangle inequality is taken.
10. The method of claim 1, characterized in that, in Step 2, the elements stored in the priority queue are pairs <c, d>, where c is a G tree node and d is computed as follows: compute the minimum possible distance from every point in Q to c, and take the maximum (max) or the sum (sum) of the required number of smallest such distances as d, i.e. the τ of Step 5; priority is ordered by the value of d.
CN201810342316.7A 2018-04-17 2018-04-17 The optimization method of flexible polymer K-NN search G tree on road network Pending CN108829694A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810342316.7A CN108829694A (en) 2018-04-17 2018-04-17 The optimization method of flexible polymer K-NN search G tree on road network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810342316.7A CN108829694A (en) 2018-04-17 2018-04-17 The optimization method of flexible polymer K-NN search G tree on road network

Publications (1)

Publication Number Publication Date
CN108829694A true CN108829694A (en) 2018-11-16

Family

ID=64154097

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810342316.7A Pending CN108829694A (en) 2018-04-17 2018-04-17 The optimization method of flexible polymer K-NN search G tree on road network

Country Status (1)

Country Link
CN (1) CN108829694A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109684087A (en) * 2018-12-17 2019-04-26 北京中科寒武纪科技有限公司 Operation method, device and Related product
CN111397632A (en) * 2020-04-13 2020-07-10 清研捷运(天津)智能科技有限公司 Block preprocessing path planning method for large-scale road network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20181116