CN114896480A - Top-K space keyword query method based on road network index - Google Patents

Top-K space keyword query method based on road network index

Info

Publication number
CN114896480A
Authority
CN
China
Prior art keywords
spatial
road network
keyword
key
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210356274.9A
Other languages
Chinese (zh)
Inventor
曾志新
唐洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN202210356274.9A
Publication of CN114896480A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/907 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/909 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9027 Trees

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a Top-K spatial keyword query method based on a road network index, comprising the following steps: 1) defining the related concepts of the road network; 2) performing a weighted calculation using the shortest transit time and a TF-IDF model to obtain a spatial text score function that measures both the spatial proximity and the text similarity of an object; 3) constructing a road network index, the TKG-tree, from the edge and vertex information contained in the road network, where the index comprises transit time matrices recording spatial index information and inverted documents recording text index information; 4) using the spatial text score as the indicator for ranking Top-K results, and finding the Top-K ranked spatial keyword objects with the constructed TKG-tree. The road network index is constructed by dividing the graph into subgraphs, which alleviates the problem of high index storage cost, and the spatial text score of a node is used as an upper bound for pruning objects, which can speed up the search for the Top-K spatial keyword objects.

Description

Top-K space keyword query method based on road network index
Technical Field
The invention relates to the technical field of road network index and Top-K query, in particular to a Top-K space keyword query method based on road network index.
Background
Top-k spatial keyword query is an active research topic in spatial databases. It aims to retrieve a number of objects that are close to the user's location and match the user's query preference, where the query intent is expressed by keywords. The problem considers not only how well an object matches the keywords but also the spatial distance between the user and the object, and it can be divided into two settings according to how spatial distance is measured: Euclidean space and road network space. In Euclidean space, the distance between two points is the straight-line distance; in road network space, it is the shortest distance along the road network.
However, it is very difficult to support efficient Top-K queries over road networks. The main bottleneck is that, unlike the Euclidean distance between two points, the point-to-point shortest path is particularly expensive to compute. In addition, even if the shortest road distance can be obtained quickly, computing the Top-K result is time-consuming without an efficient pruning strategy. Although much work has studied Top-K query processing on road networks, none of the existing methods supports large-scale road networks. The main problem with these methods is that their indexes usually incur excessive storage cost or excessive preprocessing time.
Disclosure of Invention
The invention aims to overcome the defects and shortcomings of the prior art by providing a Top-K spatial keyword query method based on a road network index. The method constructs the road network index by dividing the graph into subgraphs, which alleviates the problem of high index storage cost, and uses the spatial text score of a node as an upper bound for pruning objects, which can speed up the search for the Top-K spatial keyword objects.
In order to achieve the purpose, the technical scheme provided by the invention is as follows: a Top-K space keyword query method based on a road network index comprises the following steps:
1) defining the related concepts of the road network, including the spatial keyword object o, the road network graph G, and the shortest transit time function τ*(t) between two vertices;
2) using the shortest transit time function τ*(t) obtained in step 1) to measure the spatial proximity of the spatial keyword object o, obtaining a spatial score function denoted f_s(o); using a TF-IDF model to measure the text similarity of the spatial keyword object o, obtaining a text score function denoted f_d(o); weighting the text score function f_d(o) and the spatial score function f_s(o) to obtain a spatial text score function, denoted
Figure BDA0003583053460000022
3) constructing a road network index TKG-tree by using the edge and vertex information contained in the road network defined in step 1); the TKG-tree is a balanced tree structure obtained by cutting the graph, whose root node corresponds to the whole road network graph G, and it has the following properties: each descendant node n_i of the root node corresponds to a subgraph G_i, and the nodes at the lowest level are called leaf nodes; the TKG-tree has two hyperparameters f and μ, which respectively denote the number of branches of each non-leaf node and the upper limit on the number of vertices in a leaf node; each node n_i contains a transit time matrix M_i and an inverted document D_i in which the object keyword information o.key is recorded;
4) using the spatial text score function obtained in step 2)
Figure BDA0003583053460000021
as the indicator for ranking the Top-K results and, starting from the query point v_q, finding the Top-K ranked spatial keyword objects o by using the road network index TKG-tree constructed in step 3).
Further, in step 1), the following concepts are defined:
a. among the edges of the road network, the length of an edge is expressed by its transit time, and the transit time of an edge varies with time t, so the transit time function of an edge is denoted ω(t); among the road network vertices, some vertices carry text information in the form of keywords, and a vertex that has both spatial position information and text keyword information is called a spatial keyword object o, expressed as:
o = (v_o, o.key)
in the formula, v_o represents the vertex position of the object, and o.key represents the keyword information of the object;
b. modeling the road network as an undirected graph, expressed as:
G = (V, E, W)
in the formula, G is the road network graph and represents an undirected road network structure formed by a number of intersecting edges; v_1 denotes the 1st vertex, v_n denotes the n-th vertex, and V = {v_1, v_2, ..., v_n} represents the set of all vertices in the road network; e_1 denotes the 1st edge, e_n denotes the n-th edge, and E = {e_1, e_2, ..., e_n} represents the set of edges in the road network; ω_1(t) represents the transit time function of the 1st edge, ω_n(t) represents the transit time function of the n-th edge, and W = {ω_1(t), ω_2(t), ..., ω_n(t)} represents the set of transit time functions associated with the corresponding edges;
c. in the road network, a path ρ from a start point to an end point is represented by a sequence of adjacent, connected vertices <v_i, ..., v_j>, where v_i denotes the i-th vertex and v_j denotes the j-th vertex, and P denotes the set of all paths from the start point to the end point; denoting the transit time function of path ρ as τ(t), the shortest transit time function τ*(t) from the start point to the end point is defined as follows:
τ*(t) = min{τ(t) | ρ ∈ P}
in the formula, τ(t) represents the transit time function of path ρ when departing at time t; τ*(t) represents the shortest transit time over all paths from the start point to the end point when departing at time t.
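The definition of the shortest transit time can be illustrated with a minimal Python sketch. It is not the patent's procedure: it assumes a Dijkstra-style search adapted to time-dependent edges, which is only valid if every edge is FIFO (departing later never means arriving earlier), a property the text does not state. The function name, the adjacency layout, and the example network are all hypothetical.

```python
import heapq

def shortest_transit_time(adj, source, target, t0):
    """Return the shortest transit time from source to target when departing at t0.

    adj maps each vertex to a list of (neighbor, omega) pairs, where omega(t)
    gives the transit time of that edge when it is entered at time t.
    Assumption: edges are FIFO, so leaving later never lets you arrive earlier.
    """
    earliest = {source: t0}            # earliest known arrival time per vertex
    heap = [(t0, source)]
    while heap:
        t, v = heapq.heappop(heap)
        if v == target:
            return t - t0
        if t > earliest.get(v, float("inf")):
            continue                   # stale heap entry
        for u, omega in adj.get(v, []):
            arrival = t + omega(t)
            if arrival < earliest.get(u, float("inf")):
                earliest[u] = arrival
                heapq.heappush(heap, (arrival, u))
    return float("inf")                # target is not reachable


def congested(t):
    # Hypothetical edge that is slow early on and settles at 3 time units.
    # Arrival time t + max(3, 20 - t) never decreases with t, so it is FIFO.
    return max(3, 20 - t)

def constant(c):
    return lambda t: c

# Hypothetical undirected road network: every edge is listed in both directions.
adj = {
    "a": [("b", constant(5)), ("c", congested)],
    "b": [("a", constant(5)), ("c", constant(5))],
    "c": [("a", congested), ("b", constant(5))],
}
```

Calling `shortest_transit_time(adj, "a", "c", t0)` with different departure times shows why the result is a function of t: early on the detour through `b` is faster, while later the direct edge wins.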
Further, the step 2) comprises the following steps:
2.1) defining the query parameters query of a Top-K spatial keyword query in road network space:
query = <v_q, q.key, t, k>
in the formula, v_q represents the query point, q.key represents the keyword group of the query, t represents the query time, and k represents the number of spatial keyword objects o to return;
2.2) using the shortest transit time function τ*(t) to measure the spatial proximity between the spatial keyword object o and the query point v_q, the spatial score function f_s(o) is defined as follows:
Figure BDA0003583053460000041
in the formula,
Figure BDA0003583053460000042
represents the shortest transit time between the vertex v_o at which the spatial keyword object o is located and the query point v_q;
2.3) using the TF-IDF model to measure the text similarity between the spatial keyword object o and the keyword group q.key of the query, the text score function f_d(o) is defined as follows:
Figure BDA0003583053460000043
in the formula, key and q.key respectively represent a keyword and the keyword group of the query, tf(key) represents the frequency with which the keyword key occurs in the spatial keyword object o, and idf(key) represents the inverse of the frequency with which the keyword key occurs across all spatial keyword objects o; the importance of a keyword increases in proportion to the number of times it appears in a spatial keyword object o, and decreases in inverse proportion to the frequency with which it appears across all spatial keyword objects o;
2.4) using the text score function f_d(o) and the spatial score function f_s(o), the spatial text score function
Figure BDA0003583053460000046
is obtained; the calculation formula is as follows:
Figure BDA0003583053460000044
in the formula, α represents a weight factor, and
Figure BDA0003583053460000045
represents the spatial text score function, which is obtained by a weighted calculation of the spatial score function f_s(o) and the text score function f_d(o).
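The scoring in steps 2.2) to 2.4) can be sketched in Python as follows. The exact formulas are given only as figures that are not reproduced in this text, so every formula below is an assumption chosen for illustration: the spatial score is taken as one minus the transit time normalized by a hypothetical cap `max_time`, the text score is a sum of tf times log-idf over the query keywords, and the combined score is a convex combination with weight α. All function and parameter names are hypothetical.

```python
import math
from collections import Counter

def text_score(object_keys, query_keys, doc_freq, n_objects):
    """Assumed TF-IDF text score: sum of tf(key) * idf(key) over query keywords.

    object_keys: list of keywords carried by the object (o.key)
    query_keys:  keywords of the query (q.key)
    doc_freq:    keyword -> number of objects that carry it
    n_objects:   total number of spatial keyword objects
    """
    counts = Counter(object_keys)
    total = len(object_keys)
    score = 0.0
    for key in query_keys:
        if counts[key] == 0:
            continue                                   # keyword absent from the object
        tf = counts[key] / total                       # frequency inside the object
        idf = math.log(n_objects / doc_freq[key])      # rarer keywords weigh more
        score += tf * idf
    return score

def spatial_score(transit_time, max_time):
    """Assumed spatial score in [0, 1]: shorter transit time gives a higher score."""
    return 1.0 - min(transit_time, max_time) / max_time

def spatial_text_score(f_s, f_d, alpha):
    """Assumed weighting: alpha on the spatial part, (1 - alpha) on the text part."""
    return alpha * f_s + (1.0 - alpha) * f_d
```

With α close to 1 the ranking is driven mostly by transit time, and with α close to 0 mostly by keyword match. Which term the patent attaches α to cannot be read from this text, so the split above is a guess.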
Further, in step 3), the road network index TKG-tree is constructed by graph cutting: first, the entire road network graph G is taken as the root node; G is then cut into f subgraphs of equal size, which become the child nodes of the root node; the subgraphs are then cut recursively until the number of vertices contained in each final subgraph does not exceed μ; according to the topology of the road network, the Floyd algorithm is run to obtain the transit time matrix M_i of each node n_i, where M_i records the shortest transit time function τ*(t) between the vertices within the subgraph G_i corresponding to node n_i; the keyword information o.key of all spatial keyword objects o in the subgraph G_i corresponding to node n_i is collected to obtain the inverted document D_i of node n_i.
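The construction in step 3) can be sketched as follows. This is a simplified illustration, not the patent's implementation: edge weights are constant numbers instead of time-dependent functions, and the "cut" simply splits the vertex list into chunks, where a real system would use a graph partitioning algorithm. The names `Node`, `floyd`, and `build` are hypothetical.

```python
from dataclasses import dataclass, field

INF = float("inf")

@dataclass
class Node:
    vertices: list      # vertices of the subgraph G_i
    matrix: dict        # M_i: (u, v) -> shortest transit time inside G_i
    inverted: dict      # D_i: keyword -> set of vertices carrying it
    children: list = field(default_factory=list)

def floyd(vertices, weights):
    """Floyd-Warshall restricted to the edges whose endpoints are both in vertices."""
    dist = {(u, v): (0.0 if u == v else INF) for u in vertices for v in vertices}
    inside = set(vertices)
    for (u, v), w in weights.items():
        if u in inside and v in inside:          # undirected edge
            dist[u, v] = min(dist[u, v], w)
            dist[v, u] = min(dist[v, u], w)
    for m in vertices:
        for u in vertices:
            for v in vertices:
                if dist[u, m] + dist[m, v] < dist[u, v]:
                    dist[u, v] = dist[u, m] + dist[m, v]
    return dist

def build(vertices, weights, keywords, f, mu):
    """Recursively build a tree node; leaves hold at most mu vertices."""
    if f < 2 or mu < 1:
        raise ValueError("need f >= 2 and mu >= 1")
    inverted = {}
    for v in vertices:
        for key in keywords.get(v, ()):
            inverted.setdefault(key, set()).add(v)
    node = Node(vertices, floyd(vertices, weights), inverted)
    if len(vertices) > mu:
        size = -(-len(vertices) // f)            # ceiling division: chunk size
        for i in range(0, len(vertices), size):  # yields at most f chunks
            node.children.append(build(vertices[i:i + size], weights, keywords, f, mu))
    return node

# Hypothetical example: a path of 8 vertices with unit transit times.
weights = {(i, i + 1): 1 for i in range(7)}
keywords = {1: ["cafe"], 6: ["cafe", "bank"]}
root = build(list(range(8)), weights, keywords, f=2, mu=4)
```

Storing one matrix per tree node keeps each matrix small compared with a single all-pairs matrix over the whole graph, which is the storage saving the text attributes to dividing the graph into subgraphs.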
Further, in step 4), after the query parameters are input, the spatial text score function
Figure BDA0003583053460000051
is used as the indicator for ranking the Top-K results during the search; a maximum priority queue Q and a result set R are defined; the search starts on the subgraph in which the query point v_q is located, and all spatial keyword objects o in that subgraph are added to the priority queue Q, using the spatial text score function
Figure BDA0003583053460000052
for sorting; n_x records the highest-level node currently visited on the TKG-tree, and the current search range is the subgraph G_x corresponding to that node; at the same time, the spatial text score of the node
Figure BDA0003583053460000053
is defined as the upper bound used for pruning:
Figure BDA0003583053460000054
in the formula, f_s(node) corresponds to the shortest transit time from the query point v_q to the current subgraph G_x and is calculated from the transit time matrix M_i of the road network index TKG-tree;
Figure BDA0003583053460000055
represents the maximum text score that a spatial keyword object o can reach and is calculated from the inverted document D_i of the road network index TKG-tree; the spatial keyword objects o in queue Q are dequeued in turn, and those whose spatial text score is higher than the upper bound
Figure BDA0003583053460000056
are added to the result set R; if queue Q is empty and the result set R contains fewer than k objects, n_x is updated to its parent node, the current search range is expanded to the subgraph corresponding to the updated n_x, and at the same time
Figure BDA0003583053460000057
is updated to the spatial text score of the updated node n_x; this operation is repeated until the result set R contains k results, at which point the algorithm terminates; the maximum priority queue Q and the upper bound
Figure BDA0003583053460000058
ensure that the results found from the query point v_q, from the first to the k-th, are globally optimal, so that the Top-K spatial keyword objects o are found correctly.
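The bottom-up search with upper-bound pruning can be sketched as below. This is a simplified illustration under stated assumptions: the per-level inputs are assumed to be precomputed, so the sketch shows only the queue and pruning logic, not how scores and bounds are derived from M_i and D_i. It also moves to the parent node as soon as no queued object reaches the bound, which is a slight restatement of the "queue is empty" condition in the text. The function name and the `levels` layout are hypothetical.

```python
import heapq

def top_k_bottom_up(levels, k):
    """Collect the k best objects while climbing from a leaf toward the root.

    levels is ordered from the leaf containing the query point up to the root.
    Each entry is (new_objects, outside_bound):
      new_objects:   list of (score, object_id) first covered at this level
      outside_bound: upper bound on the score of any object not yet covered
                     (use float("-inf") at the root, where nothing is outside)
    """
    queue = []        # max-priority queue, stored as a min-heap of negated scores
    results = []
    for new_objects, outside_bound in levels:
        for score, obj in new_objects:
            heapq.heappush(queue, (-score, obj))
        # An object is final once no uncovered object can possibly beat it.
        while queue and len(results) < k and -queue[0][0] >= outside_bound:
            neg_score, obj = heapq.heappop(queue)
            results.append((obj, -neg_score))
        if len(results) == k:
            break     # remaining levels are pruned
    return results

# Hypothetical precomputed input: three tree levels, leaf first.
levels = [
    ([(0.5, "a"), (0.9, "b")], 0.7),
    ([(0.8, "c"), (0.6, "d")], 0.55),
    ([(0.3, "e")], float("-inf")),
]
```

In this example a query with k = 3 finishes at the second level and never touches the root's remaining objects, which is the pruning effect the upper bound is meant to provide. Correctness depends entirely on each `outside_bound` being a true upper bound.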
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The method accounts for the influence of time on road network distance in Top-K queries, which gives it greater practical value.
2. Compared with other Top-K query methods in road network space, the method has a shorter preprocessing time and a lower storage cost.
3. The method achieves millisecond-level query response times on data sets of different scales, and has good generality and scalability.
Drawings
FIG. 1 is a schematic logic flow diagram of the method of the present invention.
FIG. 2 is a diagram of a road network index TKG-tree used in the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Referring to fig. 1, the embodiment provides a Top-K space keyword query method based on a road network index, which specifically includes the following steps:
1) The spatial keyword object, the road network graph, and the shortest transit time are defined:
a. among the edges of the road network, the length of an edge is expressed by its transit time, and the transit time of an edge varies with time t, so the transit time function of an edge is denoted ω(t); among the road network vertices, some vertices carry text information in the form of keywords, and a vertex that has both spatial position information and text keyword information is called a spatial keyword object o, expressed as:
o = (v_o, o.key)
in the formula, v_o represents the vertex position of the object, and o.key represents the keyword information of the object;
b. modeling the road network as an undirected graph, expressed as:
G = (V, E, W)
in the formula, G is the road network graph and represents an undirected road network structure formed by a number of intersecting edges; v_1 denotes the 1st vertex, v_n denotes the n-th vertex, and V = {v_1, v_2, ..., v_n} represents the set of all vertices in the road network; e_1 denotes the 1st edge, e_n denotes the n-th edge, and E = {e_1, e_2, ..., e_n} represents the set of edges in the road network; ω_1(t) represents the transit time function of the 1st edge, ω_n(t) represents the transit time function of the n-th edge, and W = {ω_1(t), ω_2(t), ..., ω_n(t)} represents the set of transit time functions associated with the corresponding edges;
c. in the road network, a path ρ from a start point to an end point can be represented by a sequence of adjacent, connected vertices <v_i, ..., v_j>, where v_i denotes the i-th vertex and v_j denotes the j-th vertex, and P denotes the set of all paths from the start point to the end point. Denoting the transit time function of path ρ as τ(t), the shortest transit time function τ*(t) from the start point to the end point is defined as follows:
τ*(t) = min{τ(t) | ρ ∈ P}
in the formula, τ(t) represents the transit time function of path ρ when departing at time t; τ*(t) represents the shortest transit time over all paths from the start point to the end point when departing at time t.
2) The shortest transit time function τ*(t) obtained in step 1) is used to measure the spatial proximity of the spatial keyword object o; a TF-IDF model is used to measure the text similarity of the spatial keyword object o; the spatial text score function is obtained by a weighted calculation of the text score function and the spatial score function, which comprises the following steps:
2.1) defining the query parameters query of a Top-K spatial keyword query in road network space:
query = <v_q, q.key, t, k>
in the formula, v_q represents the query point, q.key represents the keyword group of the query, t represents the query time, and k represents the number of spatial keyword objects o to return;
2.2) using the shortest transit time function τ*(t) to measure the spatial proximity between the spatial keyword object o and the query point v_q, the spatial score function f_s(o) is defined as follows:
Figure BDA0003583053460000071
in the formula,
Figure BDA0003583053460000072
represents the shortest transit time between the vertex v_o at which the spatial keyword object o is located and the query point v_q.
2.3) using the TF-IDF model to measure the text similarity between the spatial keyword object o and the keyword group q.key of the query, the text score function f_d(o) is defined as follows:
Figure BDA0003583053460000081
in the formula, key and q.key respectively represent a keyword and the keyword group of the query, tf(key) represents the frequency with which the keyword key occurs in the spatial keyword object o, and idf(key) represents the inverse of the frequency with which the keyword key occurs across all spatial keyword objects o; the importance of a keyword increases in proportion to the number of times it appears in a spatial keyword object o, and decreases in inverse proportion to the frequency with which it appears across all spatial keyword objects o.
2.4) using the text score function f_d(o) and the spatial score function f_s(o), the spatial text score function
Figure BDA0003583053460000082
is obtained; the calculation formula is as follows:
Figure BDA0003583053460000083
in the formula, α represents a weight factor, and
Figure BDA0003583053460000084
represents the spatial text score function, which is obtained by a weighted calculation of the spatial score function f_s(o) and the text score function f_d(o).
3) The road network index TKG-tree is constructed by using the edge and vertex information contained in the road network defined in step 1); the TKG-tree is constructed by graph cutting and, as shown in FIG. 2, the number of branches f of a non-leaf node is set to 2 and the maximum number of vertices μ of a leaf node is set to 4; first, the entire road network graph G is taken as the root node; G is then cut into f subgraphs of equal size, which become the child nodes of the root node; the subgraphs are then cut recursively until the number of vertices contained in each final subgraph does not exceed μ; according to the topology of the road network, the Floyd algorithm is run to obtain the transit time matrix M_i of each node n_i, where M_i records the shortest transit time function τ*(t) between the vertices within the subgraph G_i corresponding to node n_i; the keyword information o.key of all spatial keyword objects o in the subgraph G_i corresponding to node n_i is collected to obtain the inverted document D_i of node n_i.
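As a rough guide to how the two hyperparameters shape the tree, the following worked relation assumes an idealized cut in which every split yields exactly f subgraphs of equal size and |V| > μ. The number of cut levels d below the root then satisfies:

```latex
\frac{|V|}{f^{d}} \le \mu
\quad\Longleftrightarrow\quad
d \ge \log_f \frac{|V|}{\mu},
\qquad\text{so}\qquad
d = \left\lceil \log_f \frac{|V|}{\mu} \right\rceil .
```

With f = 2 and μ = 4 as in this embodiment, a hypothetical road network of |V| = 16 vertices would need d = ⌈log_2 4⌉ = 2 levels of cutting, giving 4 leaf nodes of 4 vertices each. The vertex count 16 is chosen for illustration only and is not taken from FIG. 2.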
4) The spatial text score function obtained in step 2)
Figure BDA0003583053460000091
is used as the indicator for ranking the Top-K results, and, starting from the query point v_q, the Top-K ranked spatial keyword objects are found by using the road network index TKG-tree constructed in step 3); a maximum priority queue Q and a result set R are defined; the search starts on the subgraph in which the query point v_q is located, and all spatial keyword objects o in that subgraph are added to the priority queue Q, using the spatial text score function
Figure BDA0003583053460000092
for sorting; n_x records the highest-level node currently visited on the TKG-tree, and the current search range is the subgraph G_x corresponding to that node; at the same time, the spatial text score of the node
Figure BDA0003583053460000093
is defined as the upper bound used for pruning:
Figure BDA0003583053460000094
in the formula, f_s(node) corresponds to the shortest transit time from the query point v_q to the current subgraph G_x and is calculated from the transit time matrix M_i of the road network index TKG-tree;
Figure BDA0003583053460000095
represents the maximum text score that a spatial keyword object o can reach and is calculated from the inverted document D_i of the road network index TKG-tree; the spatial keyword objects o in queue Q are dequeued in turn, and those whose spatial text score is higher than the upper bound
Figure BDA0003583053460000096
are added to the result set R; if queue Q is empty and the result set R contains fewer than k objects, n_x is updated to its parent node, the current search range is expanded to the subgraph corresponding to the updated n_x, and at the same time
Figure BDA0003583053460000097
is updated to the spatial text score of the updated node n_x; this operation is repeated until the Top-K spatial keyword objects o are correctly found.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (5)

1. The Top-K space keyword query method based on the road network index is characterized by comprising the following steps of:
1) defining the related concepts of the road network, including the spatial keyword object o, the road network graph G, and the shortest transit time function τ*(t) between two vertices;
2) using the shortest transit time function τ*(t) obtained in step 1) to measure the spatial proximity of the spatial keyword object o, obtaining a spatial score function denoted f_s(o); using a TF-IDF model to measure the text similarity of the spatial keyword object o, obtaining a text score function denoted f_d(o); weighting the text score function f_d(o) and the spatial score function f_s(o) to obtain a spatial text score function, denoted
Figure FDA0003583053450000012
3) constructing a road network index TKG-tree by using the edge and vertex information contained in the road network defined in step 1); the TKG-tree is a balanced tree structure obtained by cutting the graph, whose root node corresponds to the whole road network graph G, and it has the following properties: each descendant node n_i of the root node corresponds to a subgraph G_i, and the nodes at the lowest level are called leaf nodes; the TKG-tree has two hyperparameters f and μ, which respectively denote the number of branches of each non-leaf node and the upper limit on the number of vertices in a leaf node; each node n_i contains a transit time matrix M_i and an inverted document D_i in which the object keyword information o.key is recorded;
4) using the spatial text score function obtained in step 2)
Figure FDA0003583053450000011
as the indicator for ranking the Top-K results and, starting from the query point v_q, finding the Top-K ranked spatial keyword objects o by using the road network index TKG-tree constructed in step 3).
2. The method for querying Top-K spatial keywords based on the road network index as claimed in claim 1, wherein in step 1), the following concepts are defined:
a. among the edges of the road network, the length of an edge is expressed by its transit time, and the transit time of an edge varies with time t, so the transit time function of an edge is denoted ω(t); among the road network vertices, some vertices carry text information in the form of keywords, and a vertex that has both spatial position information and text keyword information is called a spatial keyword object o, expressed as:
o = (v_o, o.key)
in the formula, v_o represents the vertex position of the object, and o.key represents the keyword information of the object;
b. modeling the road network as an undirected graph, expressed as:
G = (V, E, W)
in the formula, G is the road network graph and represents an undirected road network structure formed by a number of intersecting edges; v_1 denotes the 1st vertex, v_n denotes the n-th vertex, and V = {v_1, v_2, ..., v_n} represents the set of all vertices in the road network; e_1 denotes the 1st edge, e_n denotes the n-th edge, and E = {e_1, e_2, ..., e_n} represents the set of edges in the road network; ω_1(t) represents the transit time function of the 1st edge, ω_n(t) represents the transit time function of the n-th edge, and W = {ω_1(t), ω_2(t), ..., ω_n(t)} represents the set of transit time functions associated with the corresponding edges;
c. in the road network, a path ρ from a start point to an end point is represented by a sequence of adjacent, connected vertices <v_i, ..., v_j>, where v_i denotes the i-th vertex and v_j denotes the j-th vertex, and P denotes the set of all paths from the start point to the end point; denoting the transit time function of path ρ as τ(t), the shortest transit time function τ*(t) from the start point to the end point is defined as follows:
τ*(t) = min{τ(t) | ρ ∈ P}
wherein τ(t) represents the transit time function of path ρ when departing at time t; τ*(t) represents the shortest transit time over all paths from the start point to the end point when departing at time t.
3. The method for querying Top-K spatial keywords based on the road network index according to claim 1, wherein the step 2) comprises the following steps:
2.1) defining the query parameters query of a Top-K spatial keyword query in road network space:
query = <v_q, q.key, t, k>
in the formula, v_q represents the query point, q.key represents the keyword group of the query, t represents the query time, and k represents the number of spatial keyword objects o to return;
2.2) using the shortest transit time function τ*(t) to measure the spatial proximity between the spatial keyword object o and the query point v_q, the spatial score function f_s(o) is defined as follows:
Figure FDA0003583053450000031
in the formula,
Figure FDA0003583053450000032
represents the shortest transit time between the vertex v_o at which the spatial keyword object o is located and the query point v_q;
2.3) using the TF-IDF model to measure the text similarity between the spatial keyword object o and the keyword group q.key of the query, the text score function f_d(o) is defined as follows:
Figure FDA0003583053450000033
in the formula, key and q.key respectively represent a keyword and the keyword group of the query, tf(key) represents the frequency with which the keyword key occurs in the spatial keyword object o, and idf(key) represents the inverse of the frequency with which the keyword key occurs across all spatial keyword objects o; the importance of a keyword increases in proportion to the number of times it appears in a spatial keyword object o, and decreases in inverse proportion to the frequency with which it appears across all spatial keyword objects o;
2.4) using the text score function f_d(o) and the spatial score function f_s(o), the spatial text score function
Figure FDA0003583053450000034
is obtained; the calculation formula is as follows:
Figure FDA0003583053450000035
in the formula, α represents a weight factor, and
Figure FDA0003583053450000036
represents the spatial text score function, which is obtained by a weighted calculation of the spatial score function f_s(o) and the text score function f_d(o).
4. The method for querying Top-K spatial keywords based on the road network index as claimed in claim 1, wherein: in step 3), the road network index TKG-tree is constructed by graph cutting: first, the entire road network graph G is taken as the root node; G is then cut into f subgraphs of equal size, which become the child nodes of the root node; the subgraphs are then cut recursively until the number of vertices contained in each final subgraph does not exceed μ; according to the topology of the road network, the Floyd algorithm is run to obtain the transit time matrix M_i of each node n_i, where M_i records the shortest transit time function τ*(t) between the vertices within the subgraph G_i corresponding to node n_i; the keyword information o.key of all spatial keyword objects o in the subgraph G_i corresponding to node n_i is collected to obtain the inverted document D_i of node n_i.
5. The Top-K spatial keyword query method based on the road network index as claimed in claim 1, wherein: in step 4), after the query parameters are input, the spatial text score function (formula image FDA0003583053450000041) is used as the criterion for ranking the Top-K results during the search. A maximum priority queue Q and a result set R are defined. The search starts from the subgraph in which the query point v_q lies: all spatial keyword objects o in that subgraph are added to the priority queue Q and sorted by the spatial text score function (formula image FDA0003583053450000042). n_x records the highest-level node currently visited on the TKG-tree, and the current search range is the subgraph G_x corresponding to that node. At the same time, the spatial text score of the node (formula image FDA0003583053450000043) is defined as an upper bound for pruning:
[formula image FDA0003583053450000044]
in the formula, f_s(node) denotes the shortest transit time from the query point v_q to the current subgraph G_x, computed from the transit time matrix M_i of the road network index TKG-tree; the term shown in formula image FDA0003583053450000045 denotes the maximum text score that a spatial keyword object o can reach, computed from the inverted document D_i of the road network index TKG-tree. The spatial keyword objects o in queue Q are dequeued in order, and those whose spatial text score is higher than the upper bound (formula image FDA0003583053450000046) are added to the result set R. If queue Q is empty and the result set R contains fewer than k objects, n_x is updated to its parent node, the current search range is expanded to the subgraph corresponding to the updated n_x, and the upper bound (formula image FDA0003583053450000047) is updated to the spatial text score of the updated n_x. This operation is repeated until the result set R contains k results, at which point the algorithm terminates. The maximum priority queue Q and the upper bound (formula image FDA0003583053450000048) ensure that the first through k-th results found from the query point v_q are globally optimal, so that the Top-K spatial keyword objects o are found correctly.
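The bottom-up search in claim 5 can be sketched in Python as below. This is one possible reading, not the claimed implementation: the scoring function and the pruning bound are passed in by the caller because their formulas are only available as images, the bound is interpreted as an upper limit on the score of any object outside the current subgraph, and objects that do not beat the bound stay in the queue until the search range is expanded. The names `Node`, `top_k`, `score`, and `outside_bound` are hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Node:
    objects: list                   # ids of all objects in this node's subgraph
    parent: "Node | None" = None


def top_k(leaf, k, score, outside_bound):
    """Return up to k (object, score) pairs in descending score order.

    leaf:          tree node whose subgraph contains the query point
    score(o):      spatial text score of object o, higher is better
    outside_bound: function of a node giving an upper bound on the score of
                   any object NOT in that node's subgraph; it should return
                   float("-inf") for the root, where nothing lies outside
    """
    results = []
    heap = []          # max-priority queue, stored as negated scores
    seen = set()
    node = leaf
    while node is not None and len(results) < k:
        # Enqueue the objects that the enlarged search range adds.
        for o in node.objects:
            if o not in seen:
                seen.add(o)
                heapq.heappush(heap, (-score(o), o))
        bound = outside_bound(node)
        # An object is final once nothing outside the range can beat it.
        while heap and len(results) < k and -heap[0][0] >= bound:
            neg, o = heapq.heappop(heap)
            results.append((o, -neg))
        node = node.parent          # expand to the parent's subgraph
    return results
```

With a valid bound, every object moved into the result list scores at least as high as anything still unvisited, which is why the list comes out in globally descending order.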

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210356274.9A CN114896480A (en) 2022-04-06 2022-04-06 Top-K space keyword query method based on road network index

Publications (1)

Publication Number Publication Date
CN114896480A (en) 2022-08-12

Family

ID=82714665

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210356274.9A Pending CN114896480A (en) 2022-04-06 2022-04-06 Top-K space keyword query method based on road network index

Country Status (1)

Country Link
CN (1) CN114896480A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination