CN112948717B - Massive space POI searching method and system based on multi-factor constraint - Google Patents
- Publication number
- CN112948717B CN202110519532.6A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F16/9537 - Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
- G06F16/9014 - Indexing; Data structures therefor; Storage structures hash tables
- G06F16/90335 - Query processing
- G06F16/906 - Clustering; Classification
Abstract
The invention relates to a method and system for searching massive spatial POIs under multi-factor constraints. The method considers the distance, the direction and the key attributes of a POI relative to the user, and recommends to the user the POI that lies within the direction range specified by the user, satisfies the keyword specified by the user, and is closest to the user. The invention comprises five stages: 1) a spatial POI crawling stage; 2) a spatial POI keyword encoding stage; 3) an index establishing stage; 4) a user query requirement input stage; and 5) a query processing stage. The method takes into account the actual requirements of users of spatial location services, namely their requirements on the distance, direction and keywords of a POI. The invention designs a direction-, location- and keyword-aware indexing mechanism to organize massive spatial POIs; based on this indexing mechanism, the efficiency and accuracy of POI searching can be improved, and the method has wide practical application value.
Description
Technical Field
The invention relates to the field of location-based spatial queries, and in particular to a method and system for searching massive spatial POIs based on multi-factor constraints.
Background
In a conventional location-based service, the Point of Interest (POI) closest to the user is always recommended to the user. However, a POI has not only a distance attribute relative to the user's location but also a direction attribute relative to the user. For example, when the user is moving in a certain direction, the user may prefer POIs that are close and lie in the direction of movement, while POIs that are closer but do not lie in the direction of movement will not meet the user's requirements. Existing search schemes for massive spatial POIs ignore this direction attribute and therefore have low accuracy.
Disclosure of Invention
The invention aims to provide a method and system for searching massive spatial POIs based on multi-factor constraints, so as to improve the accuracy of POI recommendation.
In order to achieve the purpose, the invention provides the following scheme:
a massive space POI searching method based on multi-factor constraint comprises the following steps:
acquiring space POI data of a target area to obtain a space POI set; the space POI set comprises all POIs, each POI is identified by an attribute, and the attribute comprises a name, a type and longitude and latitude coordinates;
constructing a hash list for the POIs in the POI set; each POI corresponds to a key-value pair in the hash list, the key-value pair being a keyword-code pair in which the keyword is the type of the POI and the code is a numerical value corresponding to that type; identical keywords in the hash list have identical codes, and different keywords have different codes;
performing hierarchical iterative clustering on all POIs of the POI set by a hierarchical spatial clustering method, subject to the rule that the number of elements in each cluster is not greater than a set threshold, to obtain the clusters of each layer; the clusters of the (i+1)-th layer are obtained by spatially clustering the clusters of the i-th layer; each cluster of each layer comprises a plurality of elements, the elements of a cluster in the (i+1)-th layer being clusters of the i-th layer, and the elements of a cluster in the 1st layer being POIs;
determining the attributes of each cluster in each layer; the attributes of a cluster comprise the minimum bounding rectangle of the cluster, the angle range corresponding to the cluster and the keyword bag corresponding to the cluster; the minimum bounding rectangle $MBR_j$ of the j-th cluster is the smallest rectangle that encloses all POIs in the j-th cluster; the angle range corresponding to the j-th cluster is the range of angles of $MBR_j$ relative to the lower left vertex of the minimum bounding rectangle R of the POI set; the keyword bag corresponding to the j-th cluster is the set of codes in the key-value pairs corresponding to all POIs in the j-th cluster;
creating, based on the attributes of all clusters in each layer, the nodes of each layer of an index tree, to obtain the node set corresponding to each layer of the index tree; the nodes of the first layer of the index tree are leaf nodes, and the nodes above the first layer are intermediate nodes; the j-th node of the i-th layer of the index tree corresponds to the j-th cluster of the i-th layer;
acquiring a query requirement; the query requirement comprises a keyword, a direction range and a position coordinate;
determining a code corresponding to the keyword of the query requirement according to the keyword of the query requirement and the hash list;
and traversing the node sets of the layers of the index tree from top to bottom according to the code corresponding to the query requirement and the direction range, and determining, among the clusters corresponding to nodes whose keyword bag contains the code and whose angle range satisfies the direction range of the query requirement, the POI closest to the position coordinates as the POI in the target area that meets the query requirement and is closest to the position coordinates.
Optionally, performing hierarchical iterative clustering on all POIs of the POI set by the hierarchical spatial clustering method, subject to the rule that the number of elements in each cluster is not greater than a set threshold, to obtain the clusters of each layer specifically includes:
for the first layer, clustering all POIs of the POI set by a spatial clustering method, subject to the rule that the number of POIs in each cluster is not greater than the set threshold, to obtain $k_1$ first-level clusters;
for the n-th layer, clustering the $k_{n-1}$ level-(n-1) clusters of the (n-1)-th layer by the spatial clustering method, subject to the rule that the number of level-(n-1) clusters in each cluster is not greater than the set threshold, to obtain $k_n$ level-n clusters; n is an integer greater than 1;
determining whether $k_n$ is greater than the set threshold;
when $k_n$ is greater than the set threshold, continuing the hierarchical iterative clustering by the spatial clustering method;
when $k_n$ is not greater than the set threshold, ending the clustering to obtain the clusters of each layer.
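By way of non-limiting illustration, the layer-by-layer clustering loop above might be sketched in Python as follows. The text does not name a particular spatial clustering method, so `cluster_once` is a deliberately simple stand-in (sort by coordinates and cut into groups of at most `m`); the function names, the use of centroids to represent a cluster in the next layer, and the parameter `m` for the set threshold are all assumptions made for this sketch.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def centroid(points: Sequence[Point]) -> Point:
    """Mean position of a group of points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def cluster_once(centers: List[Point], m: int) -> List[List[int]]:
    """Stand-in spatial clustering (an assumption): sort the elements by
    (x, y) and cut the order into consecutive groups of at most m."""
    order = sorted(range(len(centers)), key=lambda i: centers[i])
    return [order[s:s + m] for s in range(0, len(order), m)]


def build_layers(pois: List[Point], m: int) -> List[List[List[int]]]:
    """layers[0] groups POI indices; layers[n] groups the indices of the
    clusters in layers[n - 1]."""
    if m < 2:
        raise ValueError("m must be at least 2 so that each layer shrinks")
    layers: List[List[List[int]]] = []
    centers = list(pois)
    while True:
        groups = cluster_once(centers, m)
        layers.append(groups)
        if len(groups) <= m:  # k_n is not greater than the threshold: stop
            return layers
        # Each cluster becomes one element of the next layer,
        # represented here by its centroid.
        centers = [centroid([centers[i] for i in g]) for g in groups]
```

With 20 POIs and a threshold of 3, for instance, this stand-in yields 7 first-level clusters and then 3 second-level clusters, at which point the loop stops.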
Optionally, determining the attributes of each cluster in each layer specifically includes:
determining the minimum bounding rectangle R of the POI set; the minimum bounding rectangle R is the smallest rectangle that encloses all POIs in the POI set, $R = (X_{min}, X_{max}, Y_{min}, Y_{max})$, where $(X_{min}, Y_{min})$ are the coordinates of the lower left vertex of R and $(X_{max}, Y_{max})$ are the coordinates of the upper right vertex of R;
for the j-th cluster, determining the minimum bounding rectangle $MBR_j$ of the j-th cluster;
computing the angle of each vertex of $MBR_j$ relative to the lower left vertex of the minimum bounding rectangle R, to obtain four relative angles;
determining, from the four relative angles, the angle range $[\alpha_j, \beta_j]$ corresponding to the j-th cluster; $\alpha_j$ is the minimum of the four relative angles, and $\beta_j$ is the maximum of the four relative angles;
and determining, according to the hash list, the code in the key-value pair corresponding to each POI in the j-th cluster, to obtain the keyword bag corresponding to the j-th cluster.
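As a hedged illustration of the angle-range computation, the following Python sketch measures the angle of each of the four vertices of a cluster's minimum bounding rectangle from the lower left vertex of R and keeps the minimum and maximum. The text does not state the angular unit or the reference axis; degrees measured counter-clockwise from the positive x-axis via `math.atan2` are assumed here, and the function name and tuple layout are hypothetical.

```python
import math
from typing import Tuple

# Rectangle layout follows R = (X_min, X_max, Y_min, Y_max).
Rect = Tuple[float, float, float, float]


def angle_range(mbr: Rect, r: Rect) -> Tuple[float, float]:
    """Return (alpha_j, beta_j) for a cluster rectangle mbr, measured from
    the lower left vertex of the overall rectangle r, in degrees."""
    origin_x, origin_y = r[0], r[2]
    x_min, x_max, y_min, y_max = mbr
    vertices = [(x_min, y_min), (x_min, y_max), (x_max, y_min), (x_max, y_max)]
    # Assumption: angles are taken counter-clockwise from the positive x-axis.
    angles = [math.degrees(math.atan2(y - origin_y, x - origin_x))
              for x, y in vertices]
    return min(angles), max(angles)
```

Because every cluster rectangle lies inside R, all four angles fall between 0 and 90 degrees under this convention, and the angle of any POI inside the rectangle lies within the returned range.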
Optionally, determining the code corresponding to the keyword of the query requirement according to the keyword of the query requirement and the hash list specifically includes:
determining whether the hash list contains a key-value pair whose keyword is the keyword of the query requirement;
if the hash list contains no key-value pair whose keyword is the keyword of the query requirement, determining that the query requirement was input incorrectly;
if the hash list contains a key-value pair whose keyword is the keyword of the query requirement, taking the code of that key-value pair as the code corresponding to the keyword of the query requirement.
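A minimal sketch of this lookup, assuming the hash list is held as a Python dictionary from keyword to code; the function name and the choice of raising `ValueError` to signal an incorrect input are assumptions made for illustration only.

```python
from typing import Dict


def encode_query_keyword(hash_list: Dict[str, int], keyword: str) -> int:
    """Return the code of the query keyword, or signal an input error
    when the keyword does not appear in the hash list."""
    if keyword not in hash_list:
        # Assumption: an incorrect query input is reported as an exception.
        raise ValueError(f"unknown query keyword: {keyword!r}")
    return hash_list[keyword]
```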
Optionally, traversing the node sets of the layers of the index tree from top to bottom according to the code corresponding to the query requirement and the direction range, and determining the POI in the target area that meets the query requirement and is closest to the position coordinates, specifically includes:
for the 1st iteration, traversing the node set of the uppermost layer of the index tree, extracting the nodes whose corresponding cluster has a keyword bag containing the code corresponding to the query requirement and an angle range satisfying the direction range of the query requirement, and adding these nodes to a candidate point set;
for the i-th iteration, acquiring the candidate point of the i-th iteration; i is an integer greater than 1, the candidate point of the i-th iteration is the point in the candidate point set closest to the position coordinates of the query requirement, and a point is a node or a POI;
determining the type of the candidate point of the i-th iteration; the types comprise intermediate node, leaf node and POI;
when the candidate point of the i-th iteration is an intermediate node, acquiring the child nodes of the candidate point, and adding to the candidate point set those child nodes whose corresponding cluster has a keyword bag containing the code corresponding to the query requirement and an angle range satisfying the direction range of the query requirement; updating the iteration count and entering the next iteration;
when the candidate point of the i-th iteration is a leaf node, acquiring all POIs in the cluster corresponding to the candidate point, and adding to the candidate point set those POIs whose code is the code corresponding to the query requirement and which lie within the direction range of the query requirement; updating the iteration count and entering the next iteration;
and when the candidate point of the i-th iteration is a POI, determining the candidate point as the POI in the target area that meets the query requirement and is closest to the position coordinates, and ending the iteration.
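The iteration above amounts to a best-first traversal, which might be sketched in Python with a priority queue as the candidate point set. Several points are assumptions of this sketch rather than statements of the text: the distance from the query position to a node is taken as the minimum distance to the node's bounding rectangle; "satisfies the direction range" is read as interval overlap; each POI carries a precomputed `angle` in the same angular reference as the node angle ranges, and the query direction range is expressed in that same reference. The class and function names are hypothetical.

```python
import heapq
import math
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Poi:
    name: str
    code: int
    xy: Tuple[float, float]
    angle: float  # assumed: same angular reference as Node.angle_range


@dataclass
class Node:
    mbr: Tuple[float, float, float, float]  # (x_min, x_max, y_min, y_max)
    angle_range: Tuple[float, float]
    word_set: Set[int]
    children: List["Node"] = field(default_factory=list)  # intermediate node
    pois: List[Poi] = field(default_factory=list)         # leaf node


def min_dist(q: Tuple[float, float], mbr) -> float:
    """Smallest distance from q to any point of the rectangle (assumption)."""
    dx = max(mbr[0] - q[0], 0.0, q[0] - mbr[1])
    dy = max(mbr[2] - q[1], 0.0, q[1] - mbr[3])
    return math.hypot(dx, dy)


def node_matches(node: Node, code: int, direction: Tuple[float, float]) -> bool:
    lo, hi = node.angle_range
    return code in node.word_set and lo <= direction[1] and direction[0] <= hi


def nearest_poi(top_layer: List[Node], q: Tuple[float, float], code: int,
                direction: Tuple[float, float]) -> Optional[Poi]:
    heap: list = []
    counter = 0  # tie-breaker so the heap never compares Node/Poi objects

    def push(dist: float, item) -> None:
        nonlocal counter
        heapq.heappush(heap, (dist, counter, item))
        counter += 1

    # 1st iteration: filter the uppermost layer into the candidate set.
    for node in top_layer:
        if node_matches(node, code, direction):
            push(min_dist(q, node.mbr), node)
    while heap:
        _, _, item = heapq.heappop(heap)  # closest candidate point
        if isinstance(item, Poi):
            return item  # a POI at the front of the queue is the answer
        for child in item.children:       # intermediate node: expand children
            if node_matches(child, code, direction):
                push(min_dist(q, child.mbr), child)
        for poi in item.pois:             # leaf node: expand its POIs
            if poi.code == code and direction[0] <= poi.angle <= direction[1]:
                push(math.dist(q, poi.xy), poi)
    return None  # no POI satisfies the keyword and direction constraints
```

Because the rectangle distance never exceeds the distance to any POI inside the rectangle, the first POI removed from the queue is the nearest qualifying one under these assumptions.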
The invention also provides a massive spatial POI searching system based on multi-factor constraint, comprising:
a spatial POI data acquisition module, configured to acquire spatial POI data of a target area to obtain a spatial POI set; the spatial POI set comprises all POIs, each POI is identified by attributes, and the attributes comprise a name, a type and longitude and latitude coordinates;
a hash list construction module, configured to construct a hash list for the POIs in the POI set; each POI corresponds to a key-value pair in the hash list, the key-value pair being a keyword-code pair in which the keyword is the type of the POI and the code is a numerical value corresponding to that type; identical keywords in the hash list have identical codes, and different keywords have different codes;
a hierarchical spatial clustering module, configured to perform hierarchical iterative clustering on all POIs of the POI set by a hierarchical spatial clustering method, subject to the rule that the number of elements in each cluster is not greater than a set threshold, to obtain the clusters of each layer; the clusters of the (i+1)-th layer are obtained by spatially clustering the clusters of the i-th layer; each cluster of each layer comprises a plurality of elements, the elements of a cluster in the (i+1)-th layer being clusters of the i-th layer, and the elements of a cluster in the 1st layer being POIs;
a cluster attribute determining module, configured to determine the attributes of each cluster in each layer; the attributes of a cluster comprise the minimum bounding rectangle of the cluster, the angle range corresponding to the cluster and the keyword bag corresponding to the cluster; the minimum bounding rectangle $MBR_j$ of the j-th cluster is the smallest rectangle that encloses all POIs in the j-th cluster; the angle range corresponding to the j-th cluster is the range of angles of $MBR_j$ relative to the lower left vertex of the minimum bounding rectangle R of the POI set; the keyword bag corresponding to the j-th cluster is the set of codes in the key-value pairs corresponding to all POIs in the j-th cluster;
a node creating module, configured to create, based on the attributes of all clusters in each layer, the nodes of each layer of an index tree, to obtain the node set corresponding to each layer of the index tree; the nodes of the first layer of the index tree are leaf nodes, and the nodes above the first layer are intermediate nodes; the j-th node of the i-th layer of the index tree corresponds to the j-th cluster of the i-th layer;
a query requirement acquisition module, configured to acquire a query requirement; the query requirement comprises a keyword, a direction range and position coordinates;
a code determining module, configured to determine the code corresponding to the keyword of the query requirement according to the keyword of the query requirement and the hash list;
and a nearest POI determining module, configured to traverse the node sets of the layers of the index tree from top to bottom according to the code corresponding to the query requirement and the direction range, and to determine, among the clusters corresponding to nodes whose keyword bag contains the code and whose angle range satisfies the direction range of the query requirement, the POI closest to the position coordinates as the POI in the target area that meets the query requirement and is closest to the position coordinates.
Optionally, the hierarchical spatial clustering module specifically includes:
a POI clustering unit, configured to, for the first layer, cluster all POIs of the POI set by a spatial clustering method, subject to the rule that the number of POIs in each cluster is not greater than the set threshold, to obtain $k_1$ first-level clusters;
a multi-level clustering unit, configured to, for the n-th layer, cluster the $k_{n-1}$ level-(n-1) clusters of the (n-1)-th layer by the spatial clustering method, subject to the rule that the number of level-(n-1) clusters in each cluster is not greater than the set threshold, to obtain $k_n$ level-n clusters; n is an integer greater than 1;
a judging unit, configured to determine whether $k_n$ is greater than the set threshold;
an iteration unit, configured to continue the hierarchical iterative clustering by the spatial clustering method when $k_n$ is greater than the set threshold;
a clustering end unit, configured to end the clustering and obtain the clusters of each layer when $k_n$ is not greater than the set threshold.
Optionally, the cluster attribute determining module specifically includes:
a POI set minimum bounding rectangle determining unit, configured to determine the minimum bounding rectangle R of the POI set; the minimum bounding rectangle R is the smallest rectangle that encloses all POIs in the POI set, $R = (X_{min}, X_{max}, Y_{min}, Y_{max})$, where $(X_{min}, Y_{min})$ are the coordinates of the lower left vertex of R and $(X_{max}, Y_{max})$ are the coordinates of the upper right vertex of R;
a cluster minimum bounding rectangle determining unit, configured to determine, for the j-th cluster, the minimum bounding rectangle $MBR_j$ of the j-th cluster;
a relative angle calculating unit, configured to compute the angle of each vertex of $MBR_j$ relative to the lower left vertex of the minimum bounding rectangle R, to obtain four relative angles;
an angle range determining unit, configured to determine, from the four relative angles, the angle range $[\alpha_j, \beta_j]$ corresponding to the j-th cluster; $\alpha_j$ is the minimum of the four relative angles, and $\beta_j$ is the maximum of the four relative angles;
and a keyword bag determining unit, configured to determine, according to the hash list, the code in the key-value pair corresponding to each POI in the j-th cluster, to obtain the keyword bag corresponding to the j-th cluster.
Optionally, the code determining module specifically includes:
a key-value pair judging unit, configured to determine whether the hash list contains a key-value pair whose keyword is the keyword of the query requirement;
an input error determining unit, configured to determine that the query requirement was input incorrectly when the hash list contains no key-value pair whose keyword is the keyword of the query requirement;
and a query unit, configured to take, when the hash list contains a key-value pair whose keyword is the keyword of the query requirement, the code of that key-value pair as the code corresponding to the keyword of the query requirement.
Optionally, the nearest POI determining module specifically includes:
a candidate point set initialization unit, configured to, for the 1st iteration, traverse the node set of the uppermost layer of the index tree, extract the nodes whose corresponding cluster has a keyword bag containing the code corresponding to the query requirement and an angle range satisfying the direction range of the query requirement, and add these nodes to a candidate point set;
a candidate point acquisition unit, configured to acquire the candidate point of the i-th iteration; i is an integer greater than 1, the candidate point of the i-th iteration is the point in the candidate point set closest to the position coordinates of the query requirement, and a point is a node or a POI;
a candidate point type determining unit, configured to determine the type of the candidate point of the i-th iteration; the types comprise intermediate node, leaf node and POI;
a child node traversal unit, configured to, when the candidate point of the i-th iteration is an intermediate node, acquire the child nodes of the candidate point and add to the candidate point set those child nodes whose corresponding cluster has a keyword bag containing the code corresponding to the query requirement and an angle range satisfying the direction range of the query requirement; the iteration count is then updated and the next iteration is entered;
a POI traversal unit, configured to, when the candidate point of the i-th iteration is a leaf node, acquire all POIs in the cluster corresponding to the candidate point and add to the candidate point set those POIs whose code is the code corresponding to the query requirement and which lie within the direction range of the query requirement; the iteration count is then updated and the next iteration is entered;
and a nearest POI determining unit, configured to, when the candidate point of the i-th iteration is a POI, determine the candidate point as the POI in the target area that meets the query requirement and is closest to the position coordinates, and end the iteration.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
According to the method, a keyword-code hash list is formed by encoding the type attribute of the spatial POIs in the target area, and a direction-, location- and keyword-aware tree index mechanism is constructed by clustering layer by layer from bottom to top. The POI satisfying the direction constraint, the keyword constraint and the distance constraint is obtained by traversing the tree index mechanism according to the query requirement input by the user, which improves the accuracy of the queried POI; using the tree index mechanism also improves the efficiency and precision of querying massive POI data.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a staged flowchart of the massive spatial POI searching method based on multi-factor constraint according to the present invention;
FIG. 2 is a schematic flow chart of a massive spatial POI searching method based on multi-factor constraint according to the present invention;
FIG. 3 is a schematic diagram of a process for constructing a hash list according to the present invention;
FIG. 4 is an example of a hash list of the present invention;
FIG. 5 is a schematic diagram of a process for constructing an index tree according to the present invention;
FIG. 6 is a diagram illustrating the nodes and cluster attributes in the index tree according to the present invention;
FIG. 7 is a flow chart illustrating a query based on an index tree according to the present invention;
FIG. 8 is a schematic structural diagram of the massive spatial POI searching system based on multi-factor constraint according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In real life, spatial POIs are described by various keywords, such as restaurant, gas station and coffee shop. In a location-based service, the keyword of a POI is another important attribute and one that users care about. The invention therefore takes the actual requirements of the user fully into account and searches for spatial POIs constrained by direction, distance and keyword.
Conventional indexing mechanisms typically build an index based on the spatial location of a POI, but such an index does not satisfy the multi-factor constrained POI queries disclosed herein. Based on the above, the invention discloses a Direction-Location-Keyword-aware indexing mechanism (DLKAI) which is matched with the disclosed query method, so that the query precision is ensured, and the query efficiency is greatly improved. In addition, the indexing mechanism has low maintenance cost and wide practical application value.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
FIG. 1 is a staged flowchart of the massive spatial POI searching method based on multi-factor constraint of the present invention, and FIG. 2 is a schematic flowchart of the same method. With reference to FIG. 1 and FIG. 2, the specific process of the method is as follows:
1) Spatial POI crawling stage:
This stage includes step 100: spatial POI data of a target area are obtained by crawler technology, yielding a POI set composed of all POIs. The target area may be a city or a district of a city. For example, crawler technology is used to crawl the POI data of a target city, such as Shanghai or Beijing, to obtain a spatial POI set S, in which every spatial POI is uniquely identified by attributes such as name, type and longitude and latitude coordinates (location).
2) Spatial POI keyword encoding stage:
This stage includes step 200: a hash list is constructed by building key-value pairs for the POIs in the POI set. Each POI in the POI set S is encoded according to its type, and a "keyword-code" hash list HashList is created, in which the keyword is the type of the POI and the code is a numerical value corresponding to that type. The hash list stores all types to which the POIs relate and assigns a code to each type. Thus, when a user inputs a keyword word in a subsequent query, the hash list can be searched to find the code corresponding to word. The index then needs to store only codes rather than complex character strings, which speeds up comparison and saves storage space.
The specific process of constructing the hash list is as follows:
a1. Initialize the "keyword-code" hash list HashList. Create an empty hash list in which each element is a "key-value" pair: the key is the type of a POI, such as restaurant or coffee shop, and is of string type; the value is an integer and is the code of the POI type. For example, if the type of POI $p_2$ is restaurant and 1 is the code corresponding to the string "restaurant", then $p_2$ corresponds to the "key-value" pair "restaurant-1" in HashList.
b1. Traverse the POI set S, where the current POI is $p_i$ = (name, type, location), i = 0, 1, ....
c1. Determine whether HashList is empty; if so, execute step d1, otherwise execute step e1.
d1. Store the element "$p_i$.type-0" in HashList and continue with step h1. For example, if $p_i$ is a restaurant, $p_i$.type denotes restaurant and "$p_i$.type-0" is "restaurant-0".
e1. Check whether HashList contains an element whose keyword is $p_i$.type; if so, execute step f1, otherwise execute step g1.
f1. If HashList contains an element whose keyword is $p_i$.type, obtain the code corresponding to that keyword and replace $p_i$.type with the code, i.e. $p_i$.type ← code.
Suppose $p_i$.type is restaurant. If the element (key-value pair) "restaurant-7" exists in the hash list HashList, the element whose keyword is restaurant is found, the corresponding code 7 is taken out, and 7 is used as the code of $p_i$.type, giving the type of POI $p_i$ a coded representation. This step ensures that all POIs of the same type are represented by the same code in the hash list.
g1. If HashList contains no element whose keyword is $p_i$.type, store the element "$p_i$.type-(len+1)" in HashList, where len is the length of HashList, and continue with step h1.
Suppose $p_i$.type is restaurant. If no element (key-value pair) whose keyword is "restaurant" exists in the hash list HashList, the number of elements in the current hash list plus 1 is used as the code of restaurant. For example, if the current hash list contains 50 elements, that is, its length is 50, then 51 is used as the code of the keyword "restaurant"; the type of POI $p_i$ is thereby given a coded representation, and the key-value pair corresponding to $p_i$ in the hash list is "restaurant-51". This step ensures that POIs of different types are represented by different codes in the hash list and that the code corresponding to each type is unique.
h1. Check whether i is equal to L, where L is the number of POIs in the POI set.
i1. If i is equal to L, end.
j1. If i is not equal to L, increase i by 1 to check the next POI, and return to step b1.
After traversing each POI in the POI set in turn, a key-value pair of each POI in the hash list can be obtained, and an example of the obtained hash list is shown in fig. 4.
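Steps a1 to j1 can be illustrated, without limitation, by the following Python sketch. It follows the steps as written: the first type encountered receives code 0 (step d1), and each later new type receives len + 1, where len is the current length of the table (step g1). The POI tuple layout, the function name and the separate `encoded` list holding the code of each POI are assumptions of the sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical POI record: (name, type, (longitude, latitude)).
POI = Tuple[str, str, Tuple[float, float]]


def build_hash_list(pois: List[POI]) -> Tuple[Dict[str, int], List[int]]:
    """Return the keyword-to-code table and the encoded type of each POI."""
    hash_list: Dict[str, int] = {}  # step a1: empty "keyword-code" table
    encoded: List[int] = []
    for _name, poi_type, _location in pois:  # step b1: traverse S
        if not hash_list:
            # Step d1: the first type seen receives code 0.
            hash_list[poi_type] = 0
        elif poi_type not in hash_list:
            # Step g1: a new type receives len + 1.
            hash_list[poi_type] = len(hash_list) + 1
        # Step f1: represent the POI type by its code.
        encoded.append(hash_list[poi_type])
    return hash_list, encoded
```

Under this scheme the codes are distinct for distinct types but not necessarily consecutive: the first type is coded 0 and the second is coded 2.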
3) Index mechanism DLKAI (index tree) establishment stage:
This stage includes steps 300-500. A direction-, location- and keyword-aware tree index mechanism DLKAI, namely the index tree, is built from bottom to top by hierarchical clustering: the POIs in S are clustered to obtain the leaf nodes of the bottom layer, and the nodes of each lower layer are clustered layer by layer to obtain the intermediate nodes of the layer above. The number num of children of any node in DLKAI is not greater than M, and every node carries direction, position and keyword attributes and pointers to its children. As shown in FIG. 5, the specific process of constructing the index tree is as follows:
a2. Calculate the minimum bounding rectangle $R = (X_{min}, X_{max}, Y_{min}, Y_{max})$ of all POIs in the set S, where $(X_{min}, Y_{min})$ is the coordinate of the lower-left vertex of the rectangle and $(X_{max}, Y_{max})$ is the coordinate of the upper-right vertex of the rectangle.
b2. Using a spatial clustering method, cluster all POIs in the POI set S into $k_1$ groups under the rule that the number of elements in each cluster is not more than a set threshold, obtaining the $k_1$ clusters of layer 1. The POIs of each cluster are stored in a corresponding list; taking the jth group as an example, the jth group of POIs is stored in $List_j$, $j = 1, 2, 3, \ldots, k_1$, and the number num of POIs in each cluster is not more than the set threshold M.
c2. For the jth group of POIs $List_j$, compute the minimum bounding rectangle $MBR_j$ of the group, the angle range $[\alpha_j, \beta_j]$ of $MBR_j$ relative to the lower-left vertex $(X_{min}, Y_{min})$ of R, and the keyword bag $wordSet_j$ of the group, where $wordSet_j$ is the set of type codes of all POIs in the cluster. The angle range $[\alpha_j, \beta_j]$ is calculated as follows: compute the angle of each vertex of the minimum bounding rectangle $MBR_j$ relative to the lower-left vertex of the minimum bounding rectangle R, obtaining four relative angles; from these four relative angles, determine the angle range $[\alpha_j, \beta_j]$ corresponding to the jth cluster, where $\alpha_j$ is the minimum of the four relative angles and $\beta_j$ is the maximum.
d2. For any jth group of POIs, create a leaf node $Leaf_j$ whose attributes are the minimum bounding rectangle $MBR_j$, the angle range $[\alpha_j, \beta_j]$ and the keyword bag $wordSet_j$, that is, $Leaf_j.MBR = MBR_j$, $Leaf_j.childList = List_j$, $Leaf_j.anglerange = [\alpha_j, \beta_j]$, $Leaf_j.wordSet = wordSet_j$.
e2. At this point, the $k_1$ leaf nodes of the first layer, $Leaf_1, Leaf_2, \ldots, Leaf_j, \ldots, Leaf_{k_1}$, have been obtained, giving the node set corresponding to the first layer. Continue with step f2.
f2. Determine whether $k_1$ is greater than M. If it is not greater than M, execute step l2; otherwise execute step g2.
g2. Continue to use spatial clustering to cluster all $k_1$ nodes in R into $k_2$ groups, obtaining the $k_2$ clusters of the second layer. The nodes of each cluster are stored in a corresponding list, and the number num of nodes in each group is not more than M, $j = 1, 2, 3, \ldots, k_2$.
h2. Determine whether $k_2$ is greater than M. If it is greater than M, execute step i2; otherwise execute step l2.
i2. For the jth cluster of the second layer, calculate the minimum bounding rectangle $MBR_j$ of the nodes in the cluster, the angle range $[\alpha_j, \beta_j]$ and the keyword bag $wordSet_j$. The keyword bag $wordSet_j$ is the union of the keyword bags of the nodes in the jth cluster, that is, it contains the codes of all POIs in the jth cluster.
j2. For the jth cluster of the second layer, create an intermediate node $Node_j$ with the minimum bounding rectangle $MBR_j$, the angle range $[\alpha_j, \beta_j]$ and the keyword bag $wordSet_j$, that is, $Node_j.MBR = MBR_j$, $Node_j.childList = List_j$, $Node_j.anglerange = [\alpha_j, \beta_j]$, $Node_j.wordSet = wordSet_j$.
k2. At this point, the $k_2$ intermediate nodes of the second layer, $Node_1, Node_2, \ldots, Node_j, \ldots, Node_{k_2}$, have been obtained, giving the node set of the second layer. Execution returns to step g2 to create the intermediate nodes of the third layer, the fourth layer, and so on.
l2. The nodes of the current layer are the highest-layer nodes, and the construction of the index tree is finished.
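The attribute calculation of steps c2 and d2 can be sketched as follows. This is an illustrative sketch under stated assumptions: angles are measured in degrees with math.atan2, a leaf node is represented as a plain dict, and the helper names mbr_of, angle_range and make_leaf as well as the sample coordinates are hypothetical. The disclosure does not prescribe any of these choices.

```python
import math


def mbr_of(points):
    """Minimum bounding rectangle (x_min, x_max, y_min, y_max) of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), max(xs), min(ys), max(ys))


def angle_range(mbr, origin):
    """Angle range (alpha, beta), in degrees, of an MBR as seen from origin."""
    x_min, x_max, y_min, y_max = mbr
    ox, oy = origin
    corners = [(x_min, y_min), (x_min, y_max), (x_max, y_min), (x_max, y_max)]
    # Four relative angles, one per vertex of the MBR.
    angles = [math.degrees(math.atan2(y - oy, x - ox)) for x, y in corners]
    return (min(angles), max(angles))


def make_leaf(pois, origin):
    """Build a leaf from one first-layer cluster of (x, y, type_code) tuples."""
    mbr = mbr_of([(x, y) for x, y, _ in pois])
    return {
        "MBR": mbr,
        "anglerange": angle_range(mbr, origin),
        "wordSet": {code for _, _, code in pois},
        "childList": list(pois),
    }


# Hypothetical cluster of three POIs; origin stands for (X_min, Y_min) of R.
leaf = make_leaf([(1.0, 1.0, 5), (2.0, 3.0, 3), (3.0, 2.0, 7)], origin=(0.0, 0.0))
```

Since every POI lies inside R and the angles are measured from the lower-left vertex of R, all four relative angles fall between 0 and 90 degrees, so taking the minimum and maximum involves no wrap-around.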
FIG. 6 is a schematic diagram of the nodes and cluster attributes in the index tree of the present invention. As shown in FIG. 6, assume that a POI set $S = \{p_1, p_2, \ldots, p_{12}\}$ exists in a space R, with the distribution shown in fig. 6. The hash list constructed for all POIs of the set S is shown in fig. 4. Any POI, for example $p_2$, is described by $p_2.name$, $p_2.location$ and $p_2.type$, and the figure shows that $p_2.type$ is "gas station". In this embodiment, each node is set to contain no more than M = 3 children. The first spatial clustering yields 6 groups of POIs, with no more than 3 POIs in each group, and one leaf node is constructed for each of the 6 groups. For example, $Leaf_1$ is constructed from $p_9$ to $p_{11}$, giving $Leaf_1.MBR = mbr_1$, $Leaf_1.childList = List_1$, $Leaf_1.anglerange = [20°, 90°]$ and $Leaf_1.wordSet = \{5, 3, 7\}$. Because the number of leaf nodes is 6, which is greater than M, the next round of clustering yields 3 clusters, and an intermediate node is constructed for each group of leaf nodes, giving 3 intermediate nodes. At this point, the number of children of each intermediate node is not more than M and the number of intermediate nodes is not more than M, so clustering ends. The highest-level intermediate nodes are stored in a set NodeSet, and the construction of the index tree is finished.
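The layer-by-layer termination rule (keep clustering while the number of clusters in the current layer exceeds M) can be sketched as follows. The disclosure does not name a specific spatial clustering method, so this sketch substitutes a deliberately simple stand-in that sorts the elements and cuts them into groups of at most M; the names chunk_cluster and build_layers and the sample points are assumptions. With this stand-in the group counts differ from those of the embodiment above.

```python
def chunk_cluster(items, m):
    """Stand-in for spatial clustering: sort, then cut into groups of at most m."""
    ordered = sorted(items)
    return [ordered[i:i + m] for i in range(0, len(ordered), m)]


def build_layers(points, m):
    """Return the cluster layers from bottom (layer 1) to top.

    Layer 1 groups POIs; each higher layer groups the clusters of the layer
    below. Clustering stops once a layer holds no more than m clusters.
    """
    if m < 2:
        raise ValueError("m must be at least 2, otherwise layers never shrink")
    layer = chunk_cluster(points, m)       # k_1 clusters of POIs
    layers = [layer]
    while len(layer) > m:                  # k_n > M: cluster the layer again
        layer = chunk_cluster(layer, m)    # k_(n+1) clusters of clusters
        layers.append(layer)
    return layers


# Twelve hypothetical POI coordinates and M = 3, as in the embodiment.
points = [(float(i), float((i * 7) % 5)) for i in range(12)]
layers = build_layers(points, m=3)
layer_sizes = [len(layer) for layer in layers]
```

Any clustering routine that respects the size limit M could be swapped in for chunk_cluster without changing the loop.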
4) User query requirement input stage:
This stage comprises step 600, in which the user enters a query requirement. The query requirement is a triplet comprising a keyword, a direction range and location coordinates, and is denoted $Q = ((\Phi_1, \Phi_2), word, (x, y))$, where $(\Phi_1, \Phi_2)$ is the direction range input by the user, word is the keyword input by the user, and (x, y) is the position coordinates obtained from the GPS positioning function of the user's mobile phone.
5) Query processing stage:
This stage includes steps 700-800. Using the min-heap HeapQue, the index tree is traversed from top to bottom to find the POI that lies within the user-specified angle range $(\Phi_1, \Phi_2)$, satisfies the keyword constraint (that is, its type equals word) and is nearest to the user. As shown in fig. 7, the specific process of performing the index tree query according to the query requirement is as follows:
a3. Traverse the keyword-code hash list HashList according to the query requirement input by the user, and check whether HashList contains an element whose keyword is the keyword word of the query requirement. If so, determine the code of that element as the code corresponding to the keyword of the query requirement and replace the keyword word with it, that is, word ← code, then execute step b3. If not, determine that the keyword of the query requirement was entered incorrectly and print "The keyword entered by the user is wrong; please check it and enter it again!".
b3. Traverse each node Node in the node set at the topmost layer of the index tree and check whether the code of the keyword input by the user is in the keyword bag Node.wordSet of the node; if so, execute c3.
c3. Determine whether the angle range Node.anglerange of the node meets the direction requirement $(\Phi_1, \Phi_2)$ input by the user; if the direction requirement is satisfied, execute d3.
d3. Store the node in the min-heap HeapQue, which serves as the point set to be selected. At this point, HeapQue holds all top-layer nodes whose keyword bag contains the code corresponding to the query requirement and whose angle range satisfies the direction range of the query requirement; the closer a node is to the user position (x, y), the closer it is to the top of the heap.
The 1st iteration is now complete, and all top-layer nodes that may contain the query result are stored in HeapQue.
e3. For the ith iteration, pop the top element, element, from HeapQue (the top element is the point closest to the position coordinates) and check its type: if element is a leaf node, execute step f3; if element is an intermediate node, execute step g3; if element is a POI, go to step h3.
f3. If element is a leaf node, traverse its child set element.childList, that is, the POI set of the cluster corresponding to the leaf node, and store in HeapQue all POIs that satisfy the direction requirement $(\Phi_1, \Phi_2)$ and whose keyword code is word; the closer an element is to the user position (x, y), the closer it is to the top of the heap. Once HeapQue has been updated, increase the iteration count by 1 and return to step e3 to start the next iteration.
g3. If element is an intermediate node, traverse its child list element.childList and store in HeapQue every child node child that satisfies the angle requirement $(\Phi_1, \Phi_2)$ and whose keyword bag child.wordSet contains word; the closer an element is to the user position (x, y), the closer it is to the top of the heap. Once HeapQue has been updated, increase the iteration count by 1 and return to step e3 to start the next iteration.
h3. If element is a POI, it is the POI that satisfies the user's direction requirement $(\Phi_1, \Phi_2)$ and keyword requirement word and is nearest to the user; output element and end.
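Steps b3 to h3 amount to a best-first traversal driven by a min-heap, which can be sketched with Python's heapq module as follows. Several details here are assumptions made only for illustration, since the text does not fix them: a node's distance to the user is taken as the minimum distance from (x, y) to its MBR, a node meets the direction requirement when its angle range overlaps $(\Phi_1, \Phi_2)$, and a POI meets it when its own angle relative to the lower-left vertex of R lies inside that range. The dict layout and the names ORIGIN, poi, node, min_dist and search are likewise hypothetical.

```python
import heapq
import math
from itertools import count

ORIGIN = (0.0, 0.0)  # assumed lower-left vertex (X_min, Y_min) of R


def angle(x, y):
    """Angle in degrees of point (x, y) relative to ORIGIN."""
    return math.degrees(math.atan2(y - ORIGIN[1], x - ORIGIN[0]))


def poi(name, x, y, code):
    return {"kind": "poi", "name": name, "location": (x, y), "code": code}


def node(kind, children):
    """Build a leaf (children are POIs) or an intermediate node (children are nodes)."""
    if kind == "leaf":
        xs = [c["location"][0] for c in children]
        ys = [c["location"][1] for c in children]
        words = {c["code"] for c in children}
    else:
        xs = [v for c in children for v in c["MBR"][:2]]
        ys = [v for c in children for v in c["MBR"][2:]]
        words = set().union(*(c["wordSet"] for c in children))
    mbr = (min(xs), max(xs), min(ys), max(ys))
    corners = [angle(x, y) for x in mbr[:2] for y in mbr[2:]]
    return {"kind": kind, "MBR": mbr, "anglerange": (min(corners), max(corners)),
            "wordSet": words, "childList": children}


def min_dist(mbr, q):
    """Smallest distance from point q to rectangle mbr (0 if q is inside)."""
    dx = max(mbr[0] - q[0], 0.0, q[0] - mbr[1])
    dy = max(mbr[2] - q[1], 0.0, q[1] - mbr[3])
    return math.hypot(dx, dy)


def search(top_nodes, phi, code, q):
    """Return the nearest POI with the given code inside direction range phi, or None."""
    heap, tie = [], count()  # tie-breaker keeps heapq from comparing dicts

    def push_node(n):
        lo, hi = n["anglerange"]
        if code in n["wordSet"] and lo <= phi[1] and phi[0] <= hi:
            heapq.heappush(heap, (min_dist(n["MBR"], q), next(tie), n))

    for n in top_nodes:                      # 1st iteration (steps b3 to d3)
        push_node(n)
    while heap:
        _, _, element = heapq.heappop(heap)  # step e3
        if element["kind"] == "poi":         # step h3
            return element
        for child in element["childList"]:
            if element["kind"] == "leaf":    # step f3
                if child["code"] == code and phi[0] <= angle(*child["location"]) <= phi[1]:
                    heapq.heappush(heap, (math.dist(q, child["location"]), next(tie), child))
            else:                            # step g3
                push_node(child)
    return None


leaf_a = node("leaf", [poi("p1", 1.0, 1.0, 7), poi("p2", 2.0, 1.0, 3)])
leaf_b = node("leaf", [poi("p3", 5.0, 5.0, 7), poi("p4", 6.0, 4.0, 7)])
root = node("intermediate", [leaf_a, leaf_b])
nearest = search([root], phi=(30.0, 60.0), code=7, q=(4.0, 4.0))
no_match = search([root], phi=(30.0, 60.0), code=3, q=(4.0, 4.0))
```

Because a node's key never exceeds the distance to any POI it contains, the first POI popped from the heap is the nearest qualifying one under these assumptions.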
Based on the above scheme, the present invention further provides a massive spatial POI search system based on multi-factor constraint, and fig. 8 is a schematic structural diagram of the massive spatial POI search system based on multi-factor constraint. As shown in fig. 8, the system for searching POI in massive space based on multi-factor constraint of the present invention comprises:
a spatial POI data acquisition module 801, configured to acquire spatial POI data of a target area to obtain a spatial POI set; the space POI set comprises all POIs, each POI is identified by an attribute, and the attribute comprises a name, a type and longitude and latitude coordinates;
a hash list construction module 802, configured to construct a hash list for the POIs in the POI set; each POI corresponds to one key-value pair in the hash list, the key-value pair is a keyword-code pair, the keyword is the type of the POI, the code is a numerical value corresponding to the type, identical keywords in the hash list have the same code, and different keywords have different codes;
a hierarchical spatial clustering module 803, configured to perform hierarchical iterative clustering on all POIs of the POI set according to a rule that the number of elements in each group of clusters is not greater than a set threshold by using a hierarchical spatial clustering method, so as to obtain a cluster corresponding to each layer; the cluster corresponding to the (i + 1) th layer is obtained by carrying out spatial clustering on the basis of the cluster corresponding to the i-th layer; each cluster of each layer comprises a plurality of elements, the elements in the cluster of the (i + 1) th layer are the clusters of the (i) th layer, and the elements in the cluster of the (1) th layer are POIs;
a cluster attribute determining module 804, configured to determine the attributes of each cluster in each layer; the cluster attributes comprise the minimum bounding rectangle of the cluster, the angle range corresponding to the cluster and the keyword bag corresponding to the cluster; the minimum bounding rectangle $MBR_j$ of the jth cluster is the smallest rectangle that encloses all POIs in the jth cluster; the angle range corresponding to the jth cluster is the range of angles of $MBR_j$ relative to the lower-left vertex of the minimum bounding rectangle R of the POI set; the keyword bag corresponding to the jth cluster is the set formed by the codes in the key-value pairs corresponding to all POIs in the jth cluster;
a node creating module 805, configured to create a node corresponding to each layer of the index tree based on attributes of all clusters in each layer, to obtain a node set corresponding to each layer of the node of the index tree; the nodes at the first layer of the index tree are leaf nodes of the index tree, and the nodes above the first layer of the index tree are middle nodes of the index tree; the jth node of the ith layer of the index tree corresponds to the jth cluster in the ith layer;
a query requirement obtaining module 806, configured to obtain a query requirement; the query requirement comprises a keyword, a direction range and a position coordinate;
a code determining module 807, configured to determine a code corresponding to the keyword of the query requirement according to the keyword of the query requirement and the hash list;
and a nearest POI determining module 808, configured to traverse the node set corresponding to each layer of the index tree from top to bottom according to the code corresponding to the query requirement and the direction range, and to determine, as the POI that meets the query requirement and is closest to the position coordinates in the target area, the POI that lies in a cluster corresponding to a node of the index tree, has the code corresponding to the query requirement, has an angle satisfying the direction range of the query requirement and is closest to the position coordinates.
As a specific embodiment, in the massive spatial POI search system based on multi-factor constraint of the present invention, the hierarchical spatial clustering module 803 specifically includes:
a POI clustering unit, configured to, for the first layer, use a spatial clustering method to cluster all POIs of the POI set under the rule that the number of POIs in each cluster is not greater than the set threshold, obtaining $k_1$ first-level clusters;
a multi-level clustering unit, configured to, for the nth layer, use a spatial clustering method to cluster the $k_{n-1}$ (n-1)-level clusters of the (n-1)th layer under the rule that the number of (n-1)-level clusters in each cluster is not greater than the set threshold, obtaining $k_n$ n-level clusters; n is an integer greater than 1;
a judging unit, configured to judge whether $k_n$ is greater than the set threshold;
an iteration unit, configured to continue hierarchical iterative clustering with the spatial clustering method when $k_n$ is greater than the set threshold;
a clustering end unit, configured to end clustering when $k_n$ is not greater than the set threshold, obtaining the clusters corresponding to each layer.
As a specific embodiment, in the massive spatial POI search system based on multi-factor constraint of the present invention, the cluster attribute determining module 804 specifically includes:
a POI set minimum bounding rectangle determining unit, configured to determine the minimum bounding rectangle R of the POI set; the minimum bounding rectangle R is the smallest rectangle that encloses all POI points in the POI set, $R = (X_{min}, X_{max}, Y_{min}, Y_{max})$, where $(X_{min}, Y_{min})$ is the coordinate of the lower-left vertex of the minimum bounding rectangle R and $(X_{max}, Y_{max})$ is the coordinate of the upper-right vertex of the minimum bounding rectangle R;
a cluster minimum bounding rectangle determining unit, configured to determine, for the jth cluster, the minimum bounding rectangle $MBR_j$ of the jth cluster;
a relative angle calculating unit, configured to calculate the angle of each vertex of the minimum bounding rectangle $MBR_j$ relative to the lower-left vertex of the minimum bounding rectangle R, obtaining four relative angles;
an angle range determining unit, configured to determine, from the four relative angles, the angle range $[\alpha_j, \beta_j]$ corresponding to the jth cluster; $\alpha_j$ is the minimum of the four relative angles and $\beta_j$ is the maximum of the four relative angles;
and the keyword bag determining unit is used for determining the codes in the key value pairs corresponding to each POI in the jth cluster according to the Hash list to obtain the keyword bags corresponding to the jth cluster.
As a specific embodiment, in the massive spatial POI search system based on multi-factor constraint of the present invention, the encoding determining module 807 specifically includes:
a key-value pair judging unit, configured to judge whether a key-value pair whose keyword is the keyword of the query requirement exists in the hash list;
an input error determining unit, configured to determine that the query requirement was input incorrectly when no key-value pair whose keyword is the keyword of the query requirement exists in the hash list;
and a query unit, configured to determine, when a key-value pair whose keyword is the keyword of the query requirement exists in the hash list, the code corresponding to that keyword as the code corresponding to the keyword of the query requirement.
As a specific embodiment, in the massive spatial POI search system based on multi-factor constraint of the present invention, the nearest POI determining module 808 specifically includes:
and the point set to be selected initialization unit is used for traversing the node set at the uppermost layer of the index tree for the 1 st iteration, extracting nodes, of which the keyword bags of the corresponding clusters in the node set comprise codes corresponding to the query requirement and the angle ranges meet the direction range of the query requirement, and adding the nodes into the point set to be selected.
A point to be selected acquisition unit, configured to acquire a point to be selected for an ith iteration; i is an integer greater than 1, the point to be selected of the ith iteration is a point in the point set to be selected, which is closest to the position coordinate in the query requirement, and the point is a node or a POI.
A candidate point type determining unit, configured to determine a type of a candidate point of the ith iteration; the types include intermediate nodes, leaf nodes, and POIs.
A sub-node traversal unit, configured to, when the type of the point to be selected of the ith iteration is an intermediate node, obtain the child nodes of the point to be selected, and add to the point set to be selected those child nodes whose corresponding cluster's keyword bag includes the code corresponding to the query requirement and whose angle range satisfies the direction range of the query requirement; then update the iteration count and enter the next iteration.
A POI traversal unit, configured to, when the type of the point to be selected of the ith iteration is a leaf node, acquire all POIs in the cluster corresponding to the point to be selected, and add to the point set to be selected those POIs whose code is the code corresponding to the query requirement and which lie within the direction range of the query requirement; then update the iteration count and enter the next iteration.
And a nearest POI determining unit, configured to, when the type of the point to be selected of the ith iteration is a POI, determine the point to be selected as the POI that meets the query requirement and is nearest to the position coordinates in the target area, and end the iteration.
The massive space POI searching scheme based on multi-factor constraint has the following beneficial effects:
the invention fully considers the spatial attribute and the text attribute of the POI in the location-based service, and establishes an index mechanism for sensing direction, location and keywords to organize the spatial POI.
The method and the system fully consider the actual query requirements of the user, and search the spatial POI constrained by the direction, the distance and the keywords for the user.
Based on the indexing mechanism DLKAI and the query method disclosed by the invention, the query precision is ensured, the query efficiency is improved, the maintenance cost is low, and the method has wide practical application value.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.
Claims (8)
1. A massive space POI searching method based on multi-factor constraint is characterized by comprising the following steps:
acquiring space POI data of a target area to obtain a space POI set; the space POI set comprises all POIs, each POI is identified by an attribute, and the attribute comprises a name, a type and longitude and latitude coordinates;
constructing a hash list for the POIs in the POI set; each POI corresponds to one key-value pair in the hash list, the key-value pair is a keyword-code pair, the keyword is the type of the POI, the code is a numerical value corresponding to the type, identical keywords in the hash list have the same code, and different keywords have different codes;
performing hierarchical iterative clustering on all POI of the POI set by adopting a hierarchical spatial clustering method according to a rule that the number of elements in each group of clusters is not more than a set threshold value to obtain a cluster corresponding to each layer; the cluster corresponding to the (i + 1) th layer is obtained by carrying out spatial clustering on the basis of the cluster corresponding to the i-th layer; each cluster of each layer comprises a plurality of elements, the elements in the cluster of the (i + 1) th layer are the clusters of the (i) th layer, and the elements in the cluster of the (1) th layer are POIs;
determining the attributes of each cluster in each layer; the cluster attributes comprise the minimum bounding rectangle of the cluster, the angle range corresponding to the cluster and the keyword bag corresponding to the cluster; the minimum bounding rectangle $MBR_j$ of the jth cluster is the smallest rectangle that encloses all POIs in the jth cluster; the angle range corresponding to the jth cluster is the range of angles of $MBR_j$ relative to the lower-left vertex of the minimum bounding rectangle R of the POI set; the keyword bag corresponding to the jth cluster is the set formed by the codes in the key-value pairs corresponding to all POIs in the jth cluster;
based on the attributes of all clusters in each layer, creating nodes corresponding to each layer of the index tree to obtain a node set corresponding to each layer of the nodes of the index tree; the nodes at the first layer of the index tree are leaf nodes of the index tree, and the nodes above the first layer of the index tree are middle nodes of the index tree; the jth node of the ith layer of the index tree corresponds to the jth cluster in the ith layer;
acquiring a query requirement; the query requirement comprises a keyword, a direction range and a position coordinate;
determining a code corresponding to the keyword of the query requirement according to the keyword of the query requirement and the hash list;
traversing the node set corresponding to each layer of the index tree from top to bottom according to the code corresponding to the query requirement and the direction range, and determining the POI that lies in a cluster corresponding to a node of the index tree, has the code corresponding to the query requirement, has an angle satisfying the direction range of the query requirement and is closest to the position coordinates as the POI that meets the query requirement and is closest to the position coordinates in the target area; the specific process is as follows:
for the 1 st iteration, traversing the node set at the uppermost layer of the index tree, extracting nodes, of which the keyword bags of the corresponding clusters in the node set comprise codes corresponding to the query requirement and the angle range meets the direction range of the query requirement, and adding the nodes into a point set to be selected;
for the ith iteration, acquiring a point to be selected of the ith iteration; i is an integer greater than 1, the point to be selected of the ith iteration is a point in the point set to be selected, which is closest to the position coordinate in the query requirement, and the point is a node or a POI;
determining the type of the point to be selected of the ith iteration; the types comprise intermediate nodes, leaf nodes and POIs;
when the type of the point to be selected of the ith iteration is an intermediate node, acquiring the child nodes of the point to be selected, and adding the nodes, of which the corresponding clustered keyword bags comprise codes corresponding to the query requirement and the angle range meets the direction range of the query requirement, into the point set to be selected; updating the iteration times and entering the next iteration;
when the type of the point to be selected of the ith iteration is a leaf node, acquiring all POIs in the cluster corresponding to the point to be selected, and adding to the point set to be selected those POIs whose code is the code corresponding to the query requirement and which lie within the direction range of the query requirement; updating the iteration count and entering the next iteration;
and when the type of the point to be selected of the ith iteration is a POI, determining the point to be selected as the POI that meets the query requirement and is closest to the position coordinates in the target area, and ending the iteration.
2. The method according to claim 1, wherein the step of performing hierarchical iterative clustering on all POIs in the POI set according to a rule that the number of elements in each cluster is not greater than a set threshold value by using a hierarchical spatial clustering method to obtain a cluster corresponding to each layer specifically comprises:
for the first layer, using a spatial clustering method to cluster all POIs of the POI set under the rule that the number of POIs in each cluster is not greater than the set threshold, obtaining $k_1$ first-level clusters;
for the nth layer, using a spatial clustering method to cluster the $k_{n-1}$ (n-1)-level clusters of the (n-1)th layer under the rule that the number of (n-1)-level clusters in each cluster is not greater than the set threshold, obtaining $k_n$ n-level clusters; n is an integer greater than 1;
judging whether $k_n$ is greater than the set threshold;
when $k_n$ is greater than the set threshold, continuing hierarchical iterative clustering with the spatial clustering method;
when $k_n$ is not greater than the set threshold, ending clustering to obtain the clusters corresponding to each layer.
3. The method for searching for the POI in the massive space based on the multi-factor constraint of claim 1, wherein the determining the attribute of each cluster in each layer specifically comprises:
determining the minimum bounding rectangle R of the POI set; the minimum bounding rectangle R is the smallest rectangle that encloses all POI points in the POI set, $R = (X_{min}, X_{max}, Y_{min}, Y_{max})$, where $(X_{min}, Y_{min})$ is the coordinate of the lower-left vertex of the minimum bounding rectangle R and $(X_{max}, Y_{max})$ is the coordinate of the upper-right vertex of the minimum bounding rectangle R;
for the jth cluster, determining the minimum bounding rectangle $MBR_j$ of the jth cluster;
calculating the angle of each vertex of the minimum bounding rectangle $MBR_j$ relative to the lower-left vertex of the minimum bounding rectangle R, obtaining four relative angles;
determining, from the four relative angles, the angle range $[\alpha_j, \beta_j]$ corresponding to the jth cluster; $\alpha_j$ is the minimum of the four relative angles and $\beta_j$ is the maximum of the four relative angles;
and determining the code in the key value pair corresponding to each POI in the jth cluster according to the Hash list to obtain a keyword bag corresponding to the jth cluster.
4. The method for searching for POI in mass space based on multi-factor constraint according to claim 1, wherein the determining of the code corresponding to the keyword required for the query according to the keyword required for the query and the hash list specifically comprises:
judging whether a key-value pair whose keyword is the keyword of the query requirement exists in the hash list;
if no key-value pair whose keyword is the keyword of the query requirement exists in the hash list, determining that the query requirement was input incorrectly;
if a key-value pair whose keyword is the keyword of the query requirement exists in the hash list, determining the code corresponding to that keyword as the code corresponding to the keyword of the query requirement.
5. A massive spatial POI search system based on multi-factor constraint is characterized by comprising:
the spatial POI data acquisition module is used for acquiring spatial POI data of a target area to obtain a spatial POI set; the space POI set comprises all POIs, each POI is identified by an attribute, and the attribute comprises a name, a type and longitude and latitude coordinates;
the hash list construction module is used for constructing a hash list for the POIs in the POI set; each POI corresponds to one key-value pair in the hash list, the key-value pair is a keyword-code pair, the keyword is the type of the POI, the code is a numerical value corresponding to the type, identical keywords in the hash list have the same code, and different keywords have different codes;
the hierarchical spatial clustering module is used for performing hierarchical iterative clustering on all POI in the POI set by adopting a hierarchical spatial clustering method according to a rule that the number of elements in each group of clusters is not more than a set threshold value to obtain a cluster corresponding to each layer; the cluster corresponding to the (i + 1) th layer is obtained by carrying out spatial clustering on the basis of the cluster corresponding to the i-th layer; each cluster of each layer comprises a plurality of elements, the elements in the cluster of the (i + 1) th layer are the clusters of the (i) th layer, and the elements in the cluster of the (1) th layer are POIs;
the cluster attribute determining module is used for determining the attributes of each cluster in each layer; the cluster attributes comprise the minimum bounding rectangle of the cluster, the angle range corresponding to the cluster and the keyword bag corresponding to the cluster; the minimum bounding rectangle $MBR_j$ of the jth cluster is the smallest rectangle that encloses all POIs in the jth cluster; the angle range corresponding to the jth cluster is the range of angles of $MBR_j$ relative to the lower-left vertex of the minimum bounding rectangle R of the POI set; the keyword bag corresponding to the jth cluster is the set formed by the codes in the key-value pairs corresponding to all POIs in the jth cluster;
the node creating module is used for creating nodes corresponding to each layer of the index tree based on the attributes of all clusters in each layer to obtain a node set corresponding to each layer of nodes of the index tree; the nodes at the first layer of the index tree are leaf nodes of the index tree, and the nodes above the first layer of the index tree are middle nodes of the index tree; the jth node of the ith layer of the index tree corresponds to the jth cluster in the ith layer;
the query requirement acquisition module is used for acquiring a query requirement; the query requirement comprises a keyword, a direction range and a position coordinate;
the code determining module is used for determining a code corresponding to the keyword required by the query according to the keyword required by the query and the hash list;
the nearest POI determining module is used for traversing the node set corresponding to each layer of the index tree from top to bottom according to the codes and the direction ranges corresponding to the query requirements, and determining the POI which is in the cluster corresponding to the nodes in the index tree, meets the direction range of the query requirements in the code and angle range corresponding to the query requirements and is nearest to the position coordinates as the POI which meets the query requirements and is nearest to the position coordinates in the target area;
the nearest POI determining module specifically includes:
a candidate point set initialization unit, configured to, for the 1st iteration, traverse the node set of the uppermost layer of the index tree, extract each node whose cluster's keyword bag includes the code corresponding to the query requirement and whose angle range satisfies the direction range of the query requirement, and add it to the candidate point set;
a candidate point acquisition unit, configured to acquire the candidate point of the ith iteration; i is an integer greater than 1, the candidate point of the ith iteration is the point in the candidate point set that is closest to the position coordinate in the query requirement, and the point is a node or a POI;
a candidate point type determining unit, configured to determine the type of the candidate point of the ith iteration; the types comprise intermediate node, leaf node and POI;
a child node traversal unit, configured to, when the type of the candidate point of the ith iteration is an intermediate node, obtain the child nodes of the candidate point and add to the candidate point set each child node whose cluster's keyword bag includes the code corresponding to the query requirement and whose angle range satisfies the direction range of the query requirement; then update the iteration count and enter the next iteration;
a POI traversal unit, configured to, when the type of the candidate point of the ith iteration is a leaf node, acquire all POIs in the cluster corresponding to the candidate point and add to the candidate point set the POIs that include the code corresponding to the query requirement and lie within the direction range of the query requirement; then update the iteration count and enter the next iteration;
and a nearest POI determining unit, configured to, when the type of the candidate point of the ith iteration is a POI, determine the candidate point as the POI in the target area that meets the query requirement and is nearest to the position coordinate, and end the iteration.
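The iteration described by these units resembles a best-first search over the index tree, and can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the names `POI`, `Node`, `make_node`, `dist_to_mbr`, `angle_of` and `nearest_poi` are hypothetical, a node's distance to the query position is assumed to be the minimum distance to its MBR, a node is assumed to "satisfy" the direction range when its angle range overlaps it, and all angles are measured in degrees from the lower-left vertex of R.

```python
import heapq
import itertools
import math
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

Point = Tuple[float, float]


@dataclass
class POI:
    name: str
    xy: Point
    codes: Set[int]


@dataclass
class Node:
    mbr: Tuple[float, float, float, float]  # (x_min, x_max, y_min, y_max)
    angles: Tuple[float, float]             # (alpha, beta) in degrees
    bag: Set[int]                           # keyword bag of the cluster
    children: List["Node"] = field(default_factory=list)  # intermediate node
    pois: List[POI] = field(default_factory=list)         # leaf node


def angle_of(p: Point, origin: Point) -> float:
    return math.degrees(math.atan2(p[1] - origin[1], p[0] - origin[0]))


def dist_to_mbr(q: Point, mbr) -> float:
    # Assumed node distance: shortest distance from q to the rectangle.
    x_min, x_max, y_min, y_max = mbr
    dx = max(x_min - q[0], 0.0, q[0] - x_max)
    dy = max(y_min - q[1], 0.0, q[1] - y_max)
    return math.hypot(dx, dy)


def make_node(origin: Point, children=(), pois=()) -> Node:
    # Hypothetical helper: derive MBR, angle range and keyword bag.
    pts = [p.xy for p in pois]
    pts += [(x, y) for c in children for x in c.mbr[:2] for y in c.mbr[2:]]
    xs, ys = zip(*pts)
    mbr = (min(xs), max(xs), min(ys), max(ys))
    angs = [angle_of((x, y), origin) for x in mbr[:2] for y in mbr[2:]]
    bag = set().union(*(p.codes for p in pois), *(c.bag for c in children))
    return Node(mbr, (min(angs), max(angs)), bag, list(children), list(pois))


def nearest_poi(top_layer, code, direction, q, origin) -> Optional[POI]:
    lo, hi = direction
    tie = itertools.count()  # tie-breaker so the heap never compares objects
    heap = []                # the candidate point set, ordered by distance

    def push_node(n: Node) -> None:
        if code in n.bag and n.angles[0] <= hi and n.angles[1] >= lo:
            heapq.heappush(heap, (dist_to_mbr(q, n.mbr), next(tie), n))

    for n in top_layer:                  # 1st iteration: uppermost layer
        push_node(n)
    while heap:
        _, _, cand = heapq.heappop(heap)  # closest candidate point
        if isinstance(cand, POI):         # a POI ends the iteration
            return cand
        for child in cand.children:       # intermediate node: filter children
            push_node(child)
        for p in cand.pois:               # leaf node: filter its POIs
            if code in p.codes and lo <= angle_of(p.xy, origin) <= hi:
                d = math.hypot(q[0] - p.xy[0], q[1] - p.xy[1])
                heapq.heappush(heap, (d, next(tie), p))
    return None                           # nothing satisfies the query


origin = (0.0, 0.0)  # lower-left vertex of R for this toy POI set
a = POI("cafe_a", (1.0, 1.0), {1})
b = POI("cafe_b", (2.0, 0.5), {1})
c = POI("bank_c", (1.2, 1.0), {3})
d = POI("cafe_d", (5.0, 5.0), {1})
e = POI("park_e", (0.0, 0.0), {4})
root = make_node(origin, children=[make_node(origin, pois=[a, b, c]),
                                   make_node(origin, pois=[d, e])])
result = nearest_poi([root], code=1, direction=(30.0, 60.0),
                     q=(2.0, 1.0), origin=origin)
```

In this toy data, `cafe_b` is the closest POI carrying code 1, but it lies outside the 30 to 60 degree direction range, so the search passes over it. Using the minimum distance to the MBR as the node distance is a design choice of the sketch: it never overestimates the distance to any POI inside the node, which is what allows the search to stop at the first POI taken from the candidate set.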
6. The system according to claim 5, wherein the hierarchical spatial clustering module specifically comprises:
a POI clustering unit, configured to, for the first layer, cluster all POIs of the POI set by a spatial clustering method, under the rule that the number of POIs in each cluster does not exceed the set threshold, to obtain $k_1$ groups of first-level clusters;
a multi-level clustering unit, configured to, for the nth layer, cluster the $k_{n-1}$ groups of (n-1)-level clusters by a spatial clustering method, under the rule that the number of (n-1)-level clusters in each cluster does not exceed the set threshold, to obtain $k_n$ groups of n-level clusters; n is an integer greater than 1;
a judging unit, configured to judge whether $k_n$ is greater than the set threshold;
an iteration unit, configured to continue the hierarchical iterative clustering by the spatial clustering method when $k_n$ is greater than the set threshold;
a clustering end unit, configured to finish clustering when $k_n$ is not greater than the set threshold, obtaining the clusters corresponding to each layer.
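The layer-by-layer loop of these units can be sketched as below. The claim does not name a particular spatial clustering method, so the sketch substitutes a deliberately simple stand-in (sort the elements by position and cut them into groups of at most `threshold`); any real spatial clustering algorithm could take its place. The names `position`, `make_cluster` and `build_layers`, and the dictionary layout of a cluster, are assumptions made for illustration. The sketch also applies the stopping test after every layer, including the first.

```python
from typing import List, Tuple, Union

Point = Tuple[float, float]
Element = Union[Point, dict]  # a POI (layer 1) or a lower-layer cluster


def position(element: Element) -> Point:
    # A POI is its own position; a cluster is represented by its center.
    return element["center"] if isinstance(element, dict) else element


def make_cluster(members: List[Element]) -> dict:
    pts = [position(m) for m in members]
    center = (sum(x for x, _ in pts) / len(pts),
              sum(y for _, y in pts) / len(pts))
    return {"members": members, "center": center}


def build_layers(pois: List[Point], threshold: int) -> List[List[dict]]:
    if threshold < 2:
        raise ValueError("threshold must be at least 2 so each layer shrinks")
    layers: List[List[dict]] = []
    elements: List[Element] = list(pois)
    while True:
        # Stand-in for the spatial clustering step (NOT the patent's method):
        # sort by position, then cut into groups of at most `threshold`.
        ordered = sorted(elements, key=position)
        layer = [make_cluster(ordered[s:s + threshold])
                 for s in range(0, len(ordered), threshold)]
        layers.append(layer)
        if len(layer) <= threshold:  # k_n not greater than threshold: stop
            return layers
        elements = layer             # k_n greater than threshold: cluster again


pois = [(float(i), float(i % 3)) for i in range(10)]
layers = build_layers(pois, threshold=3)
```

With ten POIs and a threshold of 3, the first layer holds four clusters; since 4 exceeds the threshold, a second layer of two clusters is built and the loop ends there.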
7. The system according to claim 5, wherein the cluster attribute determining module specifically comprises:
a POI set minimum bounding rectangle determining unit, configured to determine the minimum bounding rectangle R of the POI set; the minimum bounding rectangle R is the smallest rectangle that encloses all POIs in the POI set, $R = (X_{\min}, X_{\max}, Y_{\min}, Y_{\max})$, where $(X_{\min}, Y_{\min})$ are the coordinates of the lower-left vertex of R and $(X_{\max}, Y_{\max})$ are the coordinates of the upper-right vertex of R;
a cluster minimum bounding rectangle determining unit, configured to determine, for the jth cluster, the minimum bounding rectangle $\mathrm{MBR}_j$ of the jth cluster;
a relative angle calculating unit, configured to calculate the angle of each vertex of the minimum bounding rectangle $\mathrm{MBR}_j$ relative to the lower-left vertex of the minimum bounding rectangle R, obtaining four relative angles;
an angle range determining unit, configured to determine, from the four relative angles, the angle range $[\alpha_j, \beta_j]$ corresponding to the jth cluster; $\alpha_j$ is the minimum of the four relative angles and $\beta_j$ is the maximum of the four relative angles;
and a keyword bag determining unit, configured to determine, according to the hash list, the code in the key-value pair corresponding to each POI in the jth cluster, obtaining the keyword bag corresponding to the jth cluster.
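As a rough illustration of the three attributes, the following Python sketch computes an MBR, the angle range $[\alpha_j, \beta_j]$ and a keyword bag. The function names `mbr`, `angle_range` and `keyword_bag`, the use of degrees, and the sample hash list are assumptions; the claim does not state how the angles are computed, and `math.atan2` is one natural choice.

```python
import math
from typing import Dict, Iterable, Set, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_min, x_max, y_min, y_max)


def mbr(points: Iterable[Point]) -> Rect:
    # Smallest axis-aligned rectangle enclosing all points.
    xs, ys = zip(*points)
    return (min(xs), max(xs), min(ys), max(ys))


def angle_range(cluster_mbr: Rect, origin: Point) -> Tuple[float, float]:
    # Angles (degrees) of the four MBR vertices relative to `origin`,
    # which stands for the lower-left vertex of R; returns (alpha, beta).
    x_min, x_max, y_min, y_max = cluster_mbr
    corners = [(x_min, y_min), (x_min, y_max), (x_max, y_min), (x_max, y_max)]
    angles = [math.degrees(math.atan2(y - origin[1], x - origin[0]))
              for x, y in corners]
    return (min(angles), max(angles))


def keyword_bag(poi_keywords: Iterable[str],
                hash_list: Dict[str, int]) -> Set[int]:
    # Set of codes looked up for every POI keyword in the cluster.
    return {hash_list[k] for k in poi_keywords}


all_pois = [(0, 0), (1, 1), (3, 2), (4, 4)]
R = mbr(all_pois)                       # MBR of the whole POI set
cluster = mbr([(1, 1), (3, 2)])         # MBR_j of one cluster
alpha, beta = angle_range(cluster, (R[0], R[2]))
bag = keyword_bag(["bank", "bank", "pharmacy"], {"bank": 3, "pharmacy": 2})
```

Because every cluster lies inside R and the reference point is the lower-left vertex of R, all four relative angles fall between 0 and 90 degrees, so the minimum and maximum describe a single contiguous sector with no wrap-around.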
8. The system according to claim 5, wherein the code determining module specifically comprises:
a key-value pair judging unit, configured to judge whether a key-value pair whose keyword is the keyword of the query requirement exists in the hash list;
an input error determination unit, configured to determine that the query requirement was input incorrectly when no key-value pair whose keyword is the keyword of the query requirement exists in the hash list;
and a query unit, configured to, when a key-value pair whose keyword is the keyword of the query requirement exists in the hash list, determine the code in that key-value pair as the code corresponding to the keyword of the query requirement.
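A minimal Python sketch of this lookup is given below, assuming the hash list behaves like a dictionary that maps keywords to integer codes. The names `keyword_to_code` and `UnknownKeywordError` and the sample entries are hypothetical; the claim only requires that a missing keyword be reported as an input error.

```python
from typing import Dict


class UnknownKeywordError(KeyError):
    """Raised when the query keyword has no key-value pair in the hash list."""


def keyword_to_code(keyword: str, hash_list: Dict[str, int]) -> int:
    # Key-value pair judging unit: does a pair for this keyword exist?
    if keyword not in hash_list:
        # Input error determination unit: report the query as mistyped.
        raise UnknownKeywordError(keyword)
    # Query unit: return the code stored for the keyword.
    return hash_list[keyword]


hash_list = {"restaurant": 1, "pharmacy": 2, "bank": 3}  # assumed sample data
code = keyword_to_code("pharmacy", hash_list)
```

Resolving the keyword to a code once, before the tree traversal, means each node test reduces to a set membership check on an integer rather than a string comparison.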
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110519532.6A CN112948717B (en) | 2021-05-13 | 2021-05-13 | Massive space POI searching method and system based on multi-factor constraint |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110519532.6A CN112948717B (en) | 2021-05-13 | 2021-05-13 | Massive space POI searching method and system based on multi-factor constraint |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112948717A CN112948717A (en) | 2021-06-11 |
CN112948717B (en) | 2021-08-20 |
Family
ID=76233752
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110519532.6A Active CN112948717B (en) | 2021-05-13 | 2021-05-13 | Massive space POI searching method and system based on multi-factor constraint |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112948717B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115062172B (en) * | 2022-08-19 | 2022-11-08 | 北京科技大学 | Augmented reality image data searching method and system based on position |
CN117435776B (en) * | 2023-12-20 | 2024-04-30 | 杭州拓数派科技发展有限公司 | Metadata storage and query method, device, computer equipment and storage medium |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7689602B1 (en) * | 2005-07-20 | 2010-03-30 | Bakbone Software, Inc. | Method of creating hierarchical indices for a distributed object system |
CN103150309A (en) * | 2011-12-07 | 2013-06-12 | 清华大学 | Method and system for searching POI (Point of Interest) points of awareness map in space direction |
CN104391853A (en) * | 2014-09-25 | 2015-03-04 | 深圳大学 | POI (point of interest) recommending method, POI information processing method and server |
CN104951466A (en) * | 2014-03-28 | 2015-09-30 | 高德软件有限公司 | POI information search method, device and system and related equipment |
CN105808698A (en) * | 2016-03-03 | 2016-07-27 | 江苏大学 | Internet-of-things user query request-oriented TOP-k position point-of-interest recommendation method |
CN106844376A (en) * | 2015-12-03 | 2017-06-13 | 高德软件有限公司 | Recommend the method and device of point of interest |
CN108763538A (en) * | 2018-05-31 | 2018-11-06 | 北京嘀嘀无限科技发展有限公司 | A kind of method and device in the geographical locations determining point of interest POI |
CN111782741A (en) * | 2020-06-04 | 2020-10-16 | 汉海信息技术(上海)有限公司 | Interest point mining method and device, electronic equipment and storage medium |
CN112214488A (en) * | 2020-10-09 | 2021-01-12 | 华东师范大学 | European style spatial data index tree and construction and retrieval method |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9165074B2 (en) * | 2011-05-10 | 2015-10-20 | Uber Technologies, Inc. | Systems and methods for performing geo-search and retrieval of electronic point-of-interest records using a big index |
CN103744934A (en) * | 2013-12-30 | 2014-04-23 | 南京大学 | Distributed index method based on LSH (Locality Sensitive Hashing) |
CN108446357A (en) * | 2018-03-12 | 2018-08-24 | 浙江大学 | A kind of mass data spatial dimension querying method based on two-dimentional geographical location |
Non-Patent Citations (2)
Title |
---|
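Clustering method based on location data and POI; Liu Hui et al.; Geospatial Information; 20171130; Vol. 15 (No. 11); full text * |
Point-of-interest recommendation combining location categories and social networks; Tang Haoran et al.; Journal of Chongqing University; 20200731 (No. 07); full text * |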
Also Published As
Publication number | Publication date |
---|---|
CN112948717A (en) | 2021-06-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102395965B (en) | Method for searching objects in a database | |
Zheng et al. | Approximate keyword search in semantic trajectory database | |
Xavier et al. | A survey of measures and methods for matching geospatial vector datasets | |
US10289717B2 (en) | Semantic search apparatus and method using mobile terminal | |
CN112948717B (en) | Massive space POI searching method and system based on multi-factor constraint | |
CN112612863B (en) | Address matching method and system based on Chinese word segmentation device | |
CN104834679B (en) | A kind of expression of action trail, querying method and device | |
CN108369582B (en) | Address error correction method and terminal | |
CN110147421B (en) | Target entity linking method, device, equipment and storage medium | |
CN104657439A (en) | Generation system and method for structured query sentence used for precise retrieval of natural language | |
CN105069094B (en) | A kind of spatial key indexing means based on semantic understanding | |
JP2012524313A5 (en) | ||
CN101542475A (en) | System and method for searching and matching data with ideographic content | |
WO2021072874A1 (en) | Dual array-based location query method and apparatus, computer device, and storage medium | |
CN107766433A (en) | A kind of range query method and device based on Geo BTree | |
Dalvi et al. | Deduplicating a places database | |
CN110377684A (en) | A kind of spatial key personalization semantic query method based on user feedback | |
CN106874425A (en) | Real time critical word approximate search algorithm based on Storm | |
CN111191084B (en) | Map structure-based place name address resolution method | |
CN116303854A (en) | Positioning method and device based on address knowledge graph | |
CN112632406B (en) | Query method, query device, electronic equipment and storage medium | |
Zhai et al. | Geo-spatial query based on extended SPARQL | |
Yang et al. | Pattern-mining approach for conflating crowdsourcing road networks with POIs | |
CN102385597B (en) | The fault-tolerant searching method of a kind of POI | |
CN103294791A (en) | Extensible markup language pattern matching method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||