CN111538725A - Method and system for fast nearest-neighbor search over ten-million-scale point-like elements

Info

Publication number: CN111538725A
Application number: CN202010197150.1A
Authority: CN
Inventors: 亢晓琛; 刘纪平; 董春; 杨毅; 张用川; 孙立坚
Original assignee: Chinese Academy of Surveying and Mapping
Current assignee: Chinese Academy of Surveying and Mapping
Priority date: 2020-03-19
Filing date: 2020-03-19
Publication date: 2020-08-14
Anticipated expiration: 2040-03-19
Also published as: CN111538725B

Abstract

The application discloses a method and a system for fast nearest-neighbor search over ten-million-scale point-like elements. The method comprises the following steps: step one, traversing all point-like elements to obtain the bounding extent (the four boundary coordinates), the starting coordinate, and the number of point-like elements; step two, establishing a grid layer according to the data acquired in step one, wherein the grid layer comprises at least one grid unit; step three, constructing a task based on the latest grid units according to the application request; step four, if the point-like elements change, dynamically updating them, wherein the updating adds, deletes, or moves point-like elements on the grid layer; step five, completing task calculation and result merging, submitting the results to the service requester, and responding to the service request. The method provides a series of key technologies and a system implementation scheme for fast neighborhood search of high-density point-like elements in large-area scenes.

Description

Method and system for fast nearest-neighbor search over ten-million-scale point-like elements

Technical Field

The present application relates to the field of fast nearest-neighbor search for point-like elements, and more particularly to a method and a system for fast nearest-neighbor search over ten-million-scale point-like elements.

Background

Nearest-neighbor search is a basic method widely applied in spatial planning and daily life. In geographic information application services, one of the most common requirements is to search for other spatial objects within a given radius of a Point of Interest (POI; each POI carries four kinds of information: name, category, coordinates, and classification), or to search for the K nearest spatial objects around that position. In real-life scenes, POIs can be buildings, shops, scenic spots, and the like, and POI search meets needs such as finding restaurants, scenic spots, or toilets. As data acquisition becomes ever finer, POI entities cover more and more kinds of objects, such as garbage cans, shared bicycles, and individual trees. The searched objects may be static or dynamic. From the searcher's perspective, a search can be issued from a static position or performed dynamically, for example searching the conditions along a highway while a vehicle is traveling. In addition, nearest-neighbor search is also applied in fields such as pattern recognition, statistical classification, databases, data compression, and network marketing.

In view of the above requirements, it is necessary to provide efficient organization and management, fast querying, and diverse service interfaces for very large numbers of point-like elements.

Existing software and research algorithms support fixed-radius range search and K-nearest-neighbor queries over point-like element positions, for example the mainstream commercial software ArcGIS, the widely used K-D tree, R-tree, and locality-sensitive hashing algorithms, and popular search libraries such as ANN and FLANN. From an application point of view, the related techniques still have the following two disadvantages:

(1) The time cost of dynamically updating the data source through a tree index is high

In real life, point-like elements are dynamic, so the technical method must handle frequent operations on point objects in the data set, such as addition, deletion, and movement. Conventional tree indexes (such as the K-D tree and the R-tree) are structurally complex; frequent node splitting and node merging incur a high processing time cost, and maintaining dynamic balance delays subsequent fast queries. When managing data with a short update cycle and serving real-time search scenarios, such as searching for adjacent vehicles, a spatial index structure with low update cost that suits large-scale concurrent requests should be adopted.

(2) No parallel processing strategy is provided for large-scale data

Multi-core computers have become mainstream computing facilities; at present a single CPU can be configured with up to 64 cores and up to 128 hardware threads. Existing neighborhood search methods do not comprehensively consider how to quickly build an easily parallelized index over ultra-large-scale data so that the algorithm and the high-performance computing facility are effectively coupled. To make full use of parallel computing resources, software strategies that support parallel updates and parallel queries should be designed and implemented at the level of the technical method.

Disclosure of Invention

Based on the above, in order to support efficient organization, management, and neighborhood search services for ten-million-scale point-like element data, the application introduces, from the bottom layer to the top layer of the technical implementation, technologies such as a variable-window grid spatial index, consistent linked updating of multi-level grids, and parallelized spiral-scan search, and constructs a technical method and a real-time service system that support real-time update management of static and dynamic data, adjacency list construction, and nearest point-like element queries.

The present application discloses the following technical solutions.

A method for fast nearest-neighbor search over ten-million-scale point-like elements comprises the following steps:

step one, traversing all point-like elements to obtain the bounding extent (the four boundary coordinates), a starting coordinate, and the number of point-like elements;

step two, establishing a first grid layer according to the data obtained in step one, wherein the grid layer comprises at least one grid unit, and if the number of point-like elements in any grid unit of the grid layer exceeds a storage threshold, establishing at least one new grid layer until the number of point-like elements in every grid unit of the latest grid layer does not exceed the storage threshold;

step three, constructing a task based on the latest grid units according to an application request, wherein the task comprises fixed-radius search subtasks or K-nearest-neighbor search subtasks;

step four, if the point-like elements change, dynamically updating them, wherein the updating comprises at least one of adding, deleting, and moving point-like elements in the grid layer;

step five, performing task calculation for the fixed-radius search subtasks and the K-nearest-neighbor search subtasks determined in step three, submitting the calculation result to the service requester, and responding to the service request, wherein the task calculation computes the distance between the request position and the position of each point-like element, and the result is the set of point-like elements whose distance is less than the requested distance, or the K point-like elements with the smallest distances.

In one possible embodiment, after the first grid layer is created, a grid unit attribute list is created for the grid units, the list comprising a unit code, the number of point-like elements, and a point-like element list, wherein the unit code is composed of a level code, a horizontal code, and a vertical code, the number of point-like elements is the number of point-like elements contained in the grid unit, and the point-like element list stores, contiguously, the point-like elements contained in the grid unit.

In a possible embodiment, in step two, the point-like elements are traversed for each grid layer, the row and column numbers of the grid unit containing each point-like element are computed from its coordinate position, the unique identifier (ID) of the point-like element is added to the point-like element list of that grid unit according to the row and column numbers, and the number of point-like elements is recorded.

In a possible implementation, the task of step three comprises a task code, a task type, a subtask code, a subtask state, and dependency relations, wherein task codes are assigned sequentially in the order in which read and write tasks arrive, the task type is either update or compute, the subtask code is the grid code, the dependency relations are expressed as two-tuples whose elements are a task code and a grid code, and the subtask state is either in progress or waiting.

In a possible implementation, the fixed-radius search subtasks are constructed by obtaining, according to the search position and the fixed radius, the point-like elements in the grid units whose spatial extent overlaps the search radius;

and the K-nearest-neighbor search subtasks are constructed by starting from the grid unit containing the search position, visiting the point-like element counts of the grid units ring by ring outwards in a clockwise or anticlockwise direction, judging after each outward expansion whether the number of records so far is greater than or equal to K, and, if the condition is met, adding one further ring of grid units to form the effective grid range.

In a possible implementation, when a point-like element is added, the number of point-like elements and the point-like element list are modified according to the position of the element in the grid of each level;

when a point-like element is deleted, the number of point-like elements and the point-like element list are modified according to the position of the element in the grid of each level;

when a point-like element is moved, it is first judged whether its positions before and after the move belong to the same grid unit; if so, no processing is performed; otherwise, the element is deleted from the grid unit containing it before the move and inserted into the grid unit containing it after the move, and the numbers of point-like elements and the point-like element lists are modified accordingly.

In a possible implementation, while the data update and computing services of step four are running, an active queue method is used to order the execution of the subtasks constructed in step three. The method comprises the steps of locking, marking the state, adding to the active queue, and unlocking. The subtask is locked and checked against the active queue to determine whether a related task exists. If no related task exists, the subtask is marked as in progress and added to the active queue, and the lock is released. If related tasks exist, it is further judged whether the tasks are computing tasks: if so, the subtask is marked as in progress and added to the active queue; if not, the subtask is marked as waiting, unlocked, and added to the active queue. Within the active queue, only subtasks marked as in progress are placed into the update thread pool or the computing thread pool according to their type (update or compute).

In a possible implementation, subtasks that have passed through the update thread pool or the computing thread pool are synchronized back to the active queue. The synchronization comprises the steps of locking, marking the state, adding to the active queue, and unlocking. The subtask is locked and checked against the active queue. If the subtask exists in the active queue, it is deleted, it is judged whether other subtasks depend on it, and if so their dependency on it is released, after which the lock is released. If the subtask does not exist in the active queue, it is marked as in progress, unlocked, and added to the active queue.

In a possible embodiment, in step five, the fixed-radius search subtask calculates, one by one, the distance between the point-like elements in each grid unit and the query position, compares it with the fixed radius, selects all point-like elements whose distance is less than or equal to the fixed radius, and outputs the selected point-like elements with their distances;

and the K-nearest-neighbor search subtask calculates, one by one, the distance between the point-like elements in each grid unit and the query position, sorts the results from nearest to farthest, and outputs the first K results.

A system for fast nearest-neighbor search over ten-million-scale point-like elements, the system comprising a basic spatial information acquisition module, a grid creation module, a subtask construction module, an update scheduling module, and a request response module, wherein:

the basic spatial information acquisition module traverses all point-like elements to obtain the bounding extent (the four boundary coordinates), the starting coordinate, and the number of point-like elements;

the grid creation module creates a first grid layer according to the data acquired by the basic spatial information acquisition module, wherein the grid layer comprises at least one grid unit, and if the number of point-like elements in any grid unit in the grid layer exceeds a storage threshold value, at least one new grid layer is created until the number of point-like elements in each grid unit in the latest grid layer does not exceed the storage threshold value;

the subtask construction module constructs a task based on the latest grid units according to an application request, wherein the task comprises fixed-radius search subtasks or K-nearest-neighbor search subtasks;

the updating and scheduling module dynamically updates the point elements according to the change of the point elements, wherein the updating comprises at least one of adding, deleting and moving the point elements in the grid layer;

the request response module performs task calculation for the fixed-radius search subtasks and the K-nearest-neighbor search subtasks determined by the subtask construction module, submits the result to the service requester, and responds to the service request, wherein the task calculation computes the distance between the request position and the position of each point-like element, and the result is the set of point-like elements whose distance is less than the requested distance, or the K point-like elements with the smallest distances.

In one possible implementation, after the first grid layer is created, the grid creation module creates a grid unit attribute list for the grid units, the list comprising a unit code, the number of point-like elements, and a point-like element list, wherein the unit code is composed of a level code, a horizontal code, and a vertical code, the number of point-like elements is the number of point-like elements contained in the grid unit, and the point-like element list stores, contiguously, the point-like elements contained in the grid unit.

In a possible embodiment, the grid creation module traverses the point-like elements for each grid layer, computes the row and column numbers of the grid unit containing each point-like element from its coordinate position, adds the unique identifier (ID) of the point-like element to the point-like element list of that grid unit, and records the number of point-like elements.

In one possible embodiment, the task of the subtask construction module comprises a task code, a task type, a subtask code, a subtask state, and dependency relations, wherein task codes are assigned sequentially in the order in which read and write tasks arrive, the task type is either update or compute, the subtask code is the grid code, the dependency relations are expressed as two-tuples whose elements are a task code and a grid code, and the subtask state is either in progress or waiting.

In a possible implementation, the fixed-radius search subtasks in the subtask construction module are constructed by obtaining, according to the search position and the fixed radius, the point-like elements in the grid units whose spatial extent overlaps the search radius.

In a possible implementation, when a point-like element is added, the update scheduling module modifies the number of point-like elements and the point-like element list according to the position of the element in the grid of each level.

In a possible implementation, while the data update and computing services in the update scheduling module are running, an active queue method is used to order the execution of the subtasks. The method comprises the steps of locking, marking the state, adding to the active queue, and unlocking. The subtask is locked and checked against the active queue to determine whether a related task exists. If no related task exists, the subtask is marked as in progress and added to the active queue, and the lock is released. If related tasks exist, it is further judged whether the tasks are computing tasks: if so, the subtask is marked as in progress and added to the active queue; if not, the subtask is marked as waiting, unlocked, and added to the active queue. Within the active queue, only subtasks marked as in progress are placed into the update thread pool or the computing thread pool according to their type (update or compute).

In a possible implementation, the update scheduling module synchronizes subtasks that have passed through the update thread pool or the computing thread pool back to the active queue. The synchronization comprises the steps of locking, marking the state, adding to the active queue, and unlocking. The subtask is locked and checked against the active queue. If the subtask exists in the active queue, it is deleted, it is judged whether other subtasks depend on it, and if so their dependency on it is released, after which the lock is released. If the subtask does not exist in the active queue, it is marked as in progress, unlocked, and added to the active queue.

In a possible implementation, the fixed-radius search subtask in the request response module calculates, one by one, the distance between the point-like elements in each grid unit and the query position, compares it with the fixed radius, selects all point-like elements whose distance is less than or equal to the fixed radius, and outputs the selected point-like elements with their distances.

Advantageous effects

The method and system for fast nearest-neighbor search over ten-million-scale point-like elements disclosed in the application have the following beneficial effects: the invention provides a series of key technologies and a system implementation scheme for fast neighborhood search of high-density point-like elements in large-area scenes, decomposing a complex business process into five closely related core steps. On the basis of a multi-level grid index, it provides efficient access to and management of large-scale point-like element data; by cooperatively scheduling the two kinds of tasks, data update and query computation, it provides a real-time update service for large-scale point-like element data, makes full use of the parallel computing capacity of multi-core CPUs, and shortens the service response time of concurrent search requests.

Under the technical scheme provided by the invention, with the system deployed on a workstation with a dual-socket 40-core CPU configuration and a data set of more than 19 million point-like elements, about 1600 neighborhood search computations can be processed and answered per second on average, with an average of no fewer than 100 points returned per search.

Drawings

The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining and illustrating the present application and should not be construed as limiting the scope of the present application.

FIG. 1 is a flow chart of the method of the present invention;

FIG. 2 is a block diagram of the architecture of the system of the present invention;

FIG. 3 shows the grid unit attributes and management methods of the present invention;

FIG. 4 is a schematic diagram of the creation of a new mesh in accordance with the present invention;

FIG. 5 is a schematic of the K-neighbor search and the fixed radius search of the present invention;

FIG. 6 is a flowchart of the update and compute task scheduling of the present invention.

Detailed Description

In order to make the objectives, technical solutions, and advantages of the present application clearer, the technical solutions in the embodiments of the present application are described in more detail below with reference to the drawings in the embodiments of the present application.

The following is an embodiment of a fast neighborhood search method for ten-million-scale point-like element data. As shown in FIG. 1, the method includes five core steps, as follows:

Step one: traverse all point-like element data to obtain the bounding extent (Xmin, Ymin, Xmax, Ymax), the starting coordinate (Xo, Yo), and the number of elements. If the data set already stores the bounding extent and the attribute table information, this information can be read directly. Traversal refers to visiting each node in a tree (or graph) once, in turn, along a search route. The operation performed on a visited node depends on the application; it may be checking the value of the node, updating the value of the node, and so on. Different traversal methods visit the nodes in different orders. Traversal is one of the most important operations on a binary tree and is the basis for the other binary tree operations. The concept of traversal also applies to multi-element collections such as arrays.
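As an illustration of step one, the following minimal Python sketch computes the bounding extent, a starting coordinate, and the element count in a single pass. The function name `scan_extent` is hypothetical, and taking the lower-left corner of the extent as (Xo, Yo) is an assumption; the source does not state how the starting coordinate is chosen.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def scan_extent(points: Iterable[Point]):
    """Return ((xmin, ymin, xmax, ymax), (xo, yo), count) in one pass."""
    xmin = ymin = float("inf")
    xmax = ymax = float("-inf")
    count = 0
    for x, y in points:
        xmin, xmax = min(xmin, x), max(xmax, x)
        ymin, ymax = min(ymin, y), max(ymax, y)
        count += 1
    if count == 0:
        raise ValueError("empty point set")
    # Assumption: the starting coordinate is the lower-left corner.
    return (xmin, ymin, xmax, ymax), (xmin, ymin), count
```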

Step two: a first grid layer is created from the bounding extent and the starting coordinate obtained in step one. Starting from the starting coordinate, all grid units are coded horizontally first and then vertically, forming codes of the form "level code # horizontal grid code # vertical grid code", with the grid code length increasing by one digit per layer. Each grid unit holds the number of elements it currently contains and the corresponding list of element identifiers. The element count and the identifier list are managed through traversal, addition, and deletion methods.

The attributes and methods of a grid unit are shown in FIG. 3.
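The grid unit described above could be modeled roughly as follows. This is a sketch under assumptions: the class name `GridCell` and its field names are hypothetical, the count is derived from the list length rather than stored separately, and the one-digit-per-layer code length rule is not modeled.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class GridCell:
    level: int
    col: int  # horizontal grid code
    row: int  # vertical grid code
    id_list: List[int] = field(default_factory=list)  # IDLIST

    @property
    def code(self) -> str:
        # "level # horizontal # vertical", horizontal first.
        return f"{self.level}#{self.col}#{self.row}"

    @property
    def count(self) -> int:
        # COUNT, derived here so it cannot drift from IDLIST.
        return len(self.id_list)

    def add(self, element_id: int) -> None:
        self.id_list.append(element_id)

    def delete(self, element_id: int) -> None:
        self.id_list.remove(element_id)

    def traverse(self) -> Iterator[int]:
        return iter(self.id_list)
```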

As shown in FIG. 4, the cell size (S) of the first grid layer is 90 km, and each cell is subdivided 3 × 3 at the next layer, so the cell sizes of the lower layers are 30 km, 10 km, and so on. To ensure calculation accuracy, values can be kept to 7 decimal places.

After the first grid layer is created, traversal of the data begins. The column and row numbers (C, R) of the corresponding grid unit are calculated from the coordinates (X, Y) of each point-like element. According to C and R, the unique identifier (ID) of the point-like element is added to IDLIST, and the element count COUNT is updated.
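The size progression can be written down directly. The sketch below assumes levels are numbered from 1 and uses a hypothetical helper name, `cell_size`; the 90 km first-layer size, the factor of 3, and the 7 decimal places come from the text.

```python
def cell_size(level: int, first_size_km: float = 90.0, factor: int = 3) -> float:
    """Cell size in km at the given level (level 1 is the first grid layer)."""
    if level < 1:
        raise ValueError("level starts at 1")
    # Keep 7 decimal places, as suggested in the description.
    return round(first_size_km / factor ** (level - 1), 7)
```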

C = INT((X - Xo) / S), R = INT((Y - Yo) / S)

Note: INT denotes taking the integer part (truncation); the value is not rounded to the nearest integer.
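A possible rendering of the column and row calculation in Python follows. The function name `cell_index` is hypothetical. The sketch uses `math.floor`, which coincides with truncation whenever the point lies at or beyond the starting coordinate (non-negative offsets), the case the formula is written for.

```python
import math


def cell_index(x: float, y: float, xo: float, yo: float, size: float):
    """Return (C, R) for a point, given the starting coordinate and cell size S."""
    c = math.floor((x - xo) / size)
    r = math.floor((y - yo) / size)
    return c, r
```

For example, with S = 90 and the starting coordinate at (0, 0), the point (95, 185) falls in column 1, row 2.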

After the first round of data traversal is finished, whether to create and fill a new grid layer is decided according to whether any COUNT in the grid index still exceeds the threshold. The storage threshold of a single grid unit is 500 by default.

Note: step two is an iterative process. To speed up creation of the grid index, several grid levels can be created at once and then filled in a single data traversal.
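The iteration in step two might look like the sketch below, which adds one layer per pass until no cell exceeds the threshold. The names `build_levels` and `max_levels` are hypothetical; the `max_levels` guard is an added assumption so that coincident points cannot cause endless subdivision, and sparse dictionaries stand in for whatever storage the actual system uses.

```python
import math
from collections import defaultdict


def build_levels(points, origin, first_size, threshold=500, factor=3, max_levels=8):
    """Return [(cell_size, {(C, R): [point ids]}), ...], one entry per grid layer."""
    xo, yo = origin
    size = first_size
    levels = []
    while True:
        cells = defaultdict(list)
        for pid, (x, y) in enumerate(points):
            key = (math.floor((x - xo) / size), math.floor((y - yo) / size))
            cells[key].append(pid)
        levels.append((size, dict(cells)))
        fullest = max(map(len, cells.values()), default=0)
        # Stop when every cell fits, or when the assumed depth limit is hit.
        if fullest <= threshold or len(levels) >= max_levels:
            return levels
        size /= factor
```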

Step three: a complete task comprises five items: task code, task type, subtask code, subtask state, and dependency relations. Task codes are assigned sequentially in order of arrival: 1, 2, 3, and so on; the task type is either update or compute; the subtask code is the grid code; the dependency relations are expressed as two-tuples <task code, grid code>, and there may be several of them; the subtask state is either "in progress" or "waiting".

For an update task, the task code and task type are fixed, and the task comprises as many subtasks as there are grid levels (to keep the levels consistent). The subtask state and dependencies are determined by checking whether the active queue contains other subtasks on the current grid unit of type update or compute.

As shown in FIG. 5, computing tasks are divided into fixed-radius search and K-nearest-neighbor search, and their grid-unit-based subtasks are constructed in the following two ways.
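The five items of a task can be captured in a small record type. The sketch below is illustrative only; the class and field names (`SubTask`, `TaskType`, `State`, `depends_on`) are hypothetical and not taken from the source.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class TaskType(Enum):
    UPDATE = "update"
    COMPUTE = "compute"


class State(Enum):
    IN_PROGRESS = "in progress"
    WAITING = "waiting"


@dataclass
class SubTask:
    task_code: int       # sequential, in order of arrival
    task_type: TaskType
    grid_code: str       # the subtask code is the grid code
    state: State = State.WAITING
    # Dependencies as <task code, grid code> pairs; there may be several.
    depends_on: List[Tuple[int, str]] = field(default_factory=list)
```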

(1) Fixed radius search task

A bounding rectangle is determined from the search position and the fixed radius, and the grid units overlapping it in space are obtained, from which several subtasks are constructed. The subtasks share the same task code and task type but have different grid codes. The subtask state and dependencies are determined by checking whether the active queue contains other subtasks on the current grid unit of type update. If so, the state is marked "waiting"; otherwise it is marked "in progress".
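The grid units overlapped by the bounding rectangle can be enumerated as in the following sketch. The function name is hypothetical, and the sketch does not clip the result to the extent of the grid, which a real implementation would presumably do.

```python
import math


def cells_overlapping_radius(qx, qy, radius, xo, yo, size):
    """List the (C, R) cells that overlap the square bounding the search circle."""
    c0 = math.floor((qx - radius - xo) / size)
    c1 = math.floor((qx + radius - xo) / size)
    r0 = math.floor((qy - radius - yo) / size)
    r1 = math.floor((qy + radius - yo) / size)
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]
```

Each returned cell would become one subtask sharing the task code and type of the request.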

(2) K-neighbor search task

Starting from the grid unit containing the search position, the point-like element counts of the grid units are visited ring by ring outwards in a clockwise (or anticlockwise) direction. After each outward expansion, it is judged whether the number of records so far is greater than or equal to K; if the condition is met, one further ring of grid units is added to form the effective grid range.
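The ring-by-ring expansion can be sketched as follows. The names `ring_cells` and `effective_ring`, the `max_ring` cap, and the convention that rows increase upwards are assumptions made for illustration; the extra ring follows the text and is not claimed here to guarantee exactness in every geometry.

```python
def ring_cells(cc, cr, n):
    """Cells at ring n around (cc, cr), clockwise from the top-left corner."""
    if n == 0:
        return [(cc, cr)]
    top = [(c, cr + n) for c in range(cc - n, cc + n)]
    right = [(cc + n, r) for r in range(cr + n, cr - n, -1)]
    bottom = [(c, cr - n) for c in range(cc + n, cc - n, -1)]
    left = [(cc - n, r) for r in range(cr - n, cr + n)]
    return top + right + bottom + left


def effective_ring(counts, center, k, max_ring):
    """Return the outermost ring index of the effective grid range.

    counts maps (C, R) to the COUNT of that cell; missing cells count as 0.
    """
    cc, cr = center
    total = 0
    for n in range(max_ring + 1):
        total += sum(counts.get(cell, 0) for cell in ring_cells(cc, cr, n))
        if total >= k:
            # Condition met: add one further ring, as the text prescribes.
            return min(n + 1, max_ring)
    return max_ring
```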

Step four: if the point-like elements change, they are dynamically updated; the relationship between updating and compute task scheduling is shown in FIG. 6.

Once the multi-level grid index has been built, it should be updated dynamically whenever the data change. Data updates can be completed in parallel by multiple threads. They fall into three categories: addition, deletion, and movement.

When a point-like element is added, COUNT and IDLIST are modified according to its position in the grid of each level. If concurrent threads modify, or simultaneously read and write, the same grid unit, access should be coordinated so that they enter in sequence.

When a point-like element is deleted, COUNT and IDLIST are modified according to its position in the grid of each level. The procedure is the same as for addition.

When a point-like element is moved, it is first determined whether its positions before and after the move belong to the same grid unit. If they do, no processing is performed. Otherwise, the element is deleted from the grid unit containing it before the move and then inserted into the grid unit containing it after the move.
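The three update operations could be sketched as below. The class `MultiLevelGrid` and its methods are hypothetical, sets stand in for IDLIST, and the sketch is single-threaded; the coordination between concurrent threads that the text calls for is deliberately left out.

```python
import math
from collections import defaultdict


class MultiLevelGrid:
    def __init__(self, origin, sizes):
        self.origin = origin
        self.sizes = list(sizes)  # cell size per level
        self.levels = [defaultdict(set) for _ in self.sizes]

    def _cell(self, level, x, y):
        xo, yo = self.origin
        s = self.sizes[level]
        return math.floor((x - xo) / s), math.floor((y - yo) / s)

    def add(self, pid, x, y):
        # An update touches the matching cell on every level.
        for lv in range(len(self.sizes)):
            self.levels[lv][self._cell(lv, x, y)].add(pid)

    def delete(self, pid, x, y):
        for lv in range(len(self.sizes)):
            self.levels[lv][self._cell(lv, x, y)].discard(pid)

    def move(self, pid, old, new):
        for lv in range(len(self.sizes)):
            src, dst = self._cell(lv, *old), self._cell(lv, *new)
            if src == dst:
                continue  # same grid unit on this level: nothing to do
            self.levels[lv][src].discard(pid)
            self.levels[lv][dst].add(pid)

    def count(self, level, cell):
        return len(self.levels[level].get(cell, ()))
```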

In a possible implementation, while the data update and computing services are running, simultaneous writes, or simultaneous reads and writes, to the same grid unit or to hierarchically dependent units should be avoided. While the data update and computing services of step four are running, an active queue method is used to order the execution of the subtasks constructed in step three. The method comprises the steps of locking, marking the state, adding to the active queue, and unlocking. The subtask is locked and checked against the active queue to determine whether a related task exists. If no related task exists, the subtask is marked as in progress and added to the active queue, its processing is started at the same time, and the lock is released. If related tasks exist, it is further judged whether the tasks are computing tasks: if so, the subtask is marked as in progress and added to the active queue; if not, the subtask is marked as waiting, unlocked, and added to the active queue. Within the active queue, only subtasks marked as in progress are placed into the update thread pool or the computing thread pool according to their type (update or compute).

In one possible implementation, subtasks that have passed through the update thread pool or the computing thread pool are synchronized back to the active queue, in one of two modes: "logout" and "activation". The synchronization comprises the steps of locking, marking the state, adding to the active queue, and unlocking. The subtask is locked and checked against the active queue. If the subtask exists in the active queue, it is deleted, it is judged whether other subtasks depend on it, and if so their dependency on it is released, after which the lock is released. If the subtask does not exist in the active queue, it is marked as in progress, unlocked, and added to the active queue.
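One way to read the active queue mechanism is sketched below. It is an interpretation, not the patented procedure: it assumes that two subtasks are related when they share a grid code, that two compute subtasks never block each other, and that any pairing involving an update does. The class names, the single queue-wide lock, and the `complete` method (standing in for "logout" followed by "activation") are all hypothetical, and dispatch to the thread pools is omitted.

```python
import threading
from dataclasses import dataclass, field


@dataclass(eq=False)
class SubTask:
    task_code: int
    task_type: str   # "update" or "compute"
    grid_code: str
    state: str = "waiting"
    depends_on: set = field(default_factory=set)  # {(task_code, grid_code)}


class ActiveQueue:
    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def submit(self, sub):
        """Lock, mark the state, add to the queue, unlock."""
        with self._lock:
            sub.depends_on = {
                (o.task_code, o.grid_code)
                for o in self._items
                if o.grid_code == sub.grid_code
                and "update" in (o.task_type, sub.task_type)
            }
            sub.state = "waiting" if sub.depends_on else "in progress"
            self._items.append(sub)
            return sub.state

    def complete(self, sub):
        """Log a finished subtask out and activate subtasks it was blocking."""
        with self._lock:
            self._items.remove(sub)
            key = (sub.task_code, sub.grid_code)
            activated = []
            for o in self._items:
                if key in o.depends_on:
                    o.depends_on.discard(key)
                    if not o.depends_on:
                        o.state = "in progress"
                        activated.append(o)
            return activated
```

Only subtasks whose state is "in progress" would then be handed to the update or computing thread pool.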

The size of the active queue can be set as needed. The default size is 100 and the maximum should not exceed 1000, because an excessively large queue increases the cost of dependency detection. The combined size of the update thread pool and the computing thread pool should not exceed the total number of CPU cores (or the number of core threads); by default the two pools are configured in a ratio of 1:4, respectively.
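As a rough illustration of these defaults, the sketch below splits the available cores between the two pools in a 1:4 ratio and clamps the queue size to the stated limits. The function names are hypothetical, and the decision to give each pool at least one worker (so that a single-core machine still gets two workers in total) is an assumption not made in the text.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def clamp_queue_size(requested=100):
    # Default 100, never above 1000, never below 1.
    return min(max(1, requested), 1000)


def pool_sizes(total_cores, update_share=1, compute_share=4):
    # Assumption: each pool gets at least one worker.
    total_cores = max(2, total_cores)
    update = max(1, total_cores * update_share // (update_share + compute_share))
    compute = max(1, total_cores - update)
    return update, compute


def make_pools(total_cores=None):
    update_n, compute_n = pool_sizes(total_cores or os.cpu_count() or 2)
    return (
        ThreadPoolExecutor(max_workers=update_n, thread_name_prefix="update"),
        ThreadPoolExecutor(max_workers=compute_n, thread_name_prefix="compute"),
    )
```

For example, with 10 cores this yields 2 update workers and 8 computing workers.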

Based on thread pool technology and grid task units, low-cost active queue operations provide consistent scheduling of subtask codes, states, and dependency relationships. This enables fine-grained concurrent processing of data updating and distance calculation tasks in a multi-core CPU environment, avoids the high time consumption of task-level operations, and supports online management and real-time query services for large-scale point-like elements.

The execution result of an update task is applied directly to the data and the hierarchical grid index, providing the basis for subsequent query calculation. The distance is calculated as follows:

D = sqrt((X - X_T)^2 + (Y - Y_T)^2)

Note: (X_T, Y_T) is the search position, (X, Y) is the position of a point-like element in the grid, and sqrt is the square root function.

For a fixed-radius search request, the distance between each point-like element in each grid unit and the query position is calculated one by one and compared with the fixed radius, and the element identifier and corresponding distance are recorded for every point-like element whose distance is less than or equal to the fixed radius. After all subtasks of the same calculation task have been calculated, the selected point-like elements and their corresponding distances are used as the output result.

For a K-nearest-neighbor search request, the calculation results of the grid units are directly concatenated, sorted by distance, and the first K results are extracted as output.
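The two result-combination rules can be sketched as below. The data layout is an assumption made for illustration: each grid unit is a list of `(element_id, x, y)` tuples, and each per-unit result is a list of `(element_id, distance)` pairs. For the K-nearest case the sketch uses `heapq.nsmallest`, which gives the same output as sorting the concatenated list and taking the first K entries.

```python
import heapq
import math


def radius_search(cells, query, radius):
    """Keep every element whose distance to the query is <= radius."""
    qx, qy = query
    hits = []
    for cell in cells:                    # one grid unit per subtask
        for elem_id, x, y in cell:
            d = math.hypot(x - qx, y - qy)   # D = sqrt(dx^2 + dy^2)
            if d <= radius:
                hits.append((elem_id, d))
    return hits


def k_nearest(cell_results, k):
    """Concatenate per-unit results and return the K smallest distances."""
    merged = [pair for result in cell_results for pair in result]
    return heapq.nsmallest(k, merged, key=lambda pair: pair[1])
```

In a concurrent setting each grid unit would be processed by its own subtask, and the merge would run once all subtasks of the same calculation task have finished.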

The following is an embodiment of a fast nearest-neighbor search system for ten-million-level point-like element data. As shown in fig. 2, the system of the present invention includes five core modules, as follows:

a basic space information acquisition module, a grid creation module, a subtask construction module, an update scheduling module, and a request response module.

The basic space information acquisition module is used for traversing all the point-like element data to obtain the bounding extent (Xmin, Ymin, Xmax, Ymax), the origin coordinates (Xo, Yo), and the number of elements. If the data set itself already stores the bounding extent and the attribute table, the relevant information can be obtained directly. Traversal refers to visiting each node in a tree (or graph) once, in turn, along a search route. The operation performed on a visited node depends on the specific application; it may be, for example, checking or updating the value of the node. Different traversal methods visit the nodes in different orders. Traversal is one of the most important operations on a binary tree and is the basis for other binary-tree operations. The concept of traversal also applies to multi-element collections such as arrays.
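A single-pass traversal of this kind might look like the following sketch. It assumes the point-like elements are available as an iterable of `(x, y)` pairs, and it takes the lower-left corner of the extent as the origin (Xo, Yo); the source does not state how the origin is chosen, so that choice is an assumption.

```python
def extent_and_count(points):
    """Return ((xmin, ymin, xmax, ymax), origin, count) in one pass."""
    xmin = ymin = float("inf")
    xmax = ymax = float("-inf")
    count = 0
    for x, y in points:
        xmin, xmax = min(xmin, x), max(xmax, x)
        ymin, ymax = min(ymin, y), max(ymax, y)
        count += 1
    if count == 0:
        raise ValueError("no point-like elements")
    origin = (xmin, ymin)   # assumption: origin is the lower-left corner
    return (xmin, ymin, xmax, ymax), origin, count
```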

The first-layer grid can be established from the bounding extent and the origin coordinates obtained by the basic space information acquisition module. Starting from the origin coordinates, the grid units are coded horizontally first and then vertically, forming codes of the form level code # horizontal grid code # vertical grid code, with the grid code length increasing by one digit per layer. Each grid unit should contain the number of elements it currently holds and the corresponding list of element identifiers. The element count and the element identifier list are managed through traversal, add, and delete methods.

The attributes and methods of a grid unit are shown in fig. 3.

C = INT((X - Xo)/S), R = INT((Y - Yo)/S)

Note: INT denotes taking the integer part; the value is truncated, not rounded to the nearest integer.
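The index formula can be illustrated with the short sketch below. It assumes that C and R are the column and row numbers of the grid unit, that S is the side length of a grid unit, and that coordinates are not smaller than the origin (so truncation and floor agree). The `level#column#row` string is a simplified stand-in for the hierarchical code; the layer-dependent code length is not modelled.

```python
def cell_index(x, y, xo, yo, s):
    """Column and row of the grid unit containing (x, y)."""
    # int() truncates toward zero, matching INT for x >= xo and y >= yo.
    return int((x - xo) / s), int((y - yo) / s)


def cell_code(level, x, y, xo, yo, s):
    """Hypothetical code: level # horizontal code # vertical code."""
    c, r = cell_index(x, y, xo, yo, s)
    return f"{level}#{c}#{r}"
```

For instance, with origin (0, 0) and S = 100, the point (250.0, 999.9) falls in column 2, row 9.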

Note: the grid creation module is an iterative module. To accelerate creation of the grid index, multiple hierarchical grids can be created at one time and then filled in a single data traversal.

(1) Fixed radius search task

(2) K-neighbor search task

The update scheduling module is used for dynamically updating the point-like elements when they change. The relationship between updating and task scheduling calculation is shown in fig. 6.

In a possible implementation, simultaneous writing, or simultaneous reading and writing, of the same grid unit or of hierarchically dependent units should be avoided while the data updating and computing services are running. The data updating and computing service operation in the update scheduling module uses an active queue method to order the execution of the subtasks produced by the subtask construction module. The method comprises the steps of locking, marking the state, adding to the active queue, and unlocking. A subtask is first locked, and the active queue is searched for tasks related to it. If no related task exists, the locked subtask is marked as in progress, added to the active queue, and dispatched for processing, after which the lock is released. If a related task exists, it is further determined whether the subtask needs to be calculated: if so, it is marked as in progress and added to the active queue; if not, it is marked as waiting, unlocked, and added to the active queue. Within the active queue, only the subtasks marked as in progress are placed into the update thread pool or the computing thread pool, according to whether they are update or computing subtasks.

D = sqrt((X - X_T)^2 + (Y - Y_T)^2)

The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A fast nearest-neighbor search method for ten-million-level point-like elements, characterized in that the method comprises the following steps:

step one, traversing all the point-like elements to obtain the bounding extent, the origin coordinates, and the number of point-like elements;

2. The method of claim 1, wherein: in the second step, dot-shaped elements in each grid layer are traversed, the row and column numbers of the grids are corresponding to the coordinate positions of the dot-shaped elements, the row and column numbers serve as the unique identification IDs of the dot-shaped elements, the unique identification IDs of the dot-shaped elements are put into a dot-shaped element list, and the number of the dot-shaped elements is recorded.

3. The method of claim 1 or 2, wherein: in the third step, the fixed radius search subtask is constructed by a method of obtaining point-like elements in the grid overlapped with the radius in the space range according to the fixed radius of the searched position;

4. The method of claim 1, wherein: during the data updating and computing service operation in step four, the operation sequence of the subtasks constructed in step three is processed by an active queue method, the method comprising the steps of locking, marking the state, adding to the active queue, and unlocking, wherein a subtask is locked and the active queue is searched for tasks related to it; if no related task exists, the locked subtask is marked as in progress and added to the active queue, and the lock is released; if a related task exists, it is further determined whether the subtask needs to be calculated, and if so, the subtask is marked as in progress and added to the active queue, and if not, the subtask is marked as waiting, unlocked, and added to the active queue; and in the active queue, only the subtasks marked as in progress are placed into the update thread pool or the computing thread pool according to whether they are of the update or the computing type.

5. The method of claim 1, wherein: in the step five, the searching subtask with the fixed radius takes the screened point-like elements and the corresponding distances as output results in a mode of calculating the distances between the point-like elements in each grid unit and the query position one by one and comparing the distances with the fixed radius and screening all the point-like elements with the fixed radius or less;

6. A nearest neighbor fast search system facing ten million-level point-like elements is characterized in that: the system comprises: a basic space information acquisition module, a grid creation module, a subtask construction module, an update scheduling module and a request response module,

7. The system of claim 6, wherein: the grid creating module traverses the dot-shaped elements in each grid layer, and the row and column numbers of the grids corresponding to the coordinate positions of the dot-shaped elements are used as the unique identification IDs of the dot-shaped elements, the unique identification IDs of the dot-shaped elements are put into a dot-shaped element list, and the number of the dot-shaped elements is recorded.

8. The system of claim 6 or 7, wherein: the fixed radius search subtask in the subtask construction module is constructed by a method of obtaining point-like elements in a grid which is overlapped with the radius in a space range according to the fixed radius of the searched position;

9. The system of claim 6, wherein: during the data updating and computing service operation in the update scheduling module, the operation sequence of the subtasks is processed by an active queue method, the method comprising the steps of locking, marking the state, adding to the active queue, and unlocking, wherein a subtask is locked and the active queue is searched for tasks related to it; if no related task exists, the locked subtask is marked as in progress and added to the active queue, and the lock is released; if a related task exists, it is further determined whether the subtask needs to be calculated, and if so, the subtask is marked as in progress and added to the active queue, and if not, the subtask is marked as waiting, unlocked, and added to the active queue; and in the active queue, only the subtasks marked as in progress are placed into the update thread pool or the computing thread pool according to whether they are of the update or the computing type.

10. The system of claim 6, wherein: the search subtask with the fixed radius in the request response module selects all the point-like elements with the radius smaller than or equal to the fixed radius by calculating the distance between the point-like elements in each grid unit and the query position one by one and comparing the distance with the fixed radius, and takes the selected point-like elements and the corresponding distances as output results;