CN111046126A - Geographic position retrieval method - Google Patents

Publication number
CN111046126A
Authority
CN
China
Prior art keywords
latitude
longitude
payload
docvalues
morton
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911307381.7A
Other languages
Chinese (zh)
Inventor
母延年
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Luxin Software Technology Co ltd
Original Assignee
Nanjing Luxin Software Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-12-18
Filing date
2019-12-18
Publication date
2020-04-21
Application filed by Nanjing Luxin Software Technology Co ltd
Priority to CN201911307381.7A
Publication of CN111046126A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Remote Sensing (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of computer algorithms and discloses a geographic position retrieval method comprising the following steps. S1: obtaining the Morton codes of longitude and latitude and storing them in DocValues in Lucene; S2: migrating the longitude and latitude values of the Morton codes from DocValues into Payload; S3: reading the values from Payload during the secondary verification and clipping of longitude and latitude; S4: completing the position retrieval. The method replaces the random reads of DocValues with Payload, which is read sequentially because it is stored in the inverted index. Data for the same geographic position are stored together, and building contiguous data in this way reduces the number of random reads, which greatly improves the retrieval and matching speed for geographic positions.

Description

Geographic position retrieval method
Technical Field
The invention relates to the technical field of computer algorithms, in particular to a geographic position retrieval method.
Background
Geographic position retrieval services are everywhere in daily life, from small-scale uses such as bike sharing and Amap (Gaode Map) to large-scale uses such as flight route tracking. Many of the related functions can be implemented with GeoHash; Lucene/Solr can also use GeoHash to build indexes, query indexes, calculate distances, and so on. However, the GeoHash algorithm uses a Peano space-filling curve. Although this curve converts a two-dimensional space into a one-dimensional curve, its greatest drawback is abrupt jumps: some codes are adjacent yet the positions they represent are far apart, for example 0111 and 1000. The Morton code is a scheme that can replace GeoHash. It addresses shortcomings of some existing models, effectively solving the shape degradation of the traditional longitude-latitude grid model at high latitudes and the irregular patch shapes of regular-polyhedron grid models. The Morton code supports conversion between patch codes and traditional geographic coordinates as well as the calculation of adjacency relations, overcoming the defects that the irregularity of the earth and changes in latitude cause in the GeoHash algorithm. The Morton code converts multidimensional data into a one-dimensional code, and the precision retained from the multidimensional data is determined by the number of bits in the one-dimensional code. It is a commonly used compression coding method, used in particular as a mapping algorithm for hash tables and the like, and it speeds up the storage of and access to tree-structured data.
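As an illustration of how a Morton code turns a longitude/latitude pair into a single one-dimensional code, the following is a minimal Python sketch. It is not taken from the patent or from Lucene: the function names, the 16-bit default precision, and the choice of putting longitude in the even bit positions are all assumptions made for the example.

```python
def quantize(value, lo, hi, bits=16):
    """Map a float in [lo, hi] to an integer in [0, 2**bits - 1] (truncating)."""
    scale = (1 << bits) - 1
    return int((value - lo) / (hi - lo) * scale)


def interleave(x, y, bits=16):
    """Interleave bits: x goes to even positions, y to odd positions."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def deinterleave(code, bits=16):
    """Inverse of interleave: recover (x, y) from a Morton code."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y


def morton_encode(lon, lat, bits=16):
    """Encode a longitude/latitude pair as one integer (hypothetical layout)."""
    return interleave(quantize(lon, -180.0, 180.0, bits),
                      quantize(lat, -90.0, 90.0, bits), bits)


def morton_decode(code, bits=16):
    """Decode back to the lower corner of the grid cell that holds the point."""
    scale = (1 << bits) - 1
    x, y = deinterleave(code, bits)
    return -180.0 + x / scale * 360.0, -90.0 + y / scale * 180.0
```

With `bits=16` per dimension, the decoded longitude and latitude differ from the input by less than one grid step (about 0.0055 degrees of longitude), which reflects the statement above that the retained precision depends on the number of code bits.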
Lucene's sandbox package supports geographic position retrieval, and its default implementation supports retrieval over square, circular, and polygonal areas. The current Lucene extension package also supports geographic position retrieval, implemented on the basis of Morton codes. Because a Morton code can only express a square area, when Lucene searches a circular or polygonal area it first screens out a rough range based on the Morton code and then performs secondary verification and clipping on each screened record to achieve an exact match.
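The two-stage pattern described above (a coarse screen followed by secondary verification) can be sketched in Python as follows. This is a simplified illustration, not Lucene's implementation: a plain bounding box stands in for the Morton-based coarse screen, the haversine formula stands in for the exact check, and the box calculation is an approximation that assumes a small radius away from the poles and the antimeridian. All names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bounding_box(lon, lat, radius_m):
    """Approximate box around a circle (small radii, mid latitudes only)."""
    dlat = math.degrees(radius_m / EARTH_RADIUS_M)
    dlon = dlat / math.cos(math.radians(lat))
    return lon - dlon, lat - dlat, lon + dlon, lat + dlat


def circle_search(points, lon, lat, radius_m):
    """Return (coarse candidates, exactly matching documents)."""
    min_lon, min_lat, max_lon, max_lat = bounding_box(lon, lat, radius_m)
    # Stage 1: coarse screen on a square-like area (stand-in for the Morton range).
    coarse = [doc for doc, (plon, plat) in points.items()
              if min_lon <= plon <= max_lon and min_lat <= plat <= max_lat]
    # Stage 2: secondary verification clips candidates that fall outside the circle.
    exact = [doc for doc in coarse
             if haversine_m(lon, lat, *points[doc]) <= radius_m]
    return coarse, exact
```

A point near a corner of the box passes the coarse screen but is clipped by the second stage, which is why every screened record needs its longitude and latitude read back during verification.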
Currently, Lucene implements the secondary verification with DocValues. A DocValues field is a column-oriented storage field; DocValues were introduced in Lucene 4.0 and are a forward index, that is, they store the forward mapping from document number to field value. Implementing secondary verification and clipping on DocValues incurs a large amount of random I/O, and when the number of hit records is large, overall geographic position retrieval performance is very poor.
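To make the forward, column-oriented layout concrete, here is a toy Python model of a column that stores one fixed-width value per document, ordered by document number. It is only a sketch: real DocValues are compressed and block-encoded, and the class name, the 8-byte width, and the sample values are assumptions for illustration.

```python
import struct


class ForwardColumn:
    """Toy column: one fixed-width 8-byte value per document, ordered by doc ID."""

    WIDTH = 8

    def __init__(self, codes):
        # Pack every value as a big-endian unsigned 64-bit integer.
        self.data = b"".join(struct.pack(">Q", c) for c in codes)

    def offset(self, doc_id):
        """Byte offset of a document's value inside the column."""
        return doc_id * self.WIDTH

    def get(self, doc_id):
        """Forward lookup: document number -> field value."""
        return struct.unpack_from(">Q", self.data, self.offset(doc_id))[0]


# Hypothetical data: 100,000 documents whose stored code is doc_id * 10.
column = ForwardColumn([c * 10 for c in range(100_000)])
hits = [7, 40_321, 512, 99_998]            # documents that passed the coarse screen
offsets = [column.offset(d) for d in hits]  # [56, 322568, 4096, 799984]
values = [column.get(d) for d in hits]
```

Because the position of each value depends only on the document number, hits that are far apart in document order land at byte offsets that are far apart in the column, which is the source of the random I/O described above.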
Disclosure of Invention
The present invention aims to provide a geographic position retrieval method that solves the problems described in the Background above.
To achieve this purpose, the invention provides the following technical solution: a geographic position retrieval method comprising the following steps:
S1: obtaining the Morton codes of longitude and latitude and storing them in DocValues in Lucene;
S2: migrating the longitude and latitude values of the Morton codes from DocValues into Payload;
S3: reading the values from Payload during the secondary verification and clipping of longitude and latitude;
S4: completing the position retrieval.
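Steps S1 to S4 can be sketched with a toy in-memory model in Python. This is an illustrative assumption about the data flow, not the patented implementation and not Lucene's API: a dictionary stands in for DocValues, a list of `(doc_id, payload_bytes)` pairs per term stands in for a posting list with payloads, and the cell labels and function names are hypothetical. For simplicity the payload here carries the raw longitude and latitude as two doubles.

```python
import struct
from collections import defaultdict


def pack_point(lon, lat):
    """Serialize a point as 16 payload bytes (two big-endian doubles)."""
    return struct.pack(">dd", lon, lat)


def unpack_point(payload):
    """Deserialize a payload back into (lon, lat)."""
    return struct.unpack(">dd", payload)


# S1 (toy): forward storage, doc ID -> point, plus the coarse cell of each document.
doc_values = {0: (118.78, 32.06), 1: (118.79, 32.05), 2: (121.47, 31.23)}
doc_cells = {0: "cell-A", 1: "cell-A", 2: "cell-B"}  # hypothetical Morton cell terms


def migrate_to_payload(doc_values, doc_cells):
    """S2: copy each point next to its posting, grouped by cell term."""
    postings = defaultdict(list)
    for doc_id in sorted(doc_values):
        lon, lat = doc_values[doc_id]
        postings[doc_cells[doc_id]].append((doc_id, pack_point(lon, lat)))
    return postings


def retrieve(postings, cell, predicate):
    """S3 + S4: walk one posting list in order and verify from the payload."""
    return [doc_id for doc_id, payload in postings.get(cell, [])
            if predicate(*unpack_point(payload))]
```

For example, `retrieve(migrate_to_payload(doc_values, doc_cells), "cell-A", lambda lon, lat: lat > 32.055)` walks only the postings of `cell-A` and never touches the forward storage during verification.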
Preferably, in step S1, the Morton codes of longitude and latitude are stored in DocValues in a forward, column-oriented storage format.
Preferably, in step S2, the Morton codes of longitude and latitude are stored in Payload in an inverted-index storage structure.
Preferably, in step S3, during the secondary verification and clipping of longitude and latitude, the calls that fetch longitude and latitude data from DocValues are intercepted and an interface for reading values from Payload is provided, so that the longitude and latitude values are read from Payload.
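One way to picture the interception in step S3 is a common value-source interface with two interchangeable backends, so that the verification code does not need to know where the coordinates come from. The Python sketch below is an assumption about how such a swap could look; the class and function names are hypothetical and do not correspond to Lucene classes.

```python
class DocValuesSource:
    """Forward lookup: one independent fetch per hit document."""

    def __init__(self, column):
        self.column = column  # doc_id -> (lon, lat)

    def points(self, doc_ids):
        for doc_id in doc_ids:
            yield doc_id, self.column[doc_id]


class PayloadSource:
    """Sequential walk over a posting list that carries the point as payload."""

    def __init__(self, posting_list):
        self.posting_list = posting_list  # [(doc_id, (lon, lat)), ...] in doc ID order

    def points(self, doc_ids):
        wanted = set(doc_ids)
        for doc_id, point in self.posting_list:
            if doc_id in wanted:
                yield doc_id, point


def verify(source, doc_ids, predicate):
    """Secondary verification that depends only on the points() interface."""
    return [doc_id for doc_id, (lon, lat) in source.points(doc_ids)
            if predicate(lon, lat)]
```

Because `verify` only calls `points()`, replacing `DocValuesSource` with `PayloadSource` changes where the data is read from without changing the verification logic or its result.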
Compared with the prior art, the invention has the following beneficial effects: the method replaces the random reads of DocValues with Payload, which is read sequentially because it is stored in the inverted index. Data for the same geographic position are stored together, and building contiguous data in this way reduces the number of random reads, which greatly improves the retrieval and matching speed for geographic positions.
Drawings
FIG. 1 is a flow chart of the steps of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, the present invention provides a technical solution: a geographic position retrieval method comprising the following steps:
S1: obtaining the Morton codes of longitude and latitude and storing them in DocValues in Lucene;
S2: migrating the longitude and latitude values of the Morton codes from DocValues into Payload;
S3: reading the values from Payload during the secondary verification and clipping of longitude and latitude;
S4: completing the position retrieval.
Further, in step S1, the Morton codes of longitude and latitude are stored in DocValues in a forward, column-oriented storage format.
Further, in step S2, the Morton codes of longitude and latitude are stored in Payload in an inverted-index storage structure.
Further, in step S3, during the secondary verification and clipping of longitude and latitude, the calls that fetch longitude and latitude data from DocValues are intercepted and an interface for reading values from Payload is provided, so that the longitude and latitude values are read from Payload.
The working principle is as follows: in step S1, the Morton codes of longitude and latitude on the map are obtained and stored in DocValues in Lucene; in step S2, the longitude and latitude values of the Morton codes in DocValues are migrated to Payload; in step S3, when the secondary verification of longitude and latitude is performed during position retrieval, the values are read from Payload; in step S4, the position retrieval is completed.
The invention replaces the random reads of DocValues with Payload, which is read sequentially because it is stored in the inverted index, and thereby greatly improves the retrieval and matching speed for geographic positions. In the original approach, the data read from DocValues for the secondary verification of geographic areas (such as circles and polygons) are scattered across different, non-contiguous positions on the disk. With Payload, the data of the same geographic position are stored together, so the data to be read are contiguous. The number of random reads is therefore greatly reduced, and reads on the hard disk become sequential. Because the sequential read performance of a conventional disk is far higher than its random read and write performance, the query response speed is greatly improved.
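The difference between scattered and contiguous reads can be illustrated with a toy page-count model in Python. This is not a benchmark and does not measure Lucene: it simply counts how many distinct disk pages a set of byte offsets falls into, and the 4096-byte page, the 16-byte record, and the spacing of the hit documents are assumptions chosen for the example. It ignores compression, caching, and operating-system read-ahead.

```python
PAGE_SIZE = 4096    # assumed disk page size in bytes
RECORD_SIZE = 16    # assumed size of one stored longitude/latitude record


def pages_touched(offsets, page_size=PAGE_SIZE):
    """Number of distinct pages that the given byte offsets fall into."""
    return len({off // page_size for off in offsets})


def docvalues_offsets(hit_doc_ids):
    """Forward column: a record's position is fixed by its document number."""
    return [doc_id * RECORD_SIZE for doc_id in hit_doc_ids]


def payload_offsets(hit_count, start=0):
    """Posting list with payloads: records of one area are laid out back to back."""
    return [start + i * RECORD_SIZE for i in range(hit_count)]


# Hypothetical query: 1,000 hits spread evenly over 1,000,000 documents.
hits = list(range(0, 1_000_000, 1000))
scattered_pages = pages_touched(docvalues_offsets(hits))      # 1000 pages
contiguous_pages = pages_touched(payload_offsets(len(hits)))  # 4 pages
```

Under these assumptions the forward layout touches one page per hit, while the contiguous layout packs the same 1,000 records into 4 pages that can be read in a single sequential pass.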
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (4)

1. A geographic position retrieval method, characterized by comprising the following steps:
S1: obtaining the Morton codes of longitude and latitude and storing them in DocValues in Lucene;
S2: migrating the longitude and latitude values of the Morton codes from DocValues into Payload;
S3: reading the values from Payload during the secondary verification and clipping of longitude and latitude;
S4: completing the position retrieval.
2. The geographic position retrieval method according to claim 1, wherein: in step S1, the Morton codes of longitude and latitude are stored in DocValues in a forward, column-oriented storage format.
3. The geographic position retrieval method according to claim 1, wherein: in step S2, the Morton codes of longitude and latitude are stored in Payload in an inverted-index storage structure.
4. The geographic position retrieval method according to claim 1, wherein: in step S3, during the secondary verification and clipping of longitude and latitude, the calls that fetch longitude and latitude data from DocValues are intercepted and an interface for reading values from Payload is provided, so that the longitude and latitude values are read from Payload.
CN201911307381.7A 2019-12-18 2019-12-18 Geographic position retrieval method Pending CN111046126A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911307381.7A CN111046126A (en) 2019-12-18 2019-12-18 Geographic position retrieval method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911307381.7A CN111046126A (en) 2019-12-18 2019-12-18 Geographic position retrieval method

Publications (1)

Publication Number Publication Date
CN111046126A true CN111046126A (en) 2020-04-21

Family

ID=70237198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911307381.7A Pending CN111046126A (en) 2019-12-18 2019-12-18 Geographic position retrieval method

Country Status (1)

Country Link
CN (1) CN111046126A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009265A (en) * 2017-12-15 2018-05-08 中国公路工程咨询集团有限公司 A kind of space data index method under cloud computing environment
CN110413624A (en) * 2019-08-07 2019-11-05 南京录信软件技术有限公司 A method of the multiple row stored in association deposited based on column

Similar Documents

Publication Publication Date Title
CN109992636B (en) Space-time coding method, space-time index and query method and device
US10034141B2 (en) Systems and methods to identify home addresses of mobile devices
US20140278228A1 (en) Mapping uncertain geometries to graticules
CN103714145A (en) Relational and Key-Value type database spatial data index method
CN108009265B (en) Spatial data indexing method in cloud computing environment
CN108205562B (en) Positioning data storage and retrieval method and device for geographic information system
CN103399945A (en) Data structure based on cloud computing database system
CN111552694B (en) Self-adaptive geospatial grid indexing method
US11537614B2 (en) Implementing multidimensional two-sided interval joins using sampling-based input-domain demarcation
CN105528460A (en) Establishing method of tile pyramid model and tile reading method
CN112395288B (en) R-tree index merging and updating method, device and medium based on Hilbert curve
CN112214472B (en) Meteorological lattice data storage and query method, device and storage medium
CN115391699B (en) Cloud-optimized static geographic information vector data online service method
CN104182472A (en) Land utilization vector data storage method based on main memory database Redis
CN115687675A (en) Airborne vector map data processing method
CN109885638B (en) Three-dimensional space indexing method and system
CN114398459A (en) Space-time grid and tree fusion storage method for ship positioning data
CN111046126A (en) Geographic position retrieval method
CN113076334B (en) Data query method, index generation method, device and electronic equipment
CN116049521A (en) Space-time data retrieval method based on space grid coding
CN115408382A (en) Vector space information gridding storage management method for large-scale satellite data
CN114116925A (en) Spatio-temporal data query method and related device
CN107992555B (en) Method for storing and reading raster data
CN117671392B (en) Grid division method, device, equipment and medium for homeland space data
CN109325086B (en) Discrete geographic data archiving management method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination