CN112307138A - Storage and query method, system and medium of region information - Google Patents

Storage and query method, system and medium of region information

Info

Publication number
CN112307138A
Authority
CN
China
Prior art keywords
address
code
level
information
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910692124.3A
Other languages
Chinese (zh)
Inventor
苏同
王亚晗
亢海波
史凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hylink Digital Technology Co ltd
Original Assignee
Hylink Digital Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hylink Digital Technology Co ltd filed Critical Hylink Digital Technology Co ltd
Priority to CN201910692124.3A priority Critical patent/CN112307138A/en
Publication of CN112307138A publication Critical patent/CN112307138A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Data Mining & Analysis (AREA)
  • Accounting & Taxation (AREA)
  • Strategic Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Software Systems (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Remote Sensing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a file-based region information query method and a file generation method, wherein the region information is related to an address interval indicated by a start IP address and an end IP address, and the file includes a recording area, an index area, and a file header. The query method comprises the following steps: receiving a region inquiry request containing a target IP address; and responding to the region inquiry request, inquiring the file to determine the region information of the target IP address according to the address interval of the target IP address.

Description

Storage and query method, system and medium of region information
Technical Field
The invention relates to the technical field of the Internet, and in particular to a method for storing and querying region information based on IP addresses.
Background
In an advertisement delivery system, the region where a user is located is usually determined from the IP address in the user's request, and advertisements are delivered according to region targeting. In the prior art, region targeting is usually implemented by storing the region information corresponding to each IP address or IP address segment. Because the same region information can correspond to many IP addresses and to IP address segments in different ranges, the region information is recorded repeatedly and occupies excessive storage space. In addition, the IP addresses, IP address segments, and corresponding region information are conventionally stored in a database, which places higher requirements on the platform used and increases the cost of use.
Disclosure of Invention
The invention aims to provide a method for storing and querying the region information corresponding to an IP address, so as to reduce repeated storage of region information and save storage space. In addition, the IP addresses and the region information are stored in a single file, so that no database is required and any platform can conveniently use the file to query region information.
According to one aspect of the present invention, there is provided a method of creating a file for a geographic information query, comprising: acquiring a plurality of pieces of source data, wherein each source data comprises a starting IP address and a terminating IP address of an indicated address interval and region information to which the address interval belongs; generating a file based on the plurality of pieces of source data, the file including: a recording area for recording an end address code generated based on an end IP address of each piece of source data and related contents related to the region information contained in the piece of source data; an index area for storing a start address code generated based on the start IP address of each piece of source data and an address of an end address code of the piece of source data in the recording area; and the file header is used for storing the index starting address and the index ending address of the first starting address code and the last starting address code in the index area.
In one example, the start and end IP addresses are IPv6 addresses, and the start and end IP addresses are encoded to generate the start and end address codes, respectively, in the following encoding format: each of the eight segments of each IPv6 address is converted into a binary representation two bytes long, and the eight segments in binary representation are concatenated to form a 16-byte stream.
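The encoding format described above can be sketched in Python as follows. This is a minimal illustration, not the disclosed implementation: it assumes addresses written with all eight colon-separated segments (the "::" shorthand is not handled) and big-endian byte order within each segment, and the function name `encode_ipv6` is hypothetical.

```python
def encode_ipv6(address: str) -> bytes:
    """Encode an eight-segment IPv6 address as a 16-byte stream.

    Assumption: all eight segments are present in the text form.
    """
    segments = address.split(":")
    if len(segments) != 8:
        raise ValueError("expected eight colon-separated segments")
    # Each segment becomes two bytes; the eight pairs are concatenated.
    return b"".join(int(seg, 16).to_bytes(2, "big") for seg in segments)


code = encode_ipv6("2001:0250:0800:0:0:0:0:0")
print(len(code), code.hex())
```

Because every code has the same length and the most significant byte comes first, comparing two codes byte by byte gives the same order as comparing the addresses numerically.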
According to another aspect of the present invention, there is provided a file-based geographical information query method, the geographical information being associated with an address interval indicated by a start IP address and an end IP address, wherein the file includes: a recording area in which each record stores an end IP address code generated based on the end IP address and related contents of the region information to which the address section including the end IP address belongs; the index area is used for storing the starting IP address code generated based on the starting IP address and the address of the ending IP address code of the corresponding ending IP address in the recording area according to the sequence of the starting IP address; the file header is used for storing the index starting address and the index ending address of the first starting IP address code and the last starting IP address code in the index area; the method comprises the following steps: receiving a region inquiry request containing a target IP address; and responding to the region inquiry request, inquiring the file to determine the region information of the target IP address according to the address interval of the target IP address.
According to still another aspect of the present invention, there is provided a file-based region information query system, the region information being related to an IP address interval indicated by a start IP address and an end IP address, the system comprising: a memory for storing executable instructions; and one or more processors configured to execute the executable instructions to: receive a region query request containing a target IP address; load a file, the file comprising: a recording area, in which each record stores an end IP address code generated based on the end IP address and related content of the region information corresponding to the IP address interval containing the end IP address; an index area, which stores, in order of the start IP addresses, the start IP address codes generated based on the start IP addresses and the addresses in the recording area of the end IP address codes of the corresponding end IP addresses; and a file header, which stores the index start address and the index end address of the first start IP address code and the last start IP address code in the index area; and, in response to the region query request, query the file to determine the region information of the target IP address according to the address interval to which the target IP address belongs.
The present invention also provides a system for creating a file for a geographic information query, comprising: a memory having executable instructions stored thereon; one or more processors configured to implement the methods of the present invention by executing the instructions.
The present invention also provides a computer storage medium having stored thereon executable instructions that, when executed, implement the method of the present invention.
Drawings
In order to more clearly embody the technical scheme used in the present invention, the drawings used in the technical scheme will be briefly described below.
FIG. 1 shows a flow chart for storing IP address and region information according to the present invention;
FIG. 2 illustrates a file structure diagram according to one embodiment of the invention;
FIG. 3 is a flow diagram of generating a file according to one embodiment of the invention;
FIG. 4 is a flowchart of writing a file recording area according to an embodiment of the present invention;
FIG. 5 is a flow diagram of writing a file index area according to one embodiment of the invention;
FIG. 6 is a flow chart of querying geographical information, according to an embodiment of the present invention;
FIG. 7 is a flow diagram of querying a file to determine geographic information, in accordance with one embodiment of the present invention;
FIG. 8 is a schematic diagram of a computer in accordance with one embodiment of the present invention.
Detailed Description
The following describes embodiments of the present invention in detail with reference to the accompanying drawings. While the preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Network activity information may be stored as historical data in a data source for subsequent data analysis. The network activity information may include an IP address or an IP address interval (also referred to herein as an IP address segment) and corresponding region information, where the IP address interval is specified by a start IP address and an end IP address, and the region information may be a specific geographical name, a corresponding zip code, or a region indicated by any other coding method. In the following embodiments, IPv6 addresses and numerical region codes are used as examples to illustrate the method of storing region information according to the embodiments of the present invention, where, as an example, the province code and the city code may each consist of 10 digits.
As shown in FIG. 1, in step 102, source data is obtained from an external data source, where each piece of source data includes a start IPv6 address, an end IPv6 address, and the region information to which the address interval indicated by the start and end IPv6 addresses belongs. In this example, the source data is stored as a hierarchical nested data structure, expressed for example as follows:
[Figure BDA0002148168190000041: example of the hierarchical nested source data structure]
As shown, the region information is classified and encoded according to administrative scope. For example, under the CountryEntity list there is a province, represented by a province code; under the province there is a city entity list, in which each city is represented by a city code; and at the last level, the start IPv6 address and the end IPv6 address corresponding to the lowest administrative region are given. It should be noted that this data structure uses only the two administrative levels of province and city as an example. The present invention is not limited thereto: the region information may be expressed as more specific multi-level regions, such as countries, states, cities, and towns, with the IP address segment corresponding to the lowest level, such as the town. For simplicity, the following embodiments describe only region information consisting of a first-level province code and a second-level city code.
As known to those skilled in the art, IPv6 addresses are written in colon-hexadecimal notation, in which each IPv6 address is split by colons into 8 segments, each segment containing a four-digit hexadecimal number whose leading zeros may be omitted. The source data acquired from the external data source is divided by rows into independent pieces of data, and each independent piece of source data is divided into four parts, which from left to right are the start IPv6 address, the end IPv6 address, the province code, and the city code. Thus, each piece of source data x may be represented as (IPStart, IPEnd, Geo1, Geo2), where IPStart represents the start IPv6 address, IPEnd represents the end IPv6 address, Geo1 represents the first-level province code, and Geo2 represents the second-level city code; different pieces of source data may have the same or different Geo1 and Geo2. An exemplary piece of source data is as follows:
x = (2001:0250:0800:0:0:0:0:0, 2001:0250:0800:ffff:ffff:ffff:ffff:ffff, 1156130000, 1156130100), where 1156130000 represents Hebei and 1156130100 represents Shijiazhuang.
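The four-part representation (IPStart, IPEnd, Geo1, Geo2) might be modelled in Python roughly as below. This is only a sketch: the type name `SourceRecord`, the function `parse_source_line`, and the choice of a comma as the field separator for IPv6 rows are assumptions made for illustration.

```python
from typing import NamedTuple


class SourceRecord(NamedTuple):
    ip_start: str  # start IPv6 address (IPStart)
    ip_end: str    # end IPv6 address (IPEnd)
    geo1: str      # first-level (province) code
    geo2: str      # second-level (city) code


def parse_source_line(line: str) -> SourceRecord:
    """Split one source row into its four parts (comma separator assumed)."""
    parts = [p.strip() for p in line.strip().strip("()").split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}")
    return SourceRecord(*parts)


x = parse_source_line(
    "2001:0250:0800:0:0:0:0:0, 2001:0250:0800:ffff:ffff:ffff:ffff:ffff, "
    "1156130000, 1156130100"
)
print(x.ip_start, x.ip_end, x.geo1, x.geo2)
```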
After a large amount of independent source data x has been acquired in the manner described above, for example N pieces of source data (x_1, x_2, …, x_N), a geographic query file GIF is generated from the N pieces of source data, so that the start IPv6 addresses, end IPv6 addresses, and region information contained in the N pieces of source data are stored in the file. FIG. 2 shows an exemplary structure of the GIF file.
As shown in FIG. 2, the GIF file includes three areas: a file header, an index area, and a recording area. For ease of illustration, the relationship between these three areas is shown here in a tiled fashion. As shown, the recording area records the end address codes (IPeC_1, IPeC_2, …, IPeC_N) generated by encoding the end IPv6 addresses (IPEnd_1, IPEnd_2, …, IPEnd_N) of the pieces of source data, together with the contents Content_1 to Content_N related to the region information contained in each piece of source data. Note that the subscripts 1 to N here indicate the order in which the pieces of source data are stored in the recording area, which may be arbitrary.
The index area stores the start address codes IPsC_2^(1), IPsC_3^(2), …, IPsC_m^(N) generated by encoding the start IPv6 address IPStart of each piece of source data. In addition, the index area stores, for each start address code IPsC, the address in the recording area of the corresponding end address code IPeC, for example the absolute offset address IPeC_OFF relative to the file header. In a preferred embodiment, the start address codes IPsC^(1), IPsC^(2), …, IPsC^(N) are stored in the index area in order of magnitude of the N start IP addresses IPStart_1 to IPStart_N. Accordingly, the superscripts (1) to (N) indicate the storage order, i.e. the position, of each start address code IPsC in the index area, and the subscript indicates the storage order in the recording area of the end address code and related content corresponding to that start address code. For example, as illustrated in FIG. 2, the first entry in the index area records IPsC_2^(1), whose corresponding end address code IPeC_2 is located at the 2nd position in the recording area, i.e. the second record; the absolute offset address IPeC_OFF_2 indicates the location of the end address code IPeC_2 in the recording area. Similarly, the last entry in the index area records IPsC_m^(N) and IPeC_OFF_m, indicating that the corresponding end address code IPeC_m is located at the m-th position in the recording area, i.e. the m-th record, with IPeC_OFF_m indicating the location of IPeC_m in the recording area.
The file header contains two parts, which store the index start address and the index end address, i.e. the addresses IPsC_OFF_1 and IPsC_OFF_N in the index area of the first start address code IPsC_2^(1) and the last start address code IPsC_m^(N), respectively. In one example, the address IPsC_OFF may be the absolute offset address of each record in the index area relative to the file header. The process of writing source data information into the recording area, the index area, and the file header of the GIF file according to an example of the present invention is described in detail below with reference to FIG. 3.
FIG. 3 illustrates a flow diagram for generating a file, according to one embodiment of the invention. As shown in fig. 3, in step 301, an address length usable for addressing the entire GIF file is determined based on the acquired data amount N of the source data and the encoding format adopted for storing the source data in the file, and two storage locations P1 and P2 having the address length are reserved in the header of the GIF file.
Among the extracted N pieces of source data, assume that the province codes and city codes corresponding to all IPv6 address segments are different; the storage space needed to store the data is then at its maximum. According to an example of the present invention, to save storage space, each IPv6 address in the GIF file is encoded in a byte stream format: 16 bytes are used to store the 8 segments of the IPv6 address, i.e. each colon-separated segment is represented by 2 bytes. For example, the IPv6 address 2001:0250:0800:0:0:0:0:0 is encoded as the byte stream 0x20010250080000000000000000000000, which occupies 16 bytes. The province code and the city code in the source data can be encoded and stored as text strings terminated by '\0'; in the source data of this example, each province code and each city code therefore occupies 11 bytes of storage space. Taking 400,000 pieces of extracted source data (i.e. N = 400,000) as an example, the maximum storage space occupied by the recording area is (16-byte end IPv6 address code (IPeC) + 11-byte province code (Geo1) + 11-byte city code (Geo2)) × 400,000 = 15,200,000 bytes ≈ 14.5M. A recording area of 14.5M can be addressed with an address at least 3 bytes wide; in this embodiment a 4-byte address is used to address the data in the recording area. Correspondingly, the address offset IPeC_OFF of the end address code IPeC in the recording area is stored in 4 bytes in the index area, and the maximum storage space occupied by the index area is (16-byte IPsC + 4-byte IPeC_OFF) × 400,000 = 8,000,000 bytes ≈ 7.63M, so that the finally generated file is about 14.5M + 7.63M = 22.13M in size.
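The size estimate can be checked with a few lines of Python. The sketch below assumes that "M" in the text means 1024 × 1024 bytes, which is the reading under which the quoted figures come out; the constant names are hypothetical.

```python
N = 400_000        # number of source records in the example
IP_CODE_LEN = 16   # bytes per encoded IPv6 address
GEO_CODE_LEN = 11  # 10 digits plus the terminating '\0'
OFFSET_LEN = 4     # bytes per stored offset
MIB = 1024 * 1024  # assumption: "M" in the text means 1024 * 1024 bytes

record_area = (IP_CODE_LEN + 2 * GEO_CODE_LEN) * N
index_area = (IP_CODE_LEN + OFFSET_LEN) * N

print(f"recording area: {record_area / MIB:.2f} M")
print(f"index area:     {index_area / MIB:.2f} M")
print(f"total:          {(record_area + index_area) / MIB:.2f} M")

# Smallest whole number of bytes whose range covers the recording area.
min_addr_bytes = (record_area.bit_length() + 7) // 8
print(f"minimum address width: {min_addr_bytes} bytes")
```

Note that 3 bytes suffice only for the recording area taken on its own; the 4-byte width used in the embodiment also covers offsets into the complete file of about 22.13M.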
Therefore, after it has been determined from the data volume and the encoding format that the GIF file generated from 400,000 pieces of source data occupies at most 22.13M, 8 bytes are reserved at the head of the file to store the file header information: the first 4 bytes, position P1, store the absolute address offset IPsC_OFF_1 in the file of the first entry IPsC_2^(1) in the index area, and the last 4 bytes, position P2, store the absolute address offset IPsC_OFF_N in the file of the last entry IPsC_m^(N) in the index area.
It should be noted that, although in the present embodiment the IP addresses and the region information are encoded and stored as a byte stream and as text characters, respectively, they may also be stored in other formats; for example, the IP address itself may be stored directly in the index area and the recording area. In addition, province codes and city codes are used herein to indicate specific provinces and cities, but in another example the province and city names may be stored directly. Different source data volumes and different IP address and region information formats obviously affect the size of the entire file and the corresponding address length needed to address it.
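A possible layout of the 8-byte file header is sketched below with Python's `struct` module. The text does not state the byte order of P1 and P2, so little-endian is an assumption here, as are the helper names and the example offsets (which correspond to the maximum-size layout with the recording area placed directly after the header).

```python
import struct

HEADER_FORMAT = "<II"  # two unsigned 32-bit offsets; byte order is an assumption
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 8 bytes


def pack_header(index_start: int, index_end: int) -> bytes:
    """Pack the offsets of the first and last index entries (P1, P2)."""
    return struct.pack(HEADER_FORMAT, index_start, index_end)


def unpack_header(data: bytes) -> tuple[int, int]:
    """Read P1 and P2 back from the first eight bytes of the file."""
    return struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])


# Illustrative offsets: 8-byte header + 15,200,000-byte recording area,
# followed by 400,000 index entries of 20 bytes each.
header = pack_header(15_200_008, 15_200_008 + 399_999 * 20)
print(unpack_header(header))
```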
After the header is set, in step 303, the ending address code IPeC of each piece of source data and the related Content of the regional information included in the piece of source data are recorded in the recording area. Fig. 4 shows a flowchart of a method of writing data in a recording area.
At step 402, a piece of source data, e.g. x_1, is read from the source data set, and the end address code IPeC_1 obtained by encoding its end IPv6 address IPEnd_1 is written at the start position of the recording area, for example at the position immediately following the file header. For example, assuming x_1 = (2001:0250:0800:0:0:0:0:0, 2001:0250:0800:ffff:ffff:ffff:ffff:ffff, 1156130000, 1156130100), the end address code is IPeC_1 = 0x200102500800ffffffffffffffffffff.
At step 404, the region information in data x_1 is read, i.e. the province code Geo1_1 and the city code Geo2_1 in this example. In step 406, it is determined whether the province code Geo1_1 and the city code Geo2_1 as a whole have already been recorded in a previous record of the recording area. Since x_1 is the first piece of source data written to the file, step 406 determines that the province code Geo1_1 and the city code Geo2_1 both occur for the first time, and the process proceeds to step 408, where Content_1, consisting of the province code Geo1_1 and the city code Geo2_1, is written in the recording area after the end address code IPeC_1, thereby forming the first record of the recording area. In this example, the province code Geo1_1 and the city code Geo2_1 are UTF-8 encoded to distinguish them from offset address data. The process then proceeds to step 410 to determine whether all the source data have been stored; if not, it returns to step 402 to continue writing the next piece of source data x_2.
For source data x_2, similarly, at step 402, x_2 is read from the source data set, and the end address code IPeC_2 obtained by encoding IPEnd_2 is stored in the recording area after the record of x_1. At step 404, the province code Geo1_2 and the city code Geo2_2 in data x_2 are read. In step 406, it is determined whether the province code Geo1_2 and the city code Geo2_2 as a whole have already been recorded in the recording area. If they both occur for the first time, the process proceeds to step 408, where the province code Geo1_2 and the city code Geo2_2 are written at the position after the end address code IPeC_2 in the recording area. The process then proceeds to step 410 to determine whether all the source data have been stored; if not, it returns to step 402 to continue writing the next piece of source data.
The above process is repeated to write the end address codes and region-related contents of all the source data into the recording area. Note that if it is determined at step 406 that, for the current data, e.g. x_m, the province code Geo1_m and the city code Geo2_m as a whole do not occur for the first time, the process proceeds to step 412 to determine whether part of the region information has been recorded in a previous record, in this case whether the province code Geo1_m has already been recorded. If the province code Geo1_m has already been recorded, for example the same province code appears in the second record, but the city code Geo2_m occurs for the first time, the process proceeds to step 414, where the position in the GIF file at which the province code Geo1_m was recorded is read from the recording area; in this example, this is the absolute offset address Content_OFF_2, relative to the file header, of Geo1_2, which is identical to Geo1_m. Then, in step 416, a Flag = 0x02 is set, indicating that the province code Geo1_m has occurred in a previous record, and the Flag 0x02, the absolute offset Content_OFF_2 in the file of the identical province code Geo1_2, and the city code Geo2_m are written as the region-related Content_m at the position after the end address code IPeC_m in the recording area. The process then proceeds to step 410 to determine whether all the source data have been stored; if not, it returns to step 402 to continue reading the next piece of source data.
During writing of the recording area, if it is determined in step 412 that, for the current source data, e.g. x_p, the province code Geo1_p and the city code Geo2_p have both already appeared, for example they are the same as the province code and city code of source data x_m, the process proceeds to step 418, where the offset position Content_OFF_m in the GIF file of the Content_m in the m-th record is read from the recording area. Then, in step 420, a Flag = 0x01 is set, indicating that the province code Geo1_p and the city code Geo2_p have both already appeared in the file records, and the Flag 0x01 and the offset address Content_OFF_m are written as the region-related Content_p at the position after the end address code IPeC_p in the recording area. The process then proceeds to step 422 to determine whether all the source data have been stored; if not, it returns to step 402 to continue reading the next piece of source data.
After the steps 402-422 are repeatedly executed until the IPeC, the province code Geo1 and the city code Geo2 in the N source data are written into the recording area, the process ends in the step 422 and returns.
According to the invention, the Flag, the first-level and second-level region information Geo1 and Geo2, and the offset address IPeC_OFF have different formats so that they can be distinguished from one another: for example, the Flag is a single byte long and the offset address has a fixed length of 4 bytes, while the province code Geo1 and the city code Geo2 are UTF-8 encoded and can thus be distinguished from an offset address IPeC_OFF that may precede them.
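The recording-area procedure of steps 402 to 422 could be sketched as follows. This is an illustrative reading rather than the disclosed implementation: it assumes an 8-byte header directly before the recording area, little-endian 4-byte offsets, '\0'-terminated UTF-8 codes, and fully written-out IPv6 addresses; the names `build_record_area`, `FLAG_BOTH_SEEN`, and `FLAG_GEO1_SEEN` are hypothetical.

```python
import struct

HEADER_SIZE = 8        # assumption: recording area starts right after the header
FLAG_BOTH_SEEN = 0x01  # province and city codes both already recorded
FLAG_GEO1_SEEN = 0x02  # only the province code already recorded


def encode_ipv6(address: str) -> bytes:
    return b"".join(int(s, 16).to_bytes(2, "big") for s in address.split(":"))


def build_record_area(records):
    """records: iterable of (ip_start, ip_end, geo1, geo2) tuples.

    Returns (area_bytes, ipec_offsets); ipec_offsets[i] is the absolute file
    offset of the end address code written for records[i].
    """
    out = bytearray()
    ipec_offsets = []
    content_offsets = {}  # (geo1, geo2) -> offset of the first such Content
    geo1_offsets = {}     # geo1 -> offset of its first full-text occurrence

    for _start, end, geo1, geo2 in records:
        ipec_offsets.append(HEADER_SIZE + len(out))
        out += encode_ipv6(end)
        content_off = HEADER_SIZE + len(out)
        if (geo1, geo2) in content_offsets:
            # Both codes seen before: flag plus offset of the earlier Content.
            out.append(FLAG_BOTH_SEEN)
            out += struct.pack("<I", content_offsets[(geo1, geo2)])
        elif geo1 in geo1_offsets:
            # Province seen before: flag, province offset, then the new city code.
            out.append(FLAG_GEO1_SEEN)
            out += struct.pack("<I", geo1_offsets[geo1])
            out += geo2.encode("utf-8") + b"\0"
            content_offsets[(geo1, geo2)] = content_off
        else:
            # First occurrence: store both codes as terminated text.
            geo1_offsets[geo1] = content_off
            out += geo1.encode("utf-8") + b"\0" + geo2.encode("utf-8") + b"\0"
            content_offsets[(geo1, geo2)] = content_off
    return bytes(out), ipec_offsets


sample = [
    ("2001:0250:0800:0:0:0:0:0", "2001:0250:0800:ffff:ffff:ffff:ffff:ffff", "1156130000", "1156130100"),
    ("2001:0250:0801:0:0:0:0:0", "2001:0250:0801:ffff:ffff:ffff:ffff:ffff", "1156130000", "1156130500"),
    ("2001:0250:0802:0:0:0:0:0", "2001:0250:0802:ffff:ffff:ffff:ffff:ffff", "1156130000", "1156130100"),
]
area, offsets = build_record_area(sample)
print(len(area), offsets)
```

In this sketch the three records take 38, 32, and 21 bytes respectively, which illustrates how repeated region information shrinks later records. Because the flag values 0x01 and 0x02 never occur as the first byte of a digit string, a reader can tell from the first content byte which of the three layouts follows.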
Returning to FIG. 3, after the storage of the end address codes and region information of the source data in the recording area is completed in step 303, the process proceeds to step 305, where data is written to the index area. According to an embodiment of the present invention, the IPv6 address segments in the N extracted pieces of source data are sorted by start IPv6 address IPStart, and then, in the sorted order, the start address code IPsC corresponding to each start IPv6 address IPStart and the offset position IPeC_OFF in the recording area of the end address code IPeC corresponding to that start address code are recorded in sequence in the index area. The offset position IPeC_OFF is again stored in 4 bytes, to prevent overflow if the number of IPv6 address segments in the source data increases. FIG. 5 shows a flowchart of writing data to the index area.
In step 501, the extracted N pieces of source data are sorted in ascending or descending order of the start IPv6 address IPStart. Take, for example, 5 pieces of data (x_1, x_2, x_3, x_4, x_5) acquired from the data source:
x_1 = (2001:0250:0240:0:0:0:0:0, 2001:0250:03ff:ffff:ffff:ffff:ffff:ffff, 1156000000, 1156000000)
x_2 = (2001:0250:0200:0:0:0:0:0, 2001:0250:023f:ffff:ffff:ffff:ffff:ffff, 1156000000, 1156110000)
x_3 = (2001:0250:0408:0:0:0:0:0, 2001:0250:07ff:ffff:ffff:ffff:ffff:ffff, 1156000000, 1156000000)
x_4 = (2001:0250:0800:0:0:0:0:0, 2001:0250:0800:ffff:ffff:ffff:ffff:ffff, 1156130000, 1156130100)
x_5 = (2001:0250:0400:0:0:0:0:0, 2001:0250:0407:ffff:ffff:ffff:ffff:ffff, 1156000000, 1156120000)
After sorting in ascending order, the data sequence is: x_2, x_1, x_5, x_3, x_4.
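The ordering in this example can be reproduced with the standard-library `ipaddress` module, as in the short sketch below. The dictionary keys are just labels for the five pieces of data; sorting on the integer value of each address is one possible way to compare start addresses numerically.

```python
import ipaddress

starts = {
    "x1": "2001:0250:0240:0:0:0:0:0",
    "x2": "2001:0250:0200:0:0:0:0:0",
    "x3": "2001:0250:0408:0:0:0:0:0",
    "x4": "2001:0250:0800:0:0:0:0:0",
    "x5": "2001:0250:0400:0:0:0:0:0",
}

# Compare addresses by numeric value rather than as text.
order = sorted(starts, key=lambda name: int(ipaddress.IPv6Address(starts[name])))
print(order)  # ['x2', 'x1', 'x5', 'x3', 'x4']
```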
Then, at step 503, the start IPv6 address IPStart_2 of x_2, which ranks first, is read, encoded into the start address code IPsC_2, and written at the start position of the index area, where it is denoted IPsC_2^(1). Then, in step 505, the position in the recording area of the end address code IPeC_2 corresponding to the start IPv6 address IPStart_2 is read from the recording area; in the example shown in FIG. 2, IPeC_2 is recorded in record 2, which gives its offset address IPeC_OFF_2. In step 507, the offset address IPeC_OFF_2 is written in the index area after IPsC_2^(1), forming the first record of the index area. Then, in step 509, it is determined whether all the source data have been processed. If not, in step 511 the start IPv6 address ranked next, IPStart_1 in this example, is read, encoded into the start address code IPsC_1, and stored at the position after the first record in the index area, where it is denoted IPsC_1^(2). The process then returns to step 505 to read from the recording area the position of the end address code IPeC_1 corresponding to IPStart_1, in this case the offset address IPeC_OFF_1, and in step 507 the offset address IPeC_OFF_1 is written after the start address code IPsC_1^(2), forming the second record of the index area. By repeating steps 505 to 511, the sorted start address codes IPsC and the absolute offset addresses IPeC_OFF in the recording area of the corresponding end address codes IPeC are stored in sequence in the index area. As shown in FIG. 2, among the N pieces of source data, the start address of data x_m is the largest, so it is recorded in the last record of the index area. After the index area data has been written, the process ends at step 513 and returns to FIG. 3.
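Steps 501 to 513 might be condensed into the following Python sketch. It assumes 20-byte index entries (a 16-byte start address code followed by a little-endian 4-byte offset), fully written-out IPv6 addresses, and ascending order; the function name `build_index_area` and the sample offsets are hypothetical.

```python
import struct


def encode_ipv6(address: str) -> bytes:
    return b"".join(int(s, 16).to_bytes(2, "big") for s in address.split(":"))


def build_index_area(starts, ipec_offsets):
    """Build index entries sorted by start address.

    starts[i] is the start IPv6 address of record i, and ipec_offsets[i] is
    the absolute offset of that record's end address code in the recording area.
    """
    entries = sorted(
        (encode_ipv6(start), offset) for start, offset in zip(starts, ipec_offsets)
    )
    return b"".join(code + struct.pack("<I", offset) for code, offset in entries)


# Start addresses of x_1..x_5 with illustrative recording-area offsets.
starts = [
    "2001:0250:0240:0:0:0:0:0",
    "2001:0250:0200:0:0:0:0:0",
    "2001:0250:0408:0:0:0:0:0",
    "2001:0250:0800:0:0:0:0:0",
    "2001:0250:0400:0:0:0:0:0",
]
ipec_offsets = [8, 46, 84, 122, 160]
index = build_index_area(starts, ipec_offsets)
print(len(index), index[:16].hex())
```

Sorting the fixed-length, big-endian codes as byte strings gives the same order as sorting the addresses numerically, so the first entry here belongs to x_2 and the last to x_4.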
In step 307, after the index area data has been written, the file header information is written according to the data actually written: the offset address IPsC_OFF_1 of the first record IPsC_2^(1) in the index area is written, as the index start address, into the first 4 bytes of the file header, i.e. position P1, and the absolute offset address IPsC_OFF_N of the last record IPsC_m^(N) in the index area is written, as the index end address, into the last 4 bytes of the file header, i.e. position P2, for use in subsequent calculations on the data entries of the index area. This completes the generation of the entire GIF file, which may be in any format and naming, such as ip.
It should be noted here that in the above-described embodiment, the data of the recording area is written after the file header, and then the data of the index area is written after the recording area. In another embodiment of the present invention, after the data amount N and the address length are determined, the space occupied by the index area, for example, 7.63M in the previous example, can be determined. Therefore, a space of 7.63M may also be reserved behind the file header to be written with index area data later, and the recording area may be located behind the index area.
The above embodiment is described for the case where the IP address is IPv6; the present invention is equally applicable to IPv4 addresses, the difference being the encoding format used for IPv4 addresses. Take as an example source data consisting of IPv4 start and end address segments with the corresponding province codes and city codes. As before, the source data is divided by rows into independent data, and each independent data is divided by commas into four parts, which are, from left to right, the start IPv4 address, the end IPv4 address, the province code, and the city code. For example, several exemplary source data are as follows:
14.197.149.252,14.197.149.255,1156000000,1156110000
14.197.150.0,14.197.150.255,1156130000,1156130500
14.197.151.0,14.197.151.255,1156130000,1156130000
14.197.152.0,14.197.159.255,1156130000,1156130100
wherein 1156000000 denotes China, 1156110000 denotes Beijing, 1156130000 denotes Hebei, 1156130100 denotes Shijiazhuang, and 1156130500 denotes Xingtai.
The IPv4 address is in dotted-decimal format. Since each dot-separated section of an IPv4 address is a decimal number no greater than 255, each IPv4 address is stored in 4 bytes when encoded; that is, the dotted-decimal IPv4 address is converted into a 32-bit unsigned integer that fits in 4 bytes, which saves storage space. The source data based on IPv4 addresses can therefore be written into the GIF file according to the aforementioned GIF file generation method, which likewise comprises three parts: writing the file header, writing the recording area, and writing the index area.
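As an illustration of the IPv4 encoding just described, the following sketch converts a dotted-decimal address into a 32-bit unsigned integer and packs it into 4 bytes. The big-endian byte order is an assumption made here so that comparing the 4-byte codes as byte strings gives the same order as comparing the addresses numerically; the function name `ipv4_to_code` is hypothetical.

```python
import ipaddress
import struct


def ipv4_to_code(ip: str) -> bytes:
    value = int(ipaddress.IPv4Address(ip))   # 32-bit unsigned integer
    return struct.pack(">I", value)          # 4 bytes, most significant byte first


code = ipv4_to_code("14.197.149.252")
next_code = ipv4_to_code("14.197.150.0")
```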
According to the present invention, using a GIF file generated from historical records, for example ip.dat, when a region query request containing a target IP address is received, the region in which the user with that target IP address is currently active on the network can be determined from the GIF file. Fig. 6 shows a flowchart of a method for determining, based on the GIF file, the region information to which the target IP address belongs according to an embodiment of the present invention. The method can be implemented on any computing device or system, such as a computer, and the GIF file can be stored locally on the computing device or system or fetched from an external source. The method is applicable whether the target IP address is an IPv4 or an IPv6 address.
As shown in fig. 6, in step 601, a geographic query request Geo_ReQ containing a target IP address ip_key is received; the query request Geo_ReQ may come from a third party, such as an advertisement delivery platform. In step 602, after the query request Geo_ReQ is received, the GIF file is queried, and the region information to which ip_key belongs is determined by finding the IP address interval in which the ip_key contained in Geo_ReQ is located. Fig. 7 illustrates a flowchart of querying the GIF file to determine the corresponding region information, according to an embodiment of the present invention.
As shown in fig. 7, first, in step 701, after the query request Geo_ReQ is received, the GIF file ip.dat is loaded. Steps 702-704 are then performed to query ip.dat and determine whether an address interval containing the target IP address ip_key exists.
As a specific example, in step 702, the file header of ip.dat is first read to determine the index start address and the index end address of the index area, i.e., IPsC_OFF_1 and IPsC_OFF_N. Then, the target address ip_key is encoded into the address code ip_keyC using the same encoding format, and it is determined whether, among the N start address codes stored in order in the index area delimited by the offset addresses IPsC_OFF_1 and IPsC_OFF_N, there exists a largest IPsC_j^(i) that is less than or equal to ip_keyC, to serve as the candidate start address code of the candidate address interval for ip_key. That is, it is determined whether there is an IPsC_j^(i) satisfying the following condition:
IPsC_j^(i) ≤ ip_keyC ≤ IPsC^(i+1), where 1 ≤ i ≤ N. When i = N, the condition reduces to IPsC^(N) ≤ ip_keyC; that is, as long as the last start address code IPsC^(N) is less than or equal to ip_keyC, IPsC^(N) is taken as the largest IPsC less than or equal to ip_keyC.
According to an embodiment of the present invention, since the records in the index area are stored in order of start address when the GIF file is generated, the total number of records stored in the index area can be calculated from the index start address IPsC_OFF_1 of the first record and the index end address IPsC_OFF_N of the last record, and the start address code IPsC^(i) can then be located quickly by bisection. For example, based on the total number of records in the index area, the address of the index record in the middle position, IPsC_OFF_u with u = N/2, can be determined, and ip_keyC is then compared with the start address code IPsC^(u) stored at position IPsC_OFF_u. If ip_keyC is larger than IPsC^(u), bisection continues between the addresses IPsC_OFF_u and IPsC_OFF_N to find the largest IPsC^(i) less than or equal to ip_keyC. If ip_keyC is smaller than IPsC^(u), bisection continues between the addresses IPsC_OFF_1 and IPsC_OFF_u to find the largest IPsC^(i) less than or equal to ip_keyC. If ip_keyC is exactly equal to IPsC^(u), the largest IPsC^(i) has been found, namely IPsC^(u).
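The bisection over the index area can be sketched as below. The sketch assumes an 8-byte file header holding two 4-byte big-endian offsets (the index start address and the index end address), fixed-length index records consisting of a start address code followed by a 4-byte offset, and address codes that compare correctly as byte strings; the function name `find_candidate` and the sample data are hypothetical.

```python
import ipaddress
import struct


def find_candidate(data: bytes, key: bytes, off_len: int = 4):
    """Return (start_code, end_code_offset) for the largest start code <= key, or None."""
    index_start, index_end = struct.unpack(">II", data[:8])
    rec_len = len(key) + off_len
    lo, hi = 0, (index_end - index_start) // rec_len   # inclusive record numbers
    found = None
    while lo <= hi:
        mid = (lo + hi) // 2
        pos = index_start + mid * rec_len
        start = data[pos:pos + len(key)]
        if start <= key:
            # Candidate found; a later record may hold a larger start <= key.
            (offset,) = struct.unpack(">I", data[pos + len(key):pos + rec_len])
            found = (start, offset)
            lo = mid + 1
        else:
            hi = mid - 1
    return found


# Tiny sample file: header followed directly by three sorted index records.
starts = ["14.197.149.252", "14.197.150.0", "14.197.151.0"]
index = b"".join(
    ipaddress.ip_address(s).packed + struct.pack(">I", 100 + i)
    for i, s in enumerate(starts)
)
data = struct.pack(">II", 8, 8 + len(index) - 8) + index
hit = find_candidate(data, ipaddress.ip_address("14.197.150.77").packed)
miss = find_candidate(data, ipaddress.ip_address("14.197.149.0").packed)
```

Here `hit` identifies the record whose start address is 14.197.150.0, and `miss` is None because every stored start address is larger than the key, which corresponds to the query-failure branch.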
If it is determined in step 702 that such a start address code IPsC_j^(i) exists, the process goes to step 703, in which the offset address IPeC_OFF_j in the recording area of the end IP address code IPeC_j corresponding to IPsC_j^(i) is read from the index area, where j indicates that the end IP address code corresponding to IPsC_j^(i) is at the j-th position in the recording area. The corresponding IPeC_j is then read from the recording area according to the offset address IPeC_OFF_j, and the process enters step 704.
At step 704, it is determined whether the target address code ip_keyC lies between the start address code IPsC_j^(i) and the read end address code IPeC_j. If so, it is determined that a candidate address interval for the target IP address ip_key has been found in the GIF file, that is, ip_key lies within the address interval (IPStart_j, IPEnd_j) of source data x_j. Subsequent processing then determines the region information to which the target IP address ip_key belongs based on the related content Content_j corresponding to the end address code IPeC_j read from the recording area.
As described above for file generation, the content stored after the end address code IPeC in the recording area is not necessarily a province code; it may instead be a flag and an offset address. Thus, as an example, as shown in fig. 7, at step 705 the first byte after the end address code IPeC_j is read to determine whether that byte represents a Flag. As mentioned above, since the province code is encoded in utf-8 and the flag is a binary value, the presence of a Flag can be determined by checking whether the first byte after the read IPeC_j is equal to 0x01 or 0x02.
If it is determined in step 705 that the value of the byte following the end address code IPeC_j is 0x02, the byte is a Flag indicating that the province code has already appeared but the city code appears for the first time. An example is the case j = m, shown as the m-th record of the recording area in fig. 2, where Flag 0x02 is immediately followed by the absolute offset address Content_OFF_2 of the previous record in the recording area in which the same province code was recorded, and then by the city code Geo_m. The process then proceeds to step 706.
In step 706, the absolute offset address Content_OFF_2, which occupies 4 bytes after the Flag, is read, and the process jumps to the position designated by Content_OFF_2 to read the province code Geo_2, terminated by the separator \0, recorded in the 2nd record. In step 707, the 4 bytes occupied by the province code offset Content_OFF_2 following the Flag are skipped, and the city code Geo_m, encoded in utf-8 and terminated by the separator \0, is read directly. In step 708, the province code Geo_2 and the city code Geo_m read in steps 706 and 707 are returned, for example to the third party, as the response to the request Geo_ReQ, and the process ends.
If it is determined in step 705 that the value of the byte following the end address code IPeC_j is 0x01, the byte is a Flag indicating that both the province code and the city code have already appeared. An example is the case j = p shown in fig. 2, where the Flag is followed by the absolute offset address Content_OFF_m of a previous record in the recording area in which the same province code and city code were recorded. The process then proceeds to step 709, in which the absolute offset address Content_OFF_m, occupying 4 bytes after the Flag, is read and the process jumps to the address specified by Content_OFF_m. Note that the province code of that previous record may itself be recorded in a still earlier record; in this example, the province code of the m-th record is recorded in the 2nd record. According to this embodiment, therefore, the first byte at the position indicated by Content_OFF_m is read in step 710 to determine whether it is the flag 0x02. If the first byte is not the flag 0x02, then at step 711 a province code string Geo1_m terminated by, for example, \0 and a city code string Geo2_m terminated by, for example, \0 are read starting at the position specified by Content_OFF_m, and the read province code content Geo1_m and city code content Geo2_m are returned, for example to the third party, as the response to the request Geo_ReQ. If it is determined at step 710 that the first byte is the flag 0x02, as in the m-th record of this example, the process proceeds to step 706. As described above, the absolute offset address Content_OFF_2, occupying 4 bytes after the Flag 0x02, is read, and the process jumps to the address specified by Content_OFF_2 to read the province code Geo1_2 recorded in that previous record.
Subsequently, in step 707, the 4 bytes occupied by the absolute offset address Content_OFF_2 of the province code following the Flag are skipped, and the city code Geo2_m, encoded in utf-8, is read directly. In step 708, the province code content Geo1_2 and the city code content Geo2_m read in steps 706 and 707 are returned, for example to the third party, as the response to the request Geo_ReQ.
If it is determined in step 705 that there is no Flag (0x01 or 0x02) after the end address code IPeC_j, neither the province code nor the city code has appeared in a previous record, and the content following IPeC_j is, in order, the province code Geo1_j and the city code Geo2_j, both encoded in utf-8. The process therefore proceeds to step 711 to directly read the province code string Geo1_j and the city code string Geo2_j, each terminated by, for example, \0, and in step 708 the read province code Geo1_j and city code Geo2_j are returned, for example to the third party, as the response to the request Geo_ReQ.
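The three cases of steps 705 to 711 (no Flag, Flag 0x02, Flag 0x01) can be sketched as a small decoder. The sketch assumes that each stored offset is a 4-byte big-endian absolute position pointing at the first byte of the related content of the earlier record (that is, just after its end address code), and that strings are \0-terminated utf-8. The names `read_cstr` and `read_region` are hypothetical, and the sample buffer omits the end address codes for brevity.

```python
import struct

FLAG_BOTH_SEEN = 0x01      # province and city both recorded in an earlier record
FLAG_PROVINCE_SEEN = 0x02  # only the province recorded in an earlier record


def read_cstr(data: bytes, pos: int):
    """Read a \\0-terminated utf-8 string; return (text, position after the \\0)."""
    end = data.index(b"\0", pos)
    return data[pos:end].decode("utf-8"), end + 1


def read_region(data: bytes, pos: int):
    """Decode (province, city) from the related content starting at pos."""
    flag = data[pos]
    if flag == FLAG_BOTH_SEEN:
        # Steps 709-710: jump to the earlier record, which may itself start with 0x02.
        (target,) = struct.unpack(">I", data[pos + 1:pos + 5])
        return read_region(data, target)
    if flag == FLAG_PROVINCE_SEEN:
        # Steps 706-707: province from the earlier record, city from this record.
        (target,) = struct.unpack(">I", data[pos + 1:pos + 5])
        province, _ = read_cstr(data, target)
        city, _ = read_cstr(data, pos + 5)
        return province, city
    # Step 711: no Flag, so province and city are stored inline.
    province, nxt = read_cstr(data, pos)
    city, _ = read_cstr(data, nxt)
    return province, city


plain = b"1156130000\0" + b"1156130100\0"                       # content at 0
shared_province = b"\x02" + struct.pack(">I", 0) + b"1156130500\0"  # content at 22
shared_both = b"\x01" + struct.pack(">I", 22)                   # content at 38
buf = plain + shared_province + shared_both
```

Distinguishing a Flag from inline text works in this sketch because the utf-8 digits of a province code never begin with the byte 0x01 or 0x02.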
If at step 704 the condition IPsC_j^(i) ≤ ip_keyC ≤ IPeC_j is not satisfied for the target IP address ip_key, that is, none of the N IP address intervals (IPStart, IPEnd) stored in the GIF file contains the target IP address ip_key, the process proceeds to step 712 and returns to the third party a query-failure response indicating that ip_key cannot be identified. Likewise, if at step 702 no largest IPsC_j^(i) less than or equal to ip_keyC is found, that is, no start IP address IPStart less than or equal to the target IP address ip_key exists, the process proceeds to step 712 and returns the same query-failure response to the third party.
Note that in the above embodiment, the absolute offset address Content_OFF following the Flag is an address relative to the file header; in another embodiment, Content_OFF may instead be the offset of the previous record relative to the current record.
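The two offset conventions differ only in how the stored value is turned into a file position. The sketch below is an assumption-laden illustration: it treats a relative offset as a backward distance in bytes from the current record position, which the text does not specify, and the function name `resolve_offset` is hypothetical.

```python
def resolve_offset(stored: int, current_pos: int, relative: bool) -> int:
    """Return the absolute file position of the referenced previous record."""
    if relative:
        # Assumption: the stored value is the backward distance in bytes.
        return current_pos - stored
    # Absolute convention: the stored value is already measured from the file header.
    return stored


absolute_target = resolve_offset(34, current_pos=120, relative=False)
relative_target = resolve_offset(86, current_pos=120, relative=True)
```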
Note also that, in the above embodiment, since the GIF file stores the address codes IPsC and IPeC of the start and end IP addresses, ip_key must be converted into ip_keyC in the same encoding format before it can be compared with IPsC and IPeC when the GIF file is queried. However, embodiments of the present invention are not limited to this. For example, instead of converting ip_key, IPsC and IPeC may be decoded into the corresponding IPStart and IPEnd and then compared with ip_key. Further, as described previously, in another embodiment of the present invention, the IPStart and IPEnd of each source data may be stored directly in the GIF file without any encoding, so that they can be compared with ip_key directly.
Fig. 8 schematically illustrates an example file-based geographic information query system that may be used to practice the various embodiments described herein. As shown, by way of example, the system is a computer having one or more processors and a non-volatile memory that may store, for example, data and/or computer-readable instructions. The memory may be a hard disk (HDD), an optical disc (CD or DVD), a memory stick, a flash disk, or another computer-readable medium. By executing the instructions in the memory, the processors may implement the methods provided by embodiments of the present invention, including the GIF file generation method and the GIF file-based geographic information query method. For example, according to one embodiment, the one or more processors are configured to execute the executable instructions to: receive a region query request containing a target IP address; load a GIF file, wherein the GIF file comprises: a recording area, in which each record stores an end IP address code generated based on an end IP address and the related content of the region information corresponding to the IP address interval containing that end IP address; an index area, which stores, in order of start IP address, the start IP address codes generated based on the start IP addresses together with the addresses in the recording area of the corresponding end IP address codes; and a file header, which stores the index start address and the index end address of the first and last start IP address codes in the index area; and, in response to the region query request, query the file to determine the region information to which the target IP address belongs according to the address interval in which the target IP address is located.
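Both comparison strategies mentioned above, encoding the target address or decoding the stored codes, can be sketched with Python's standard `ipaddress` module. The sketch assumes the 16-byte IPv6 code is the eight two-byte segments concatenated most significant byte first, which is what the `packed` attribute yields; the addresses from the 2001:db8::/32 documentation range and the names `encode_ipv6` and `decode_ipv6` are hypothetical.

```python
import ipaddress


def encode_ipv6(ip: str) -> bytes:
    """Encode an IPv6 address as a 16-byte address code."""
    return ipaddress.IPv6Address(ip).packed


def decode_ipv6(code: bytes) -> ipaddress.IPv6Address:
    """Decode a 16-byte address code back into an address object."""
    return ipaddress.IPv6Address(code)


start_code = encode_ipv6("2001:db8::")
end_code = encode_ipv6("2001:db8::ffff")
ip_key = "2001:db8::1a"

# Strategy 1: encode the target and compare address codes as byte strings.
in_range_encoded = start_code <= encode_ipv6(ip_key) <= end_code

# Strategy 2: decode the stored codes and compare address objects.
in_range_decoded = (
    decode_ipv6(start_code) <= ipaddress.IPv6Address(ip_key) <= decode_ipv6(end_code)
)
```

Encoding the single target address once is typically cheaper than decoding every stored code visited during the search, which may be why the main embodiment converts ip_key rather than the stored codes.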
According to the present invention, the GIF file can be stored in a local storage of the computer system, or can be called and loaded into a memory of the computer from a remote place through a network interface when the computer system is operated.
In the implementation of the present invention, the geographic information query system is not limited to a computer, but may also be, but not limited to: a server, a workstation, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet computer, a netbook, etc.).
The above description is intended to be illustrative, and not restrictive. For example, the order of execution of at least some of the method steps may be interchanged without affecting the operation or effectiveness of the invention. Although the invention has been shown and described in detail in the drawings and in the preferred embodiments, the invention is not limited to the embodiments disclosed, and it will be apparent to those skilled in the art from this disclosure that various code auditing means in the various embodiments described above can be combined to obtain further embodiments of the invention, which are also within the scope of the invention.

Claims (35)

1. A method of creating a file for a geographic information query, comprising:
acquiring a plurality of pieces of source data, wherein each source data comprises a starting IP address and a terminating IP address of an indicated address interval and region information to which the address interval belongs;
generating a file based on the plurality of pieces of source data, the file including:
a recording area for recording an end address code generated based on an end IP address of each piece of source data and related contents related to the region information contained in the piece of source data;
an index area for storing a start address code generated based on the start IP address of each piece of source data and an address of an end address code of the piece of source data in the recording area;
and the file header is used for storing the index starting address and the index ending address of the first starting address code and the last starting address code in the index area.
2. The method of claim 1, further comprising: sorting the plurality of pieces of source data according to the starting IP address,
and the index area stores the start address code of each piece of source data and the address of the corresponding end address code of the piece of source data in the recording area according to the sequence of the start address.
3. The method of claim 2, wherein the generating the file comprises:
determining the address length which can be used for addressing the file based on the data volume of the obtained multiple pieces of source data and the IP address type, and reserving two storage positions with the address length at the file header;
recording the termination address code of each piece of source data and the related content of the piece of source data in the recording area;
according to the sequencing of the initial IP addresses, storing the initial address codes of all the source data and the addresses of the corresponding termination address codes in the recording area in the index area;
and storing the addresses of the first start address code and the last start address code in the index area in two reserved storage positions of the file header.
4. The method of claim 3, wherein the start and end IP addresses are IPV6 addresses, wherein the start and end IP addresses are encoded to generate the start and end address codes, respectively, in an encoding format that:
converting each of the eight segments of each IPV6 address into a binary representation having a length of two bytes;
the eight segments of each IPV6 address in the binary representation are concatenated to form a 16-byte stream.
5. The method of claim 3, wherein the start and end IP addresses are IPV4 addresses, wherein the start and end IP addresses are encoded to generate the start and end address codes, respectively, in an encoding format that:
converting each of the 4 segments of each IPV4 address to an 8-bit binary representation;
the 4 segments of IPV4 address in the binary representation are concatenated to form a 32-bit byte stream.
6. The method of claim 3 wherein the start IP address code and the end IP address code are IPV4 or IPV6 addresses themselves.
7. The method according to one of claims 1 to 6, wherein the geographical information comprises at least a first level of geographical information and a second level of geographical information;
the related content of the region-related information comprises:
the at least first level geographic information and second level geographic information; or
A flag indicating that at least one of the at least first level geographical information and second level geographical information in a current recording in the recording area has occurred in a previous recording, the flag having one of the following values:
a first value indicating that the first level of geographic information has occurred in a previous record;
a second value indicating that both the first level of geographic information and the second level of geographic information have occurred in a previous record;
wherein when the flag has the first value, the related content of the region information includes, in addition to the flag: a first offset address, introduced by the flag, indicating the location in the recording area of a previous record in which the first level geographical information is recorded; and the second level geographical information;
wherein when the flag has the second value, the related content of the region information includes, in addition to the flag: a second offset address, introduced by the flag, indicating the location in the recording area of a previous record in which the first level geographical information and the second level geographical information are recorded.
8. The method of claim 7, wherein the first offset address and the second offset address are each one of the following:
an absolute offset relative to the file header; or
a relative offset of the previous record relative to the current record position.
9. The method of claim 7, wherein the flag, the first and second level geographic information, and the first and second offset addresses have different formats to distinguish from each other, the formats including at least one of: length, value type, and coding standard.
10. The method of claim 1, wherein
The address of the corresponding termination address code in the recording area is an absolute offset address relative to the file header;
the addresses of the first start address code and the last start address code in the index area are absolute offset addresses relative to the file header.
11. A system for creating a file for a geographic information query, comprising:
a memory having executable instructions stored thereon;
one or more processors configured to implement the method of one of claims 1-10 by executing the instructions.
12. A computer storage medium having stored thereon executable instructions that, when executed, implement the method of claims 1-10.
13. A method for querying region information based on a file, the region information being related to an address interval indicated by a start IP address and an end IP address, wherein the file comprises:
a recording area in which each record stores an end IP address code generated based on the end IP address and related contents of the region information to which the address section including the end IP address belongs;
the index area is used for storing the starting IP address code generated based on the starting IP address and the address of the ending IP address code of the corresponding ending IP address in the recording area according to the sequence of the starting IP address;
the file header is used for storing the index starting address and the index ending address of the first starting IP address code and the last starting IP address code in the index area;
the method comprises the following steps:
receiving a region inquiry request containing a target IP address;
and responding to the region inquiry request, inquiring the file to determine the region information of the target IP address according to the address interval of the target IP address.
14. The method of claim 13, wherein querying the file to determine the geographic information to which the target IP address belongs comprises:
inquiring the file to determine whether a candidate address interval containing the target IP address exists, wherein when the candidate address interval containing the target IP address exists, the region information to which the target IP address belongs is determined based on the relevant content corresponding to the termination address code of the candidate address interval read from the recording area; and when the candidate address interval does not exist, returning the query failure as the response of the request.
15. The method of claim 14, wherein querying the file to determine whether a candidate address interval containing the target IP address exists comprises:
reading the file header to extract the index starting address and the index ending address;
determining, within the range of the index area defined by the index start address and the index end address, the largest start address code that is less than or equal to the target address code of the target IP address, as the candidate start address code of the candidate address interval;
reading the address of the ending address code stored in association with the candidate starting address code in the recording area from the index area to read the candidate ending address code of the candidate address interval from the recording area based on the address;
determining whether the target IP address code is located between the candidate start address code and a candidate end address code;
wherein the candidate address interval is determined to exist if the target IP address code is located between the candidate start address code and the candidate end address code.
16. The method of claim 15 wherein said destination IP address code, start IP address code and end IP address code are said destination IP address, start IP address and end IP address themselves.
17. The method of claim 15, wherein the destination IP address, the start IP address, and the end IP address are IPV6 addresses, wherein the IP addresses are encoded to generate the address codes, respectively, in an encoding format that:
converting each of the eight segments of each IPV6 address into a binary representation having a length of two bytes;
the eight segments of each IPV6 address in the binary representation are concatenated to form a 16-byte stream.
18. The method of claim 15, wherein the destination IP address, the start IP address, and the end IP address are IPV4 addresses, wherein the IP addresses are encoded to generate the address codes, respectively, in an encoding format that:
converting each of the 4 segments of each IPV4 address to an 8-bit binary representation;
the 4 segments of IPV4 address in the binary representation are concatenated to form a 32-bit byte stream.
19. The method according to one of claims 16 to 18, wherein the geographical information comprises at least a first level of geographical information and a second level of geographical information;
wherein, based on the relevant content corresponding to the termination address code of the candidate address interval read from the recording area, determining the region information to which the target IP address belongs includes:
if the relevant content does not contain the mark, directly reading the at least first-level geographic information and the second-level geographic information from the relevant content as the region information to which the target IP address belongs;
if the related content contains a mark with a first value, reading the second-level geographic information and a first mark offset address guided by the mark from the related content, reading a first previous record indicated by the first mark offset address from the recording area to determine the first-level geographic information, and determining the read second-level geographic information and the determined first-level geographic information as the region information to which the target IP address belongs; and
and if the related content contains a mark with a second value, reading a second mark offset address guided by the mark from the related content, reading the related content in a second previous record indicated by the second mark offset address from the recording area to determine the first-level and second-level geographic information, and taking the determined first-level and second-level geographic information as the region information to which the target IP address belongs.
20. The method of claim 19, wherein reading the associated content in the second previous recording indicated by the second flag offset address from the recording area to determine the first level and second level geographical information comprises:
determining whether the related content in the second previous record contains a flag having a first value, wherein if the related content contains a flag having a first value, reading the second-level geographical information and a third offset address guided by the flag from the related content, and reading a third previous record indicated by the third offset address from the recording area to determine the first-level geographical information, and determining the read second-level geographical information and the determined first-level geographical information as the region information to which the target IP address belongs;
and if the related content does not contain the mark, directly reading the at least first-level geographic information and the second-level geographic information from the related content as the region information to which the target IP address belongs.
21. A method as claimed in claim 19 or 20, wherein the flag is recorded in a first byte of the associated content.
22. The method of claim 21, wherein the tag offset address comprises:
the absolute offset address of the previous record in the file relative to the file header; or
the relative offset address of the previous record relative to the current record location.
23. The method of claim 19 or 20, wherein said flag, said first and second levels of geographic information, and said first and second flag offset addresses have different formats to distinguish them from each other, the formats including at least one of: byte length, value, and encoding standard.
24. A computer storage medium having stored thereon executable instructions that, when executed, implement the method of any one of claims 13-23.
25. A file-based geographical information query system, the geographical information relating to an IP address interval indicated by a start IP address and an end IP address, comprising:
a memory for storing executable instructions;
one or more processors configured to execute the executable instructions to:
receiving a region inquiry request containing a target IP address;
loading a file, the file comprising: a recording area, in which each record stores an end IP address code generated based on an end IP address and related content of the region information corresponding to the IP address interval containing the end IP address; an index area for storing, in order of start IP address, the start IP address codes generated based on the start IP addresses and the addresses in the recording area of the corresponding end IP address codes; and a file header for storing the index start address and the index end address of the first start IP address code and the last start IP address code in the index area; and
in response to the region query request, querying the file to determine the region information to which the target IP address belongs according to the address interval in which the target IP address is located.
26. The system of claim 25, wherein the one or more processors, by executing the executable instructions, are further configured to:
inquiring the file to determine whether a candidate address interval containing the target IP address exists, wherein when the candidate address interval containing the target IP address exists, the region information to which the target IP address belongs is determined based on the relevant content corresponding to the termination address code of the candidate address interval read from the recording area; and when the candidate address interval does not exist, returning the query failure as the response of the request.
27. The system of claim 26, wherein the one or more processors, by executing the executable instructions, are further configured to:
read the file header to extract the index start address and the index end address;
determine, within the range of the index area defined by the index start address and the index end address, the largest start address code that is less than or equal to the target IP address code of the target IP address, as the candidate start address code of the candidate address interval;
read from the index area the address, stored in association with the candidate start address code, of the end address code in the recording area, and read the candidate end address code of the candidate address interval from the recording area based on that address; and
determine whether the target IP address code lies between the candidate start address code and the candidate end address code;
wherein the candidate address interval is determined to exist if the target IP address code lies between the candidate start address code and the candidate end address code.
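The lookup of claim 27 can be sketched as a binary search followed by a range check. This is only an illustration under stated assumptions: the index area and recording area are modelled as in-memory Python structures rather than file regions, address codes are plain integers, the interval bounds are treated as inclusive, and the names `lookup`, `index`, and `records` are hypothetical.

```python
import bisect

def lookup(index, records, target_code):
    """Return the related content for target_code, or None on query failure.

    index   : list of (start_code, record_address), sorted by start_code
    records : mapping record_address -> (end_code, related_content)
    """
    starts = [start for start, _ in index]
    # Position of the largest start code that is <= target_code.
    i = bisect.bisect_right(starts, target_code) - 1
    if i < 0:
        return None  # target lies before the first interval
    start_code, record_address = index[i]
    end_code, content = records[record_address]
    # The candidate interval exists only if the target is not past its end.
    if start_code <= target_code <= end_code:
        return content
    return None

index = [(10, 0), (50, 1), (200, 2)]
records = {0: (20, "A"), 1: (99, "B"), 2: (300, "C")}
```

Only the start codes are searched; the end code is fetched once, for the single candidate, which is why the index stores the address of the end code rather than the end code itself.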
28. The system of claim 27, wherein the target IP address code, the start IP address code, and the end IP address code are the target IP address, the start IP address, and the end IP address themselves.
29. The system of claim 27, wherein the target IP address, the start IP address, and the end IP address are IPv6 addresses, and the IP addresses are each encoded to generate the address codes in the following encoding format:
converting each of the eight segments of each IPv6 address into a binary representation two bytes long; and
concatenating the eight segments of each IPv6 address in the binary representation to form a 16-byte stream.
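A minimal sketch of the IPv6 encoding in claim 29 follows. The claim does not state a byte order; big-endian is assumed here because it makes byte-wise comparison of two codes agree with the numeric order of the addresses. The function name `encode_ipv6` is hypothetical, and Python's standard `ipaddress` module is used only to expand abbreviated notation.

```python
import ipaddress

def encode_ipv6(address):
    """Encode an IPv6 address string as a 16-byte code, two bytes per segment."""
    # .exploded expands "::" and zero-pads, giving exactly eight hex segments.
    segments = ipaddress.IPv6Address(address).exploded.split(":")
    # Assumed big-endian, so codes sort in the same order as the addresses.
    return b"".join(int(seg, 16).to_bytes(2, "big") for seg in segments)
```

Under the big-endian assumption the result is the same 16 bytes that `ipaddress.IPv6Address(address).packed` returns.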
30. The system of claim 27, wherein the target IP address, the start IP address, and the end IP address are IPv4 addresses, and the IP addresses are each encoded to generate the address codes in the following encoding format:
converting each of the four segments of each IPv4 address into an 8-bit binary representation; and
concatenating the four segments of each IPv4 address in the binary representation to form a 32-bit (4-byte) stream.
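The IPv4 encoding in claim 30 can be sketched in the same way. The segment order (most significant segment first) and the function name `encode_ipv4` are assumptions; the claim only requires four 8-bit segments concatenated into 32 bits.

```python
def encode_ipv4(address):
    """Encode a dotted-quad IPv4 string as a 4-byte code, one byte per segment."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four segments, got {len(parts)}")
    # bytes() raises ValueError for any segment outside 0..255.
    return bytes(int(part) for part in parts)
```

With the first segment stored first, comparing two codes byte by byte gives the same result as comparing the addresses numerically, so the codes can be searched directly.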
31. The system according to any of claims 25-30, wherein the geographic information comprises at least first-level geographic information and second-level geographic information;
wherein the one or more processors are further configured, by executing the executable instructions, such that determining the region information to which the target IP address belongs, based on the related content corresponding to the end address code of the candidate address interval read from the recording area, comprises:
if the related content does not contain a flag, directly reading the at least first-level geographic information and second-level geographic information from the related content as the region information to which the target IP address belongs;
if the related content contains a flag having a first value, reading from the related content the second-level geographic information and a first flag offset address introduced by the flag, reading a first previous record indicated by the first flag offset address from the recording area to determine the first-level geographic information, and taking the read second-level geographic information and the determined first-level geographic information as the region information to which the target IP address belongs; and
if the related content contains a flag having a second value, reading from the related content a second flag offset address introduced by the flag, reading the related content in a second previous record indicated by the second flag offset address from the recording area to determine the first-level and second-level geographic information, and taking the determined first-level and second-level geographic information as the region information to which the target IP address belongs.
32. The system of claim 31, wherein reading the related content in the second previous record indicated by the second flag offset address from the recording area to determine the first-level and second-level geographic information comprises:
determining whether the related content in the second previous record contains a flag having the first value, wherein, if the related content contains a flag having the first value, reading from the related content the second-level geographic information and a third flag offset address introduced by the flag, reading a third previous record indicated by the third flag offset address from the recording area to determine the first-level geographic information, and taking the read second-level geographic information and the determined first-level geographic information as the region information to which the target IP address belongs; and
if the related content does not contain a flag, directly reading the at least first-level geographic information and second-level geographic information from the related content as the region information to which the target IP address belongs.
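The flag handling of claims 31 and 32 amounts to following back-references to earlier records. The sketch below is one possible reading, not the patent's byte format: the flag values `0x01` and `0x02`, the 4-byte big-endian offset, the NUL separator between the two levels, the depth limit, and the name `resolve` are all assumptions, and the recording area is modelled as a dictionary from offset to related content.

```python
import struct

FLAG_FIRST = 0x01   # assumed: second level stored inline, first level via offset
FLAG_SECOND = 0x02  # assumed: both levels taken from the record at the offset

def resolve(records, content, depth=0):
    """Return (first_level, second_level) for one record's related content."""
    if depth > 8:
        raise ValueError("redirect chain too long")  # guard against cycles
    flag = content[0]  # the flag, if any, sits in the first byte
    if flag == FLAG_SECOND:
        (offset,) = struct.unpack(">I", content[1:5])
        # The earlier record may itself carry a first-value flag (claim 32).
        return resolve(records, records[offset], depth + 1)
    if flag == FLAG_FIRST:
        (offset,) = struct.unpack(">I", content[1:5])
        second = content[5:].decode("utf-8")
        first, _ = resolve(records, records[offset], depth + 1)
        return first, second
    # No flag: both levels are stored directly, separated by a NUL byte.
    first, second = content.decode("utf-8").split("\x00", 1)
    return first, second

records = {
    0: "China\x00Beijing".encode("utf-8"),
    20: bytes([FLAG_FIRST]) + struct.pack(">I", 0) + "Shanghai".encode("utf-8"),
    40: bytes([FLAG_SECOND]) + struct.pack(">I", 20),
}
```

Storing a short offset instead of repeating "China" in every record is what makes the redirect worthwhile; the flag bytes chosen here cannot begin printable text, which keeps flagged and unflagged content distinguishable.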
33. The system of claim 31 or 32, wherein the flag is recorded in the first byte of the related content.
34. The system of claim 33, wherein the flag offset address comprises:
an absolute offset address of the previous record in the file, relative to the file header; or
a relative offset address of the previous record, relative to the position of the current record.
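The two offset forms of claim 34 differ only in their reference point, as the short sketch below illustrates. The sign convention for the relative form (a distance counted backwards from the current record) and the function name `previous_record_position` are assumptions; the claim states only what each offset is relative to.

```python
def previous_record_position(offset, current_pos, relative):
    """Return the file position of the earlier record that a flag points to."""
    if relative:
        # Assumed convention: distance counted backwards from the current record.
        position = current_pos - offset
    else:
        # Absolute form: position counted from the start of the file (the header).
        position = offset
    if position < 0:
        raise ValueError("offset points before the start of the file")
    return position
```

A relative offset stays small for nearby records and so can be stored in fewer bytes, while an absolute offset remains valid regardless of where the referring record sits.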
35. The system of claim 31 or 32, wherein the flag, on the one hand, and the first-level and second-level geographic information or the first and second flag offset addresses, on the other hand, have different formats so as to be distinguishable from each other, the formats differing in at least one of: length, value, and encoding standard.
CN201910692124.3A 2019-07-30 2019-07-30 Storage and query method, system and medium of region information Pending CN112307138A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910692124.3A CN112307138A (en) 2019-07-30 2019-07-30 Storage and query method, system and medium of region information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910692124.3A CN112307138A (en) 2019-07-30 2019-07-30 Storage and query method, system and medium of region information

Publications (1)

Publication Number Publication Date
CN112307138A true CN112307138A (en) 2021-02-02

Family

ID=74330077

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910692124.3A Pending CN112307138A (en) 2019-07-30 2019-07-30 Storage and query method, system and medium of region information

Country Status (1)

Country Link
CN (1) CN112307138A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7889676B1 (en) * 2006-04-13 2011-02-15 Infoblox Inc. Systems and methods for storing and retrieving data
CN102469134A (en) * 2010-11-17 2012-05-23 广州欢网科技有限责任公司 IP (Internet Protocol) address search method and device
CN104202441A (en) * 2014-09-10 2014-12-10 北京国双科技有限公司 IP (internal protocol) address data processing method and device
CN104796437A (en) * 2014-01-16 2015-07-22 深圳市快播科技有限公司 Method, device and system for querying geographical location information based on Nginx
CN104935676A (en) * 2014-03-17 2015-09-23 阿里巴巴集团控股有限公司 Method and device for determining IP address fields and corresponding latitude and longitude
CN106599019A (en) * 2016-10-21 2017-04-26 东莞市大易产业链服务有限公司 Precise and highly-efficient IP address locating method
CN106777163A (en) * 2016-12-20 2017-05-31 携程旅游网络技术(上海)有限公司 IP address institute possession querying method and system based on RBTree
CN107613043A (en) * 2017-09-26 2018-01-19 小草数语(北京)科技有限公司 The regional information searching method and its device of IP address
CN107682466A (en) * 2017-09-26 2018-02-09 小草数语(北京)科技有限公司 The regional information searching method and its device of IP address
CN108875006A (en) * 2018-06-15 2018-11-23 泰康保险集团股份有限公司 Determine method and device regional belonging to IP address

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
何江;李志蜀;陈宇;: "一种基于R树空间索引技术的GIS数据索引方法", 四川大学学报(自然科学版), no. 06 *
罗望东;梁艳花;王佳;: "利用SQL区分网站域名IP地址归属的方法", 电子技术与软件工程, no. 16 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022142499A1 (en) * 2020-12-28 2022-07-07 北京锐安科技有限公司 Method and apparatus for determining region to which ip address belongs, and electronic device and storage medium
CN114201520A (en) * 2021-12-09 2022-03-18 北京航星永志科技有限公司 IP address quick retrieval method and device and electronic equipment
CN114201520B (en) * 2021-12-09 2023-04-28 北京航星永志科技有限公司 Method and device for rapidly retrieving IP address and electronic equipment
CN114285797A (en) * 2021-12-30 2022-04-05 北京天融信网络安全技术有限公司 Method and device for processing IP address and storage medium
CN114285797B (en) * 2021-12-30 2024-04-19 北京天融信网络安全技术有限公司 Processing method, device and storage medium of IP address

Similar Documents

Publication Publication Date Title
US8838551B2 (en) Multi-level database compression
JP4685348B2 (en) Efficient collating element structure for handling large numbers of characters
EP1779273B1 (en) Multi-stage query processing system and method for use with tokenspace repository
US11204905B2 (en) Trie-based indices for databases
US7917480B2 (en) Document compression system and method for use with tokenspace repository
US4955066A (en) Compressing and decompressing text files
Moffat et al. On the implementation of minimum redundancy prefix codes
CN112307138A (en) Storage and query method, system and medium of region information
CN104657362A (en) Method and device for storing and querying data
CN110990520B (en) Address coding method and device, electronic equipment and storage medium
JP4114600B2 (en) Variable length character string search device, variable length character string search method and program
RU2633178C2 (en) Method and system of database for indexing links to database documents
CN100472526C (en) Method for storing, fetching and indexing data
JPWO2014097359A1 (en) Compression program, compression device, decompression program, and decompression device
CN111190896B (en) Data processing method, device, storage medium and computer equipment
US8463759B2 (en) Method and system for compressing data
Cannane et al. General‐purpose compression for efficient retrieval
JPH0869476A (en) Retrieval system
CN108090034B (en) Cluster-based uniform document code coding generation method and system
JP2011175231A (en) Map data
CN107832345A (en) The method of base station data unique numberization mark
CN110941730B (en) Retrieval method and device based on human face feature data migration
JP2005004560A (en) Method for creating inverted file
WO2021056167A1 (en) Information encoding method and apparatus, information decoding method and apparatus, storage medium, and information storage and interpretation method
CN111538796A (en) Address normalization processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination