CN108737592A - A method for verifying the accuracy of an IP address resource library - Google Patents

A method for verifying the accuracy of an IP address resource library

Info

Publication number
CN108737592A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810678523.XA
Other languages
Chinese (zh)
Inventor
刘晓光
汪志武
赵子毅
温伟球
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wisdom Cloud Technology Co Ltd
Original Assignee
Beijing Wisdom Cloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wisdom Cloud Technology Co Ltd
Priority to CN201810678523.XA
Publication of CN108737592A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2101/00 Indexing scheme associated with group H04L61/00
    • H04L2101/60 Types of network addresses
    • H04L2101/69 Types of network addresses using geographic information, e.g. room number

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a method for verifying the accuracy of an IP address resource library. The IP address resource library is divided into multiple IP address libraries, the data in each library is normalized to a unified format, and the union of their IP address segments is used as the baseline set of IP addresses for comparison. The coverage rate and coincidence rate of the libraries at different levels are then counted, stored, and summarized, and the relative reliability of the libraries is assessed. The method reveals how the coincidence rate of different IP address libraries varies across levels and how the coincidence rate relates to the credibility of an address library. Even when no fully trustworthy IP address library is available, multiple libraries can be compared along dimensions such as coverage rate and coincidence rate, and the differences and patterns among them can be identified. The method thus offers a reference approach for finding relatively trustworthy IP address libraries and improving the accuracy of IP positioning based on them.

Description

A method for verifying the accuracy of an IP address resource library
Technical field
The invention relates to data assessment techniques for databases, and in particular to a method for verifying the accuracy of an IP address resource library.
Background art
An IP address database (also called an IP library) is compiled by professional technicians over a long period using a variety of technical means, and is updated, maintained, and supplemented by professionals over the long term. An IP library holds a large number of IP addresses so that users can query them. For example, given an IP address, one can find which province, city, street, or even Internet cafe the computer is located in; conversely, given a place, one can look up which IP segments it has.
When a user enters a domain name to access a website, the domain name is resolved to an IP address by a server, and the website content is ultimately retrieved from that IP address. Converting a domain name into an IP address is accomplished by DNS resolution. Typically, one domain name can resolve to one or more IP addresses, and multiple IP addresses form an IP address resource library. In CDN services, a domain name must be resolved to the IP address with the best service quality according to the service quality of the CDN nodes, so the accuracy of the IP address library directly affects the quality of the service the CDN provides.
In general, each record in an IP address library contains an IP address segment, and each segment is associated with other information used for positioning, including country code, city, latitude and longitude coordinates, and postal code. There are currently many independent IP address libraries in China, and they suffer mainly from the following problems: the geographic location information recorded for a given IP address is rarely consistent across databases; the number of cities covered and the degree of finer-grained coverage differ considerably between libraries; and the geographic information does not match between data sets while IP address segments are divided differently. These problems substantially affect the accuracy of IP address libraries.
Methods for verifying the accuracy of IP address libraries already exist in China. A common one uses simple random sampling to verify the reliability of an IP geolocation mapping database, but this method uses too few samples and its coverage is insufficient, so its verification results are not particularly reliable.
Summary of the invention
In view of the defects and shortcomings of the prior art, the invention discloses a method for verifying the accuracy of an IP address resource library. The method compares IP address libraries of various kinds in order to find, in bulk, the differences between the libraries and the patterns they exhibit, and thereby to verify the accuracy of each IP address library.
The invention is achieved by the following technical solution:
A method for verifying the accuracy of an IP address resource library, in which the IP address resource library is divided into multiple IP address libraries and the data in each library is normalized to a unified format. Specifically, IP address blocks are split into IP address segments with a 24-bit prefix, and the geographic location information mapped to each IP address block is labeled with the administrative division code at or above the county level. The union of the IP address segments of these libraries is then taken and used as the baseline set of IP addresses for comparison. Finally, the coverage rate and coincidence rate of the libraries at different levels are counted, stored, and summarized, and the relative reliability of the libraries is assessed.
In the above technical solution, after the IP address segments of the multiple databases have been merged, the IP address information of each database is compared with the baseline IP address data to obtain the coverage rate of each database at the three levels of country, province, and city, and the relative reliability of the databases is assessed.
In the above technical solution, comparing the IP address information of each database with the baseline IP address data and assessing the relative reliability of the databases means generating coverage rates from the intersection and coincidence relationships between different IP address libraries and using them to analyze the reliability of the multiple databases. The lower the level used for comparison, the lower the proportion on which the address libraries coincide, and the lower the credibility of the address libraries.
As a preferred option of the above technical solution, the IP address information of each database is compared with the baseline IP address data and the relative reliability of the databases is assessed as follows: any three libraries are taken from the multiple databases for comparison; or,
any two libraries are taken from the multiple databases for comparison, i.e., at each level, the proportion of the difference obtained by subtracting the other two IP address libraries from the intersection of any two IP address libraries; or,
the credibility of any one IP address library is assumed to be true, the credibility of the other three IP address libraries under that assumption is obtained from the pairwise coinciding data, and the credibility values of each IP address library are then summed to give the total credibility of each library.
The beneficial effects of the invention are as follows:
The invention relates to a method for verifying the accuracy of an IP address resource library. It reveals how the coincidence rate of different IP address libraries varies across levels and how the coincidence rate relates to the credibility of an address library. When no fully trustworthy IP address library is available, the invention provides a detection method that compares multiple IP address libraries along dimensions such as coverage rate and coincidence rate and identifies the differences and patterns among them: the finer the granularity at which the geographic information in the libraries is divided, the greater the differences between the libraries' data and the lower their credibility. The method offers a reference approach for finding relatively trustworthy IP address libraries and improving the accuracy of IP positioning based on them.
Description of the drawings
In order to explain the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the coverage rate of multiple IP address libraries at different levels, for the method for verifying the accuracy of an IP address resource library provided by Embodiment 1 of the invention;
Fig. 2 is a schematic diagram of the coincidence rate of multiple IP address libraries at different levels, for the method provided by Embodiment 1 of the invention;
Fig. 3 is a schematic diagram of the coverage rate of multiple IP address libraries at different levels, for the method provided by Embodiment 2 of the invention;
Fig. 4 is a schematic diagram of the coverage rate difference of multiple IP address libraries at different levels, for the method provided by Embodiment 3 of the invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are evidently some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
Embodiment 1
This embodiment takes four IP address libraries, A, B, C, and D, as an example and compares their accuracy. The comparison method is as follows:
First, the data in the four libraries A, B, C, and D is normalized to a unified format so that the four libraries can be compared conveniently: IP address blocks are split into IP address segments with a 24-bit prefix, and the geographic location information mapped to each block is labeled with the administrative division code at or above the county level. The union of the IP address segments of these libraries is then taken and used as the baseline set of IP addresses for comparison. Finally, the coverage rate and coincidence rate of the libraries at the country, province, and city levels are counted separately, and the relative reliability of the libraries is assessed.
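The normalization and union steps described above might look like the following in Python. This is a minimal sketch, not the patent's implementation: the function names `to_slash24` and `normalize`, the `(country, province, city)` label tuple, the sample records, and the choice to widen blocks narrower than /24 to their enclosing /24 are all assumptions made for illustration.

```python
import ipaddress


def to_slash24(cidr):
    """Return the set of /24 networks (as strings) touched by a CIDR block."""
    net = ipaddress.ip_network(cidr, strict=False)
    if net.prefixlen < 24:
        # Wider block: split it into its /24 subnets.
        return {str(s) for s in net.subnets(new_prefix=24)}
    if net.prefixlen > 24:
        # Narrower block: widen it to the enclosing /24 (an assumption).
        return {str(net.supernet(new_prefix=24))}
    return {str(net)}


def normalize(records):
    """Map each /24 segment to its (country, province, city) label."""
    table = {}
    for cidr, label in records:
        for segment in to_slash24(cidr):
            table[segment] = label  # later records overwrite earlier ones
    return table


# Hypothetical sample records for two libraries.
lib_a = normalize([("10.0.0.0/23", ("CN", "Beijing", "Beijing"))])
lib_b = normalize([
    ("10.0.1.128/25", ("CN", "Hebei", "Langfang")),
    ("10.0.2.0/24", ("CN", "Hebei", None)),
])

# The union of all /24 segments serves as the comparison baseline.
baseline = set(lib_a) | set(lib_b)
print(sorted(baseline))
```

Keying every library by the same /24 strings is what makes the later set operations (union, intersection, difference) straightforward.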
After the IP address segments of the four databases have been merged, the IP address information of each database is compared with the baseline IP address data to obtain the coverage rate of each database at the three levels of country, province, and city.
As shown in Fig. 1, the abscissa lists the four libraries A, B, C, and D from left to right, and the ordinate is the coverage rate. Fig. 1 shows that library B has the highest coverage rate at the country level, while library A has the highest coverage rate at the province and city levels. Library D has the lowest coverage rate at all three levels. In terms of coverage rate, therefore, library A is the best and library D the worst.
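The source does not give a formula for coverage rate. One plausible reading, sketched below, is the share of baseline /24 segments for which a library supplies a label at the given level. The `coverage` function, the `LEVELS` tuple, and the sample data are hypothetical.

```python
LEVELS = ("country", "province", "city")


def coverage(library, baseline):
    """Per-level share of baseline /24 segments that the library labels."""
    rates = {}
    for index, level in enumerate(LEVELS):
        covered = sum(
            1
            for segment in baseline
            if segment in library and library[segment][index] is not None
        )
        rates[level] = covered / len(baseline)
    return rates


# Hypothetical libraries keyed by /24 segment; None marks a missing field.
lib_a = {
    "10.0.0.0/24": ("CN", "Beijing", "Beijing"),
    "10.0.1.0/24": ("CN", "Hebei", None),
}
lib_b = {
    "10.0.1.0/24": ("CN", None, None),
    "10.0.2.0/24": ("CN", "Hebei", "Langfang"),
    "10.0.3.0/24": ("CN", "Hebei", None),
}
baseline = set(lib_a) | set(lib_b)

for name, lib in (("A", lib_a), ("B", lib_b)):
    print(name, coverage(lib, baseline))
```

Under this definition a library's coverage can only stay level or fall as the level gets finer, which matches the trend the embodiment reports.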
This embodiment also reveals how the country-level, province-level, and city-level coincidence rates of the different IP address libraries compare, and how the coincidence rate relates to the credibility of an address library. As shown in Fig. 2, the abscissa lists the country, province, and city levels from left to right, and the ordinate is the coincidence rate of the IP address libraries at each level. Fig. 2 shows that the country-level, province-level, and city-level coincidence rates of the four IP address libraries decrease in that order. Overall, the lower the level used for comparison, the lower the proportion on which the four address libraries coincide, and the lower the credibility of the four address libraries.
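A coincidence rate across all libraries can be sketched as the share of baseline segments on which every library gives the same label down to the chosen level. This definition, the helper `label_at`, and the sample data are assumptions; the source does not state the exact formula.

```python
LEVELS = ("country", "province", "city")


def label_at(library, segment, depth):
    """Return the first `depth` label fields, or None if any is missing."""
    label = library.get(segment)
    if label is None or None in label[:depth]:
        return None
    return label[:depth]


def coincidence(libraries, baseline):
    """Per-level share of baseline segments on which all libraries agree."""
    rates = {}
    for depth, level in enumerate(LEVELS, start=1):
        agree = 0
        for segment in baseline:
            labels = {label_at(lib, segment, depth) for lib in libraries}
            if len(labels) == 1 and None not in labels:
                agree += 1
        rates[level] = agree / len(baseline)
    return rates


# Hypothetical data: three libraries over four /24 segments.
lib_a = {
    "10.0.0.0/24": ("CN", "Beijing", "Beijing"),
    "10.0.1.0/24": ("CN", "Hebei", "Langfang"),
    "10.0.2.0/24": ("CN", "Hebei", "Baoding"),
    "10.0.3.0/24": ("CN", "Henan", "Luoyang"),
}
lib_b = {
    "10.0.0.0/24": ("CN", "Beijing", "Beijing"),
    "10.0.1.0/24": ("CN", "Hebei", "Baoding"),
    "10.0.2.0/24": ("CN", "Shandong", "Jinan"),
    "10.0.3.0/24": ("US", "California", "San Jose"),
}
lib_c = {
    "10.0.0.0/24": ("CN", "Beijing", "Beijing"),
    "10.0.1.0/24": ("CN", "Hebei", "Langfang"),
    "10.0.2.0/24": ("CN", "Hebei", "Baoding"),
}
baseline = set(lib_a) | set(lib_b) | set(lib_c)
rates = coincidence([lib_a, lib_b, lib_c], baseline)
print(rates)
```

Because agreement at the city level requires agreement at the province and country levels too, the rate computed this way cannot increase as the level gets finer.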
Embodiment 2
The invention also provides a method of analyzing database reliability through the intersection and coincidence relationships between different IP address libraries. Any three of the four libraries A, B, C, and D are taken for comparison, i.e., at each level, the proportion of the difference obtained by subtracting the remaining IP address library from the intersection of any three IP address libraries; see Fig. 3. There are four cases in total.
As shown in Fig. 3, the abscissa lists, from left to right, the combinations of three libraries taken from A, B, C, and D, and the ordinate is the proportion of the difference.
It can be seen that at the country, province, and city levels, the coincidence rate of libraries A, B, and C is relatively high; that is, among the four IP address libraries these three are the most similar, and library D differs most from the other three in its geographic information statistics. Meanwhile, across the country, province, and city levels, the coincidence rate of libraries A, B, and C declines steadily, while the coincidence rate of the three libraries in each of the other three cases rises. This also shows that the lower the level used for comparison, the lower the proportion on which the four databases coincide, and the poorer the credibility of the four databases.
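The phrase "proportion of the difference" is ambiguous in the source. The sketch below takes one possible reading: for each of the four triples, count the baseline segments on which the three libraries agree but the fourth does not, and divide by the baseline size. The function names and sample data are hypothetical, and the real method may define the quantity differently.

```python
from itertools import combinations

MISSING = (None, None, None)


def agreeing(libraries, names, baseline, depth):
    """Segments on which all named libraries give the same label prefix."""
    result = set()
    for segment in baseline:
        labels = {libraries[n].get(segment, MISSING)[:depth] for n in names}
        if len(labels) == 1 and None not in next(iter(labels)):
            result.add(segment)
    return result


def triple_vs_fourth(libraries, baseline, depth):
    """Share of segments where a triple agrees but the fourth library does not."""
    names = sorted(libraries)
    all_agree = agreeing(libraries, names, baseline, depth)
    shares = {}
    for triple in combinations(names, 3):
        diff = agreeing(libraries, triple, baseline, depth) - all_agree
        shares["".join(triple)] = len(diff) / len(baseline)
    return shares


# Hypothetical labels and libraries over four /24 segments.
BJ = ("CN", "Beijing", "Beijing")
LF = ("CN", "Hebei", "Langfang")
BD = ("CN", "Hebei", "Baoding")
JN = ("CN", "Shandong", "Jinan")
SZ = ("CN", "Guangdong", "Shenzhen")
GZ = ("CN", "Guangdong", "Guangzhou")
S1, S2, S3, S4 = "10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"
libraries = {
    "A": {S1: BJ, S2: LF, S3: JN, S4: SZ},
    "B": {S1: BJ, S2: LF, S3: JN, S4: SZ},
    "C": {S1: BJ, S2: LF, S3: JN, S4: GZ},
    "D": {S1: BJ, S2: BD, S4: SZ},  # S3 is absent from library D
}
baseline = {S1, S2, S3, S4}
city_shares = triple_vs_fourth(libraries, baseline, depth=3)
print(city_shares)
```

In this toy data the triple ABC has the largest share, which is the pattern the embodiment reads as library D being the outlier.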
Embodiment 3
If any two of the four address libraries are taken for comparison, i.e., at each level, the proportion of the difference obtained by subtracting the other two IP address libraries from the intersection of any two IP address libraries, there are six cases in total; see Fig. 4.
As shown in Fig. 4, the abscissa lists, from left to right, the combinations in which two libraries taken from A, B, C, and D are compared with the other two, and the ordinate is the proportion of the difference.
Libraries A and B have the highest coincidence rate at the country level, while libraries C and B have the highest coincidence rate at the province and city levels. Libraries D and B have the lowest coincidence rate at all three levels. In addition, across the six cases, the coincidence rate when any of libraries A, B, and C is compared with D is lower than the coincidence rate among A, B, and C themselves. This again shows that library D differs considerably from the other three in its geographic information statistics, whereas libraries A, B, and C coincide with one another more closely.
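The six pairwise cases can be enumerated with `itertools.combinations`. The sketch below computes a plain pairwise coincidence rate (share of baseline segments on which two libraries give the same label at the chosen level), which is a simplification of the "pair versus the other two" comparison in the text; the function name and sample data are hypothetical.

```python
from itertools import combinations


def pairwise_coincidence(libraries, baseline, depth):
    """Share of baseline segments on which each pair of libraries agrees."""
    rates = {}
    for x, y in combinations(sorted(libraries), 2):
        same = 0
        for segment in baseline:
            label_x = libraries[x].get(segment)
            label_y = libraries[y].get(segment)
            if (
                label_x is not None
                and label_y is not None
                and label_x[:depth] == label_y[:depth]
            ):
                same += 1
        rates[x + y] = same / len(baseline)
    return rates


# Hypothetical labels and libraries over four /24 segments.
BJ = ("CN", "Beijing", "Beijing")
LF = ("CN", "Hebei", "Langfang")
BD = ("CN", "Hebei", "Baoding")
JN = ("CN", "Shandong", "Jinan")
QD = ("CN", "Shandong", "Qingdao")
SZ = ("CN", "Guangdong", "Shenzhen")
GZ = ("CN", "Guangdong", "Guangzhou")
S1, S2, S3, S4 = "10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"
libraries = {
    "A": {S1: BJ, S2: LF, S3: JN, S4: SZ},
    "B": {S1: BJ, S2: LF, S3: JN, S4: SZ},
    "C": {S1: BJ, S2: LF, S3: QD, S4: GZ},
    "D": {S1: BJ, S2: BD, S4: SZ},  # S3 is absent from library D
}
baseline = {S1, S2, S3, S4}

city = pairwise_coincidence(libraries, baseline, depth=3)
print(city)
print("least similar pair:", min(city, key=city.get))
```

A matrix of such pairwise rates is also the kind of input that the credibility summation in Embodiment 4 works from.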
Embodiment 4
The invention also provides a method for judging the credibility of an address library. The method assumes that the credibility of any one IP address library is true; from the pairwise coinciding data, the credibility of the other three IP address libraries under that assumption can then be obtained. Summing the credibility values of each IP address library over the four hypotheses gives the total credibility of each library.
Table 1 below gives the credibility of each IP address library at the city level under each of the four hypotheses. The values on the diagonal of the table are the assumed true values under the four hypotheses, i.e., the coverage rate of each IP address library at the city level; the other values are the credibility of the other three IP address libraries under the current hypothesis.
Summing the credibility values of each IP address library over the four hypotheses gives city-level credibility values for libraries A, B, C, and D of 243.80%, 207.08%, 201.97%, and 231.14% respectively. At the city level, therefore, library A has the highest credibility and library D the lowest. The credibility ranking of libraries A, B, C, and D at the country and province levels can be obtained in the same way.
Table 1: City-level data credibility (%) of each IP address library under the four hypotheses
IP address library    A        B        C        D
A                     60.04    43.00    51.51    52.53
B                     43.00    60.33    50.01    48.63
C                     51.51    50.01    81.89    60.39
D                     52.53    48.63    60.39    69.59
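The summation step can be checked directly against Table 1. The sketch below stores the table as a nested dictionary (rows are hypotheses, columns are libraries; the variable names are illustrative) and adds up each library's column. Note that with the row and column labels as printed, the sums come out as A = 207.08, B = 201.97, C = 243.80, D = 231.14: the same four values the text reports, but attached to different libraries. The sketch does not attempt to decide which labelling is the intended one.

```python
# Table 1 as printed: TABLE[hypothesis][library] = credibility in percent.
TABLE = {
    "A": {"A": 60.04, "B": 43.00, "C": 51.51, "D": 52.53},
    "B": {"A": 43.00, "B": 60.33, "C": 50.01, "D": 48.63},
    "C": {"A": 51.51, "B": 50.01, "C": 81.89, "D": 60.39},
    "D": {"A": 52.53, "B": 48.63, "C": 60.39, "D": 69.59},
}

# Total credibility of a library: its column summed over the four hypotheses.
totals = {
    library: sum(TABLE[hypothesis][library] for hypothesis in TABLE)
    for library in TABLE
}

for library, total in totals.items():
    print(f"{library}: {total:.2f}%")
```

Because the table is symmetric, summing rows instead of columns gives the same totals.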
By establishing an IP address baseline and comparing how the four IP address libraries coincide, it is found that the four libraries differ greatly, with a maximum difference in coverage rate of nearly 30%. The lower the level compared, the lower both the coverage rate and the coincidence rate of the IP address libraries, and the credibility of the data declines with them. In addition, among the four mainstream domestic IP address libraries, the data quality of library D is lower than that of the other three in terms of both coverage rate and coincidence rate. Therefore, absent evidence that the other three IP address libraries borrow from one another, the combined coverage and coincidence performance of the four libraries indicates that library D has the lowest credibility and library A the highest.
The above embodiments are intended only to illustrate the technical solutions of the invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the invention.

Claims (4)

1. A method for verifying the accuracy of an IP address resource library, characterized in that: the IP address resource library is divided into multiple IP address libraries, and the data in each IP address library is normalized to a unified format; specifically, IP address blocks are split into IP address segments with a 24-bit prefix, and the geographic location information mapped to each IP address block is labeled with the administrative division code at or above the county level; the union of the IP address segments of these IP address libraries is then taken and used as the baseline set of IP addresses for comparison; finally, the coverage rate and coincidence rate of the libraries at different levels are counted, stored, and summarized, and the relative reliability of the libraries is assessed.
2. The method for verifying the accuracy of an IP address resource library according to claim 1, characterized in that: after the IP address segments of the multiple databases have been merged, the IP address information of each database is compared with the baseline IP address data to obtain the coverage rate of each database at the three levels of country, province, and city, and the relative reliability of the databases is assessed.
3. The method for verifying the accuracy of an IP address resource library according to claim 2, characterized in that: comparing the IP address information of each database with the baseline IP address data and assessing the relative reliability of the databases means generating coverage rates from the intersection and coincidence relationships between different IP address libraries and using them to analyze the reliability of the multiple databases; the lower the level used for comparison, the lower the proportion on which the address libraries coincide, and the lower the credibility of the address libraries.
4. The method for verifying the accuracy of an IP address resource library according to claim 2, characterized in that: the IP address information of each database is compared with the baseline IP address data and the relative reliability of the databases is assessed as follows: any three libraries are taken from the multiple databases for comparison; or, any two libraries are taken from the multiple databases for comparison, i.e., at each level, the proportion of the difference obtained by subtracting the other two IP address libraries from the intersection of any two IP address libraries; or, the credibility of any one IP address library is assumed to be true, the credibility of the other three IP address libraries under that assumption is obtained from the pairwise coinciding data, and the credibility values of each IP address library are then summed to give the total credibility of each library.
Priority Applications (1)

Application Number Publication Priority Date Filing Date Title
CN201810678523.XA CN108737592A (en) 2018-06-27 2018-06-27 A method for verifying the accuracy of an IP address resource library

Publications (1)

Publication Number Publication Date
CN108737592A (en) 2018-11-02

Family

ID=63931189

Family Applications (1)

Application Number Status Publication Priority Date Filing Date Title
CN201810678523.XA Pending CN108737592A (en) 2018-06-27 2018-06-27 A method for verifying the accuracy of an IP address resource library

Country Status (1)

Country Link
CN (1) CN108737592A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110661901A (en) * 2019-08-08 2020-01-07 网宿科技股份有限公司 Method for collecting and integrating IP library, electronic equipment and storage medium
CN110661901B (en) * 2019-08-08 2022-11-04 网宿科技股份有限公司 Letter collecting method, integration method, electronic equipment and storable medium of IP library

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181102