CN109543118A - Web terrestrial reference reliability estimation method and device based on multilevel policy decision - Google Patents

Publication number
CN109543118A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811338745.3A
Other languages
Chinese (zh)
Other versions
CN109543118B (en
Inventor
尹美娟
杨文�
刘晓楠
陈静
罗向阳
孙志豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Engineering University of PLA Strategic Support Force
Original Assignee
Information Engineering University of PLA Strategic Support Force
Application filed by Information Engineering University of PLA Strategic Support Force
Priority to CN201811338745.3A
Publication of CN109543118A
Application granted
Publication of CN109543118B

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention belongs to the technical field of network security applications and in particular relates to a Web landmark reliability evaluation method and device based on multi-level decision. The method comprises: resolving the IP addresses of candidate landmarks; filtering the resolved candidate landmarks with filters to delete invalid data; and evaluating the filtered candidate landmarks to obtain their confidence scores. Without relying on path probing, the invention makes full use of open Internet services to filter invalid Web landmarks by category according to their differing characteristics and to quantitatively evaluate landmark confidence. It achieves automated evaluation of large-scale Web landmarks and addresses the low accuracy, low efficiency, and lack of automated operation of current methods. It filters out invalid landmarks, quantifies the confidence of valid ones, and improves both the accuracy of landmark acquisition and the accuracy of positioning results, which is significant for techniques that accurately acquire network-server-type entity landmarks.

Description

Web landmark ("terrestrial reference") reliability evaluation method and device based on multi-level decision
Technical field
The invention belongs to the technical field of network security applications and in particular relates to a Web landmark reliability evaluation method and device based on multi-level decision.
Background art
Network entity geolocation, also called IP geolocation, is a technique for determining the geographic location of a network entity from its IP address. It is widely used in targeted advertising, region-based content customization, and other areas. Landmark-based positioning is widely used because of its relatively high accuracy and reliability. A large number of high-density, high-precision network landmarks is the key foundation of IP geolocation, and landmark stability directly affects positioning results. Web servers are found throughout the world, are widely distributed and numerous, and have a relatively fixed relationship between IP address and geographic location, which makes them an ideal choice for network landmarks; such landmarks are called "Web landmarks". Existing Web landmark mining methods take the geographic location of the organization that owns a Web site as the location of the landmark. However, because of the widespread use of server hosting, shared hosts, and CDNs, and especially the rapid growth of cloud services in recent years, the location a Web landmark provides may not be the actual location of the Web server, so it cannot effectively support IP geolocation. An effective algorithm is therefore needed to evaluate the reliability of the geographic location information of Web landmarks.
Current methods for evaluating Web landmarks mainly include the following. The method based on homepage redirection, referred to as LVM, filters CDNs and shared hosts by checking homepage redirection and comparing postal-code region information. It can filter out some invalid landmarks, but a landmark that is not redirected is not necessarily reliable, and many Web sites do not support access by IP address, so the Web landmarks it yields have low accuracy. Because it must access the Web page of each landmark twice and decide whether a redirect occurred, evaluation is slow and the method is unsuitable for large-scale landmark evaluation. The street-level landmark evaluation method based on the nearest common router, referred to as SLE, groups landmarks by access router and evaluates their reliability according to whether the landmarks in a group satisfy the constraint between network delay and geographic distance. It greatly improves landmark accuracy, but it requires that landmarks be reachable by probing and that access routers not be anonymous, and it needs repeated path and delay measurements, so it cannot run automatically and is unsuitable for large-scale landmark evaluation.
Summary of the invention
To this end, the invention provides a Web landmark reliability evaluation method and device based on multi-level decision, addressing the low accuracy, low efficiency, and lack of automated operation of current landmark acquisition.
According to the design provided by the invention, a Web landmark reliability evaluation method based on multi-level decision comprises the following:
resolving the IP addresses of candidate landmarks;
filtering the resolved candidate landmarks with filters to delete invalid data;
evaluating the filtered candidate landmarks to obtain their confidence scores.
In the above, resolving the candidate landmarks comprises the following steps:
grouping the candidate landmarks by domain name and deleting non-standard data, where non-standard data includes candidate landmarks that provide no domain name or whose domain name is malformed;
performing holographic DNS resolution: querying the domain name separately on multiple DNS servers distributed around the world and merging the records returned by each server to generate the list of IP addresses the domain name maps to;
for the IP address list of a domain name, if the list contains only one IP address, assigning that address to the candidate landmark of the domain name; if the list contains n IP addresses, making n copies of the candidate landmark and assigning one address from the list to each copy, where n is an integer greater than 1.
In the above, filtering the resolved candidate landmarks with filters comprises the following:
The resolved candidate landmarks are grouped and filtered first by domain name and then by IP address. For each candidate landmark retained after filtering, HTTP requests are sent to the domain name given in the Web landmark and to its IP address; landmarks whose two responses are inconsistent are filtered out, and the initial confidence value of the candidate landmark is set according to the responses.
Preferably, grouping and filtering the resolved candidate landmarks by domain name and then by IP address comprises the following. First, the candidate landmarks are grouped by domain name, the declared positions in each group are extracted, the distribution radius of the declared positions is obtained, and candidate landmark groups whose distribution radius exceeds a preset value are deleted. Then, the candidate landmarks are grouped by IP address, the domain name list of each IP address group is extracted, subdomains of the same Web site are merged, the domain names are counted, and candidate landmark groups whose domain name count is not 1 are deleted. Finally, each candidate landmark is traversed and, based on the IP addresses obtained by resolution, candidate landmarks whose IP addresses fall in two or more /24 network segments are deleted.
Preferably, evaluating the filtered candidate landmarks comprises the following:
traversing each candidate landmark and determining the number of domain names hosted on its IP address; correcting the initial confidence value of the candidate landmark according to that number;
adjusting the corrected confidence by comparing the Whois registration information of the Web landmark and of its IP address, and by comparing the country, province, and city extracted from the candidate landmark against a third-party library; and writing the adjusted confidence value to the candidate landmark.
Preferably, adjusting the corrected confidence comprises the following: comparing the Whois registration information of the Web landmark and of its IP address to obtain the similarity of the information, and applying a weighted adjustment to the confidence according to that similarity; then matching the country, province, and city extracted from the candidate landmark against a third-party library and adjusting the confidence value according to the degree of match.
Further, the Whois registration information includes at least the organization name, the administrative region, and the contact details.
Preferably, traversing each candidate landmark and determining the number of domain names hosted on its IP address comprises the following:
traversing each candidate landmark and obtaining its IP address;
querying multiple reverse-lookup sites for the domain names hosted on the IP address and fusing the resulting domain name lists; performing a holographic DNS query on each domain name in the fused list to obtain its IP address list, and deleting from the list the domain names whose IP address lists do not contain the IP address of the candidate landmark, to obtain the list of domain names hosted on that IP address;
merging subdomains of the same Web site in the domain name list and counting the total number of domain names.
A Web landmark reliability evaluation device based on multi-level decision comprises a parsing module, a filtering module, and an evaluation module, wherein:
the parsing module is configured to resolve the IP addresses of candidate landmarks;
the filtering module is configured to filter the resolved candidate landmarks with filters and delete invalid data;
the evaluation module is configured to evaluate the filtered candidate landmarks and obtain their confidence scores.
In the above device, the filtering module comprises filter submodule one, filter submodule two, and an initial value acquisition submodule, wherein:
filter submodule one is configured to group the resolved candidate landmarks by domain name, extract the declared positions in each group, obtain the distribution radius of the declared positions, and delete candidate landmark groups whose distribution radius exceeds a preset value;
filter submodule two is configured to group the candidate landmarks by IP address, extract the domain name list of each IP address group, merge subdomains of the same Web site, count the domain names, and delete candidate landmark groups whose domain name count is not 1; and to traverse each candidate landmark and, based on the IP addresses obtained by resolution, delete candidate landmarks whose IP addresses fall in two or more network segments;
the initial value acquisition submodule is configured, for each candidate landmark retained after filtering, to send HTTP requests to the domain name given in the Web landmark and to its IP address, filter out landmarks whose responses are inconsistent, and set the initial confidence value of the candidate landmark according to the responses.
Beneficial effects of the invention:
Without relying on path probing, the invention makes full use of open Internet services to filter invalid Web landmarks by category according to their differing characteristics and to quantitatively evaluate landmark confidence. It achieves automated evaluation of large-scale Web landmarks and addresses the low accuracy, low efficiency, and lack of automated operation of current methods. Further, based on the mapping characteristics between the domain names and IP addresses of invalid candidate landmarks, it filters out candidate landmarks that use shared hosts, CDNs, and cloud servers. It also combines homepage redirection, Whois services, and third-party IP libraries to further infer the reliability of landmark information and to quantify landmark confidence. It thus filters out invalid landmarks and quantifies the confidence of valid ones, solves the problem of automated quantitative evaluation of large-scale Web landmarks, and improves the accuracy of landmark acquisition and of positioning results, which is significant for techniques that accurately acquire network-server-type entity landmarks.
Description of the drawings:
Fig. 1 is a flow diagram of the evaluation method in the embodiment;
Fig. 2 is a flow diagram of the resolution process in the embodiment;
Fig. 3 is a flow diagram of the candidate landmark evaluation sub-process in the embodiment;
Fig. 4 is a flow diagram of counting the total number of domain names in the embodiment;
Fig. 5 is a schematic diagram of the evaluation device in the embodiment;
Fig. 6 is a schematic diagram of the filtering module in the embodiment;
Fig. 7 is a schematic diagram of the evaluation framework in the embodiment;
Fig. 8 is a schematic diagram of positioning accuracy verification on 146 IP addresses in Zhengzhou in the embodiment;
Fig. 9 is a schematic diagram of positioning accuracy verification on 119 IP addresses in Beijing in the embodiment.
Detailed description of the embodiments:
To make the purpose, technical solution, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and the technical solution.
At present, Web landmark reliability evaluation suffers from low accuracy and slow evaluation and is unsuitable for large-scale landmark evaluation. An embodiment of the invention, shown in Fig. 1, provides a Web landmark reliability evaluation method based on multi-level decision, comprising the following:
S101, resolving the IP addresses of candidate landmarks;
S102, filtering the resolved candidate landmarks with filters to delete invalid data;
S103, evaluating the filtered candidate landmarks to obtain their confidence scores.
Without relying on path probing, the method makes full use of open Internet services to filter invalid Web landmarks by category according to their differing characteristics and to quantitatively evaluate landmark confidence. It achieves automated evaluation of large-scale Web landmarks and addresses the low accuracy, low efficiency, and lack of automated operation of current methods.
For the preprocessing step that resolves candidate landmarks, a further embodiment of the invention, shown in Fig. 2, resolves the candidate landmarks in the following steps:
S1001, grouping the candidate landmarks by domain name and deleting non-standard data, where non-standard data includes candidate landmarks that provide no domain name or whose domain name is malformed;
S1002, querying the domain name separately on multiple DNS servers distributed around the world and merging the records returned by each server to generate the list of IP addresses the domain name maps to;
S1003, for the IP address list of a domain name, if the list contains only one IP address, assigning that address to the candidate landmark of the domain name; if the list contains n IP addresses, making n copies of the candidate landmark and assigning one address from the list to each copy, where n is an integer greater than 1.
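Steps S1002 and S1003 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Candidate` record, its field names, and the resolver labels are hypothetical, and the DNS answers are passed in as plain data so that no network access is needed.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate-landmark record; field names are illustrative."""
    domain: str
    declared_location: str
    ip: Optional[str] = None


def merge_a_records(answers_by_resolver: Dict[str, Iterable[str]]) -> List[str]:
    """Union the addresses returned by every resolver, keeping first-seen order."""
    merged: List[str] = []
    seen = set()
    for answers in answers_by_resolver.values():
        for ip in answers:
            if ip not in seen:
                seen.add(ip)
                merged.append(ip)
    return merged


def expand_candidate(candidate: Candidate, ips: List[str]) -> List[Candidate]:
    """Return one copy of the candidate per resolved IP address (step S1003)."""
    return [replace(candidate, ip=ip) for ip in ips]
```

With two resolvers that return overlapping answers, `merge_a_records` yields each address once, and `expand_candidate` then produces one landmark copy per address.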
Taking the evaluation of Web candidate landmarks in Beijing as an example, the DNS query can use 23 DNS servers distributed around the world, with the distribution shown in Table 1, to query each domain name; the A records returned by each server are merged to generate the list of IP addresses the domain name maps to.
Table 1: Geographic distribution of the DNS servers used in holographic DNS queries
For the preprocessed candidate landmarks, in a further embodiment of the invention, filtering with filters comprises the following: the resolved candidate landmarks are grouped and filtered first by domain name and then by IP address; for each candidate landmark retained after filtering, HTTP requests are sent to the domain name given in the Web landmark and to its IP address; landmarks whose responses are inconsistent are filtered out, and the initial confidence value of the candidate landmark is set according to the responses. Filtering in layers according to the characteristics of invalid landmarks improves evaluation efficiency and accuracy.
For layered filtering by domain name and then by IP address, in another embodiment of the invention the layered filtering is designed as follows. First, the candidate landmarks are grouped by domain name, the declared positions in each group are extracted, the distribution radius of the declared positions is obtained, and candidate landmark groups whose distribution radius exceeds a preset value are deleted. Then, the candidate landmarks are grouped by IP address, the domain name list of each IP address group is extracted, subdomains of the same Web site are merged, the domain names are counted, and candidate landmark groups whose domain name count is not 1 are deleted. Finally, each candidate landmark is traversed and, based on the IP addresses obtained by resolution, candidate landmarks whose IP addresses fall in two or more network segments are deleted.
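The domain-name position filter can be sketched as below. The text does not define "distribution radius" precisely, so this sketch assumes it is the largest great-circle distance from the mean of the declared positions to any one of them; the function names and the threshold value are likewise illustrative. Averaging latitudes and longitudes directly is a simplification that does not handle positions straddling the antimeridian.

```python
import math
from typing import Dict, List, Tuple

LatLon = Tuple[float, float]
EARTH_RADIUS_KM = 6371.0


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def distribution_radius_km(points: List[LatLon]) -> float:
    """Assumed definition: largest distance from the mean point to any declared position."""
    center = (sum(p[0] for p in points) / len(points),
              sum(p[1] for p in points) / len(points))
    return max(haversine_km(center, p) for p in points)


def filter_by_declared_position(groups: Dict[str, List[LatLon]],
                                max_radius_km: float) -> Dict[str, List[LatLon]]:
    """Keep only domain groups whose declared positions stay within max_radius_km."""
    return {domain: pts for domain, pts in groups.items()
            if distribution_radius_km(pts) <= max_radius_km}
```

A domain that declares positions in both Beijing and Zhengzhou has a radius of several hundred kilometres and is dropped under any city-scale threshold, while a domain whose declared positions lie within one district is kept.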
Specifically, invalid landmarks are filtered in layers by a domain-name position filter, a same-IP filter, a same-domain filter, and a homepage redirection check. The domain-name position filter groups candidate landmarks by domain name, extracts the declared positions of each group, computes the distribution radius of these positions, and deletes candidate landmark groups whose distribution radius exceeds R_D. The same-IP filter groups candidate landmarks by IP address, extracts the domain name list of each IP address group, merges subdomains of the same Web site, counts the domain names, and deletes candidate landmark groups whose domain name count is not 1. The same-domain filter traverses each candidate landmark and, using the IP addresses obtained by holographic DNS resolution of its domain name, deletes candidate landmarks whose IP addresses fall in two or more /24 network segments. The homepage redirection check traverses each candidate landmark, sends HTTP requests to the domain name given in the Web landmark and to its IP address, filters out landmarks whose returned HTML differs, and generates the initial confidence value r_0 from the responses:
where res_IP denotes the HTML result of the HTTP request constructed from the IP address of the Web landmark, res_domin denotes the HTML result for its domain name, null means that the response has no content, and delete means that the candidate landmark is filtered out. Based on the mapping characteristics between the domain names and IP addresses of invalid candidate landmarks, candidate landmarks that use shared hosts, CDNs, and cloud servers are filtered out. Homepage redirection, Whois services, and third-party IP libraries are then combined to further infer the reliability of landmark information and to quantify landmark confidence, so that invalid landmarks are filtered out and the confidence of valid landmarks is quantified.
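The same-IP and same-domain filters described above can be sketched as follows. The text does not say how subdomains of one Web site are recognized, so `site_key` below uses the last two DNS labels as a stand-in; that is an assumption made for illustration, and a production version would more likely consult the Public Suffix List. All function names are hypothetical.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def site_key(domain: str) -> str:
    """Naive 'same Web site' key: the last two DNS labels (an assumption)."""
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def same_ip_filter(pairs: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep (domain, ip) pairs whose IP address serves exactly one Web site."""
    pairs = list(pairs)
    sites_by_ip: Dict[str, Set[str]] = defaultdict(set)
    for domain, ip in pairs:
        sites_by_ip[ip].add(site_key(domain))
    return [(domain, ip) for domain, ip in pairs if len(sites_by_ip[ip]) == 1]


def slash24_count(ips: Iterable[str]) -> int:
    """Number of distinct /24 networks covered by the given IPv4 addresses."""
    return len({ipaddress.ip_network(f"{ip}/24", strict=False) for ip in ips})


def same_domain_filter(ips_by_domain: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep domains whose resolved addresses do not span two or more /24 networks."""
    return {domain: ips for domain, ips in ips_by_domain.items()
            if slash24_count(ips) < 2}
```

Two subdomains of one site sharing an address pass the same-IP filter, while an address shared by two unrelated sites is removed as a likely shared host.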
For the filtered candidate landmarks, in a further embodiment of the invention, shown in Fig. 3, the evaluation comprises the following:
S3001, traversing each candidate landmark and determining the number of domain names hosted on its IP address; correcting the initial confidence value of the candidate landmark according to that number;
S3002, adjusting the corrected confidence by comparing the Whois registration information of the Web landmark and of its IP address, and by comparing the country, province, and city extracted from the candidate landmark against a third-party library; and writing the adjusted confidence value to the candidate landmark.
Preferably, adjusting the corrected confidence comprises the following: comparing the Whois registration information of the Web landmark and of its IP address to obtain the similarity of the information, and applying a weighted adjustment to the confidence according to that similarity; then matching the country, province, and city extracted from the candidate landmark against a third-party library and adjusting the confidence value according to the degree of match. The Whois registration information includes at least the organization name, the administrative region, and the contact details.
For traversing each candidate landmark and determining the number of domain names hosted on its IP address, shown in Fig. 4, the steps in a further embodiment of the invention are as follows:
S3101, traversing each candidate landmark and obtaining its IP address;
S3102, querying multiple reverse-lookup sites for the domain names hosted on the IP address and fusing the resulting domain name lists; performing a holographic DNS query on each domain name in the fused list to obtain its IP address list, and deleting from the list the domain names whose IP address lists do not contain the IP address of the candidate landmark, to obtain the list of domain names hosted on that IP address;
S3103, merging subdomains of the same Web site in the domain name list and counting the total number of domain names.
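Steps S3101 to S3103 can be sketched as a pure function. The reverse-lookup results and the DNS resolver are injected as arguments, so the sketch makes no claim about any particular lookup service; `site_key` again approximates "same Web site" by the last two DNS labels, which is an assumption. Following the description, the landmark's own domain name is merged into the list before counting.

```python
from typing import Callable, Dict, Iterable, Set


def site_key(domain: str) -> str:
    """Naive 'same Web site' key: the last two DNS labels (an assumption)."""
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def hosted_site_count(landmark_domain: str,
                      landmark_ip: str,
                      reverse_results: Dict[str, Iterable[str]],
                      resolve: Callable[[str], Set[str]]) -> int:
    """Count the distinct Web sites verified to be hosted on landmark_ip.

    reverse_results maps each reverse-lookup source to the domains it reported.
    resolve(domain) returns the set of IP addresses the domain maps to.
    """
    # Fuse the domain lists reported by all reverse-lookup sources.
    fused = {d.lower() for domains in reverse_results.values() for d in domains}
    # Keep only domains that really resolve to the landmark's IP address.
    verified = {d for d in fused if landmark_ip in resolve(d)}
    # Merge the landmark's own domain, then merge subdomains of the same site.
    verified.add(landmark_domain.lower())
    return len({site_key(d) for d in verified})
```

The forward check against `resolve` discards stale reverse-lookup entries, which is why a domain that once pointed at the address but has since moved does not inflate the count.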
IP reverse-lookup inference: each candidate landmark is traversed, the number of domain names hosted on its IP address is determined by reverse verification, and the initial confidence of the landmark is corrected accordingly to obtain confidence r_1. Each candidate landmark is traversed and its IP address obtained. The reverse verification method first queries multiple reverse-lookup sites, such as those in Table 2, for the domain names hosted on the IP address and fuses the result lists. It then performs a holographic DNS query on each domain name in the list to obtain its IP address list and deletes the domain names whose IP address lists do not contain the landmark's IP address, finally obtaining the list of domain names hosted on the landmark's IP address.
Table 2: Reverse-lookup site test results
Subdomains of the same Web site in the domain name list are merged, the landmark's own domain name is then merged into the list, and the total number of domain names n is counted. The Web landmark confidence is corrected to obtain confidence r_1:
r_1 = (1 - p_d) r_0 + p_d f(n)    (7)
f(n) = e^{1-n},  n = 1, 2, ...    (8)
where p_d is the reliability weight of the IP reverse-lookup information. Reverse lookup cannot guarantee that all domain names are obtained, so the reverse-lookup result is used as an evaluation reference rather than as a filter criterion.
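Formulas (7) and (8) translate directly into code. The sketch below only restates the two formulas; the function names and the sample weight are illustrative, since the text gives no value for p_d.

```python
import math


def reverse_lookup_factor(n: int) -> float:
    """f(n) = e^(1 - n) from formula (8); n is the hosted domain count, n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.exp(1 - n)


def corrected_confidence(r0: float, n: int, p_d: float) -> float:
    """r_1 = (1 - p_d) * r_0 + p_d * f(n) from formula (7)."""
    return (1 - p_d) * r0 + p_d * reverse_lookup_factor(n)
```

When the address hosts a single site (n = 1), f(n) equals 1 and the correction pulls r_1 toward 1; each additional hosted site shrinks f(n) by a factor of e, so an address shared by many sites contributes almost nothing from the reverse-lookup term.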
The Whois registration information of the Web landmark and of its IP address is compared, and the confidence is adjusted by weighting according to the similarity of information such as organization name, administrative region, and contact details, giving confidence r_2:
r_2 = (1 - p_w) r_1 + p_w w_r    (2)
w_r = k_c w_c + k_o w_o + (1 - k_c - k_o) w_d    (3)
w_c = k_co w_co + k_pr w_pr + (1 - k_co - k_pr) w_ci    (4)
where p_w is the reliability weight of the Whois information; k_c and k_o are the weights of the Whois registered administrative region and registered organization; w_c, w_o, and w_d are the match indices of the Whois registered administrative region, registered organization, and registered domain name, computed by the LCS method with values in the range 0 to 1; k_co and k_pr are the weights of the country-level and province-level administrative regions; and w_co, w_pr, and w_ci are the administrative-region match granularities between the Whois registration information and the landmark information, with values assigned as shown in Table 3.
Table 3: Administrative-region match assignment rules
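A sketch of formulas (2) to (4) follows. The text names only "the LCS method", so this sketch assumes LCS means longest common subsequence and normalizes its length by the longer string to land in the range 0 to 1; both choices are assumptions. The granularity values w_co, w_pr, and w_ci come from Table 3, which is not reproduced here, so they are taken as inputs.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, other in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ch == other else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """Assumed normalization: LCS length divided by the longer string's length."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def region_match(w_co: float, w_pr: float, w_ci: float,
                 k_co: float, k_pr: float) -> float:
    """w_c = k_co*w_co + k_pr*w_pr + (1 - k_co - k_pr)*w_ci, formula (4)."""
    return k_co * w_co + k_pr * w_pr + (1 - k_co - k_pr) * w_ci


def whois_match(w_c: float, w_o: float, w_d: float, k_c: float, k_o: float) -> float:
    """w_r = k_c*w_c + k_o*w_o + (1 - k_c - k_o)*w_d, formula (3)."""
    return k_c * w_c + k_o * w_o + (1 - k_c - k_o) * w_d


def whois_adjusted_confidence(r1: float, w_r: float, p_w: float) -> float:
    """r_2 = (1 - p_w)*r_1 + p_w*w_r, formula (2)."""
    return (1 - p_w) * r1 + p_w * w_r
```

Because each formula is a convex combination when the weights lie between 0 and 1, r_2 always falls between r_1 and w_r, so a poor Whois match can lower the confidence but never push it outside the range 0 to 1.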
The country, province, and city of the landmark are extracted, and the location of its IP address is obtained by IP address from public data such as free third-party databases, for example ip2location DB9. The two are compared, the correction factor l_r is computed from the degree of match, the landmark confidence r is obtained, and the confidence value is written to the landmark.
r = (1 - p_l) r_2 + p_l l_r    (5)
l_r = k_co w_Lco + k_pr w_Lpr + (1 - k_co - k_pr) w_Lci    (6)
where p_l is the reliability weight of the IP location library, and w_Lco, w_Lpr, and w_Lci are the administrative-region match granularities between the IP location library information and the landmark information, with values assigned as in Table 3.
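Formulas (5) and (6) can be sketched in the same way. The match granularities come from Table 3, which is not reproduced here, so they are passed in as inputs; the function names and sample weights are illustrative only.

```python
def location_match(w_lco: float, w_lpr: float, w_lci: float,
                   k_co: float, k_pr: float) -> float:
    """l_r = k_co*w_Lco + k_pr*w_Lpr + (1 - k_co - k_pr)*w_Lci, formula (6)."""
    return k_co * w_lco + k_pr * w_lpr + (1 - k_co - k_pr) * w_lci


def final_confidence(r2: float, l_r: float, p_l: float) -> float:
    """r = (1 - p_l)*r_2 + p_l*l_r, formula (5)."""
    return (1 - p_l) * r2 + p_l * l_r
```

Setting p_l to 0 leaves r equal to r_2, which is a convenient way to disable the location-library term when no third-party record exists for an address.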
Targeting the mapping characteristics between the domain names and IP addresses of invalid Web landmarks, the invention performs holographic DNS queries that effectively resolve all IP addresses of a domain name, together with reverse verification that obtains as many as possible of the domain names hosted on an IP address. It filters landmarks layer by layer and by category following the idea of a decision tree, and evaluates landmark confidence by combining public data and services. It yields reliable landmarks with quantified confidence, overcomes the drawbacks of existing evaluation methods, and markedly improves landmark accuracy and positioning accuracy.
Based on the above reliability evaluation method, an embodiment of the invention also provides a Web landmark reliability evaluation device based on multi-level decision, shown in Fig. 5, comprising a parsing module 101, a filtering module 102, and an evaluation module 103, wherein:
the parsing module 101 is configured to resolve the IP addresses of candidate landmarks;
the filtering module 102 is configured to filter the resolved candidate landmarks with filters and delete invalid data;
the evaluation module 103 is configured to evaluate the filtered candidate landmarks and obtain their confidence scores.
In the above device, the filtering module 102 comprises filter submodule one 201, filter submodule two 202, and an initial value acquisition submodule 203, wherein:
filter submodule one 201 is configured to group the resolved candidate landmarks by domain name, extract the declared positions in each group, obtain the distribution radius of the declared positions, and delete candidate landmark groups whose distribution radius exceeds a preset value;
filter submodule two 202 is configured to group the candidate landmarks by IP address, extract the domain name list of each IP address group, merge subdomains of the same Web site, count the domain names, and delete candidate landmark groups whose domain name count is not 1; and to traverse each candidate landmark and, based on the IP addresses obtained by resolution, delete candidate landmarks whose IP addresses fall in two or more network segments;
the initial value acquisition submodule 203 is configured, for each candidate landmark retained after filtering, to send HTTP requests to the domain name given in the Web landmark and to its IP address, filter out landmarks whose responses are inconsistent, and set the initial confidence value of the candidate landmark according to the responses.
As shown in Fig. 7, in the embodiment of the present invention, layered filtering is performed according to the characteristics of invalid landmarks, and the reliability of Web landmarks is assessed using public data such as public services and free third-party databases. The method solves the problem of automatic quantitative evaluation of large-scale Web landmarks, realizes both the filtering of invalid Web landmarks and the quantitative credibility evaluation of valid Web landmarks, and effectively improves the accuracy of the landmarks and of the positioning results.
To verify the method, the effectiveness of the present invention is evaluated using cross-validation and positioning comparison respectively.
Cross-validation compares the landmarks against a third-party IP location database whose data source differs from that of the landmark mining and evaluation methods, and judges the effect of landmark evaluation by the proportion of overlap between the two. The Evaluator, LVM and SLE methods are used to evaluate the candidate landmarks of 5 cities, and the landmarks of each city are divided into 5 groups: (1) the candidate landmark set; (2) the landmarks obtained by LVM evaluation; (3) the landmarks obtained by the Evaluator framework; (4) the landmarks to which the Evaluator framework assigns a confidence greater than 0.5; (5) the landmarks to which the Evaluator framework assigns a confidence greater than 0.8. Each group of landmarks is compared with the query results of the Maxmind database; the overlap rate reflects, to a certain extent, the accuracy of the landmarks.
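The overlap comparison can be sketched as follows. This is only an illustration: the text does not state at what granularity the two sources are compared, so the sketch assumes city-level equality, and `lookup_city` is a hypothetical callable standing in for a query to the third-party database, not an actual database API. The keys `ip`, `city` and `confidence` are likewise assumptions.

```python
def overlap_ratio(landmarks, lookup_city):
    """Fraction of landmarks whose declared city matches the third-party city."""
    if not landmarks:
        return 0.0
    hits = sum(1 for lm in landmarks if lookup_city(lm["ip"]) == lm["city"])
    return hits / len(landmarks)


def group_by_threshold(landmarks, thresholds=(0.5, 0.8)):
    """Build groups (4) and (5): landmarks with confidence above each threshold."""
    return {t: [lm for lm in landmarks if lm["confidence"] > t]
            for t in thresholds}
```

Computing `overlap_ratio` for each of the five groups in turn gives the per-group overlap rates that the comparison relies on.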
Table 4: Statistics of landmark entries obtained by the two methods
As can be seen from Table 4, both evaluation methods substantially improve landmark accuracy, and the evaluation scheme of the present invention performs better, with a marked improvement in landmark accuracy.
Positioning verification is a method of verifying landmark validity by locating IP addresses whose positions are known. First, for the candidate landmarks of Zhengzhou and Beijing, landmark reliability evaluation is carried out using the scheme of the present invention, the LVM method based on homepage redirection, and the SLE method based on nearest common routers, respectively. Then, IP addresses with reliable, manually annotated positions are collected in the two cities (146 in Zhengzhou, 119 in Beijing). Finally, the credible landmarks obtained by each evaluation are used to locate the IP addresses of known position, and the positioning errors are counted. The cumulative probability distribution curves of the positioning errors of the three methods in Zhengzhou and Beijing are shown in Fig. 8 and Fig. 9. As the figures show, for Zhengzhou the mean error of the present scheme is 9.1 km, close to that of the SLE method (mean error 8.6 km) and far better than that of the LVM method (mean error 19.7 km). For Beijing, the mean error of the present scheme is 7.3 km, against 6.6 km for SLE and 23.4 km for LVM. It can thus be seen that the present scheme greatly improves positioning accuracy over LVM and is close to SLE, currently the most accurate method, while avoiding the time overhead of the repeated probing that SLE requires, which demonstrates the validity of the credible landmarks evaluated by the present invention.
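The error statistics used in this comparison can be sketched as below. The sketch assumes that positioning error is the great-circle distance between the estimated and the known position, which the text does not state explicitly; the function names are hypothetical, and no measured data from the experiments is reproduced here.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, l1, p2, l2 = map(math.radians, (lat1, lon1, lat2, lon2))
    h = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin((l2 - l1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))  # 6371 km: mean Earth radius


def mean_error_km(errors):
    """Arithmetic mean of a non-empty list of positioning errors."""
    return sum(errors) / len(errors)


def empirical_cdf(errors):
    """Points (error, cumulative probability) of the empirical distribution."""
    ordered = sorted(errors)
    n = len(ordered)
    return [(e, (i + 1) / n) for i, e in enumerate(ordered)]
```

Plotting the output of `empirical_cdf` for each method on one set of axes yields cumulative probability curves of the kind referred to in Fig. 8 and Fig. 9.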
Based on the above method, an embodiment of the present invention further provides a server, comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the above method.
Based on the above method, an embodiment of the present invention further provides a computer-readable medium on which a computer program is stored, wherein the program implements the above method when executed by a processor.
Unless specifically stated otherwise, the relative steps of the components and steps, the numerical expressions and the numerical values set forth in these embodiments do not limit the scope of the present invention.
The device provided by the embodiment of the present invention has the same implementation principle and technical effects as the foregoing method embodiments. For brevity, where the device embodiment is silent, reference may be made to the corresponding content of the foregoing method embodiments.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system and device described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In all examples shown and described herein, any specific value should be construed as merely illustrative rather than limiting; other examples of the exemplary embodiments may therefore have different values.
It should also be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
The flowcharts and block diagrams in the drawings show the possible architecture, functions and operations of systems, methods and computer program products according to multiple embodiments of the present invention. In this regard, each box in a flowchart or block diagram may represent a module, a program segment or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two consecutive boxes may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should further be noted that each box in the block diagrams and/or flowcharts, and combinations of such boxes, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through communication interfaces, devices or units, and may be electrical, mechanical or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the scheme of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.
Finally, it should be noted that the embodiments described above are only specific embodiments of the present invention, intended to illustrate its technical solution rather than to limit it, and the scope of protection of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone familiar with the technical field may, within the technical scope disclosed by the present invention, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of the technical features. Such modifications, variations or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims (10)

1. A Web landmark reliability assessment method based on multi-layer decision, characterized by comprising the following:
resolving the IP addresses of the candidate landmarks;
for the parsed candidate landmarks, filtering the candidate landmarks using filters and deleting invalid data;
evaluating the filtered candidate landmarks to obtain their confidence scores.
2. The Web landmark reliability assessment method based on multi-layer decision according to claim 1, characterized in that the candidate landmark parsing process comprises the following steps:
grouping the candidate landmarks by domain name and deleting non-standard data, the non-standard data including candidate landmarks that provide no domain name or whose domain name is invalid;
performing DNS queries on the domain name using multiple globally distributed DNS servers respectively, and merging the records returned by each DNS server to generate the IP address list of the domain name resolution;
for the IP address list of the domain name resolution, if the domain name has only one IP address, assigning that IP address to the candidate landmark corresponding to the domain name; if the domain name has n IP addresses, replicating the candidate landmark corresponding to the domain name into n copies, each copy being assigned one IP address from the IP address list, where n is an integer greater than 1.
3. The Web landmark reliability assessment method based on multi-layer decision according to claim 1, characterized in that the process of filtering the parsed candidate landmarks using filters comprises the following:
for the parsed candidate landmarks, grouping and filtering successively by domain name and by IP address; for the candidate landmarks retained after filtering, sending HTTP requests to the domain name provided in the Web landmark and to its IP address respectively, filtering out the landmarks whose returned results are inconsistent, and setting the initial confidence value of each candidate landmark according to the returned results.
4. The Web landmark reliability assessment method based on multi-layer decision according to claim 3, characterized in that grouping and filtering the parsed candidate landmarks successively by domain name and by IP address comprises the following: first, grouping by candidate landmark domain name, extracting the declared locations in each group of candidate landmarks, obtaining the distribution radius of the declared locations, and deleting the candidate landmark groups whose distribution radius exceeds a preset value; then, grouping the candidate landmarks by IP address, extracting the domain name list corresponding to each IP address group, merging subdomains of the same website and counting the number of domain names, and deleting the candidate landmark groups whose domain name count is not unique; traversing each candidate landmark and, according to the IP addresses obtained by resolution, deleting the candidate landmarks whose IP addresses are distributed across more than two network segments.
5. The Web landmark reliability assessment method based on multi-layer decision according to claim 3, characterized in that the process of evaluating the filtered candidate landmarks comprises the following:
traversing each candidate landmark and determining the number of domain names hosted on the IP address of the candidate landmark; correcting the initial confidence value of the candidate landmark according to that number of domain names;
extracting the country, province and city information of the candidate landmark from the Whois registration information of the Web landmark and of its IP and from a third-party database, adjusting the corrected confidence accordingly, and writing the adjusted confidence value into the candidate landmark.
6. The Web landmark reliability assessment method based on multi-layer decision according to claim 5, characterized in that the process of adjusting the corrected confidence comprises the following: comparing the Whois registration information of the Web landmark with that of its IP to obtain the similarity of the information, and applying a weighted adjustment to the confidence according to the similarity; matching the extracted country, province and city information of the candidate landmark against the third-party database, and adjusting the confidence value according to the degree of matching.
7. The Web landmark reliability assessment method based on multi-layer decision according to claim 6, characterized in that the Whois registration information includes at least the organization name, the administrative division and the contact details.
8. The Web landmark reliability assessment method based on multi-layer decision according to claim 5, characterized in that traversing each candidate landmark and determining the number of domain names hosted on the IP address of the candidate landmark comprises the following:
traversing each candidate landmark and obtaining its IP address;
querying the domain names hosted on the IP address using multiple reverse-lookup sites, and merging the domain name lists of the query results; performing forward DNS resolution on each domain name in the merged list to query its IP address list, and deleting the domain names whose IP address list does not contain the IP address of the candidate landmark, to obtain the list of domain names hosted on the IP address of the candidate landmark;
merging the subdomains of the same website in the domain name list and counting the total number of domain names.
9. A Web landmark reliability assessment device based on multi-layer decision, characterized by comprising a parsing module, a filtering module and an evaluation module, wherein:
the parsing module is configured to resolve the IP addresses of the candidate landmarks;
the filtering module is configured to filter the parsed candidate landmarks using filters and delete invalid data;
the evaluation module is configured to evaluate the filtered candidate landmarks and obtain their confidence scores.
10. The Web landmark reliability assessment device based on multi-layer decision according to claim 9, characterized in that the filtering module comprises filter submodule one, filter submodule two and an initial value acquisition submodule, wherein:
filter submodule one is configured to group the parsed candidate landmarks by domain name, extract the declared locations in each group of candidate landmarks, obtain the distribution radius of the declared locations, and delete the candidate landmark groups whose distribution radius exceeds a preset value;
filter submodule two is configured to group the candidate landmarks by IP address, extract the domain name list corresponding to each IP address group, merge subdomains of the same website and count the number of domain names, and delete the candidate landmark groups whose domain name count is not unique; it then traverses each candidate landmark and, according to the IP addresses obtained by resolution, deletes the candidate landmarks whose IP addresses are distributed across more than two network segments;
the initial value acquisition submodule is configured, for the candidate landmarks retained after filtering, to send HTTP requests to the domain name provided in the Web landmark and to its IP address respectively, filter out the landmarks whose returned results are inconsistent, and set the initial confidence value of each candidate landmark according to the returned results.
CN201811338745.3A 2018-11-12 2018-11-12 Web landmark reliability assessment method and device based on multi-layer decision Active CN109543118B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811338745.3A CN109543118B (en) 2018-11-12 2018-11-12 Web landmark reliability assessment method and device based on multi-layer decision

Publications (2)

Publication Number Publication Date
CN109543118A true CN109543118A (en) 2019-03-29
CN109543118B CN109543118B (en) 2020-06-12

Family

ID=65846850

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811338745.3A Active CN109543118B (en) 2018-11-12 2018-11-12 Web landmark reliability assessment method and device based on multi-layer decision

Country Status (1)

Country Link
CN (1) CN109543118B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103051717A (en) * 2012-12-25 2013-04-17 北京小米科技有限责任公司 Method, device and equipment for processing http request
CN104168341A (en) * 2014-08-15 2014-11-26 北京百度网讯科技有限公司 IP address locating method and CDN dispatching method and device
CN104333609A (en) * 2014-10-15 2015-02-04 北京百度网讯科技有限公司 IP address positioning method and device thereof
CN104537105A (en) * 2015-01-14 2015-04-22 中国人民解放军信息工程大学 Automatic network physical landmark excavating method based on Web maps

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
朱彬: "基于IP地址的网络实体地理位置定位技术研究与实现", 《基于IP地址的网络实体地理位置定位技术研究与实现 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110119437A (en) * 2019-04-03 2019-08-13 中国人民解放军战略支援部队信息工程大学 Network entity terrestrial reference appraisal procedure and device with the error upper limit
CN110119437B (en) * 2019-04-03 2021-04-23 中国人民解放军战略支援部队信息工程大学 Network entity landmark evaluation method and device with error upper limit
CN110188954A (en) * 2019-05-31 2019-08-30 中国人民解放军战略支援部队信息工程大学 Terrestrial reference reliability estimation method and device based on POP network
CN111970262A (en) * 2020-08-07 2020-11-20 杭州安恒信息技术股份有限公司 Method and device for detecting third-party service enabling state of website and electronic device
WO2023029486A1 (en) * 2021-08-30 2023-03-09 北京百度网讯科技有限公司 Site evaluation method and apparatus, and electronic device, storage medium and program product
CN114896522A (en) * 2022-04-14 2022-08-12 北京航空航天大学 Multi-platform information epidemic situation risk assessment method and device

Also Published As

Publication number Publication date
CN109543118B (en) 2020-06-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant