Summary of the invention
The invention provides a method, apparatus, and system for detecting DNS black hole hijacking. They can detect DNS black hole hijacking behavior and thereby spare the user the interference of hijack pages such as advertisement or navigation pages.
The invention provides the following scheme:
A method for detecting DNS black hole hijacking comprises:
capturing the Hypertext Transfer Protocol (HTTP) connection packets corresponding to web access requests in a network, extracting from said packets the domain name and IP address corresponding to each webpage, and recording the correspondence between the domain name and the IP address;
performing statistics on the captured results to obtain the number of different domain names corresponding to the same IP address;
determining, according to the number of different domain names corresponding to the same IP address, the IP addresses used for DNS black hole hijacking, and saving the determined IP addresses used for DNS black hole hijacking;
when a user's web access request produces a current HTTP connection packet, extracting the IP address from said current HTTP connection packet;
if the extracted IP address appears among the saved IP addresses used for DNS black hole hijacking, determining that the user's web access request has been subjected to DNS black hole hijacking.
Optionally, said determining, according to the number of different domain names corresponding to the same IP address, the IP addresses used for DNS black hole hijacking comprises:
extracting, as IP addresses to be verified, the IP addresses whose number of corresponding different domain names reaches a preset threshold;
obtaining the server response information corresponding to said IP address to be verified;
verifying said IP address to be verified according to said server response information and, if the verification passes, determining the IP address to be verified to be an IP address used for DNS black hole hijacking.
Optionally, said server response information comprises a web content data packet, and said verifying said IP address to be verified according to said server response information comprises:
extracting the web page content from the web content data packet corresponding to said IP address to be verified, and comparing the extracted web page content with the web page content corresponding to IP addresses already known to be used for DNS black hole hijacking; if the similarity reaches a preset threshold, the verification passes.
Optionally, said server response information comprises web page code, and said verifying said IP address to be verified according to said server response information comprises:
judging whether said web page code contains preset key code; if it does, the verification passes.
Optionally, said preset key code comprises redirect instruction code.
An apparatus for detecting DNS black hole hijacking comprises:
a capture unit, configured to capture the Hypertext Transfer Protocol (HTTP) connection packets corresponding to web access requests in a network, extract from said packets the domain name and IP address corresponding to each webpage, and record the correspondence between the domain name and the IP address;
a statistics unit, configured to perform statistics on the captured results and obtain the number of different domain names corresponding to the same IP address;
a hijacking IP address determining unit, configured to determine, according to the number of different domain names corresponding to the same IP address, the IP addresses used for DNS black hole hijacking, and to save the determined IP addresses used for DNS black hole hijacking;
an IP address extraction unit, configured to extract the IP address from the current HTTP connection packet when a user's web access request produces said current HTTP connection packet;
a detecting unit, configured to determine that the user's web access request has been subjected to DNS black hole hijacking if the extracted IP address appears among the saved IP addresses used for DNS black hole hijacking.
Optionally, said hijacking IP address determining unit comprises:
an extraction subunit, configured to extract, as IP addresses to be verified, the IP addresses whose number of corresponding different domain names reaches a preset threshold;
a response information obtaining subunit, configured to obtain the server response information corresponding to said IP address to be verified;
a verification subunit, configured to verify said IP address to be verified according to said server response information and, if the verification passes, to determine the IP address to be verified to be an IP address used for DNS black hole hijacking.
Optionally, said server response information comprises a web content data packet, and said verification subunit comprises:
a first verification subunit, configured to extract the web page content from the web content data packet corresponding to said IP address to be verified, and to compare the extracted web page content with the web page content corresponding to IP addresses already known to be used for DNS black hole hijacking; if the similarity reaches a preset threshold, the verification passes.
Optionally, said server response information comprises web page code, and said verification subunit comprises:
a second verification subunit, configured to judge whether said web page code contains preset key code; if it does, the verification passes.
A system for detecting DNS black hole hijacking comprises a server end and a client, wherein said server end comprises:
a capture unit, configured to capture the Hypertext Transfer Protocol (HTTP) connection packets corresponding to web access requests in a network, extract from said packets the domain name and IP address corresponding to each webpage, and record the correspondence between the domain name and the IP address;
a statistics unit, configured to perform statistics on the captured results and obtain the number of different domain names corresponding to the same IP address;
a hijacking IP address determining unit, configured to determine, according to the number of different domain names corresponding to the same IP address, the IP addresses used for DNS black hole hijacking, and to save the determined IP addresses used for DNS black hole hijacking;
said client comprises:
an IP address extraction unit, configured to extract the IP address from the current HTTP connection packet when a user's web access request produces said current HTTP connection packet;
an uploading unit, configured to upload the extracted IP address to the server end;
said server end further comprises:
a detecting unit, configured to determine that the user's web access request has been subjected to DNS black hole hijacking if the extracted IP address appears among the saved IP addresses used for DNS black hole hijacking.
According to the specific embodiments provided by the invention, the invention discloses the following technical effects:
With the present invention, a large number of HTTP packets are collected, the correspondence between domain names and IP addresses is extracted from them, and statistics are performed on it to obtain the IP addresses that may be used for DNS black hole hijacking. Then, when a user accesses a webpage, the IP address in the HTTP packet can be extracted and checked against the IP addresses used for DNS black hole hijacking; if it appears among them, it can be concluded that the user's web page access has been subjected to DNS black hole hijacking. As can be seen, DNS black hole hijacking behavior can be detected while the user is accessing a webpage, which spares the user the interference of hijack pages such as advertisement or navigation pages.
Detailed description of the embodiments
The technical schemes in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention fall within the scope of protection of the invention.
To avoid conceptual confusion, it should first be noted that although a DNS black hole looks on the surface like webpage hijacking, and although in technical essence it also operates at the DNS resolution step, it is different from DNS hijacking as well. That is, DNS black holes, webpage hijacking, and DNS hijacking are all different concepts, each of which is briefly introduced below.
So-called "webpage hijacking", or "pagejacking", is the case in which the machine reaches the correct web page server and the server returns the correct page, but the page is replaced or modified by someone else at some link while it is being transmitted back to the machine.
DNS hijacking is more thorough: what is hijacked is the DNS server itself. That is, the DNS resolution is itself provided by a wrong DNS server, so the page finally returned is naturally the wrong, substituted page.
A so-called DNS black hole, by contrast, involves a resolution service provided by the correct DNS server, except that a wrong IP address is obtained at the DNS resolution step. As a result, the HTTP access reaches the wrong web page server, and the page obtained is also the wrong page.
From another angle, DNS hijacking and webpage hijacking are generally caused by viruses or hacker attacks, whereas a DNS black hole is mostly a service provided by a legitimate operator. A page is hijacked by a DNS black hole only when its domain name cannot be resolved (that is, when the domain name is invalid), in which case an alternative page is returned to the user; domain names that resolve normally are not hijacked by a DNS black hole.
Referring to Fig. 1, the method for detecting DNS black hole hijacking provided by the embodiment of the invention may comprise the following steps:
S101: capturing the HTTP (Hypertext Transfer Protocol) connection packets corresponding to web access requests in a network, extracting from said packets the domain name and IP address corresponding to each webpage, and recording the correspondence between the domain name and the IP address;
When a user accesses a webpage with a browser, a web access request is produced. The domain name in the URL of the accessed webpage is first converted to an IP address by a DNS server, and an HTTP packet is generated; the HTTP packet carries the converted IP address so that the web access request can be sent to the web page server corresponding to that IP address. In this process, if a domain name cannot be resolved normally, the IP address may be replaced by a network operator or the like with an IP address used for DNS black hole hijacking. That is to say, the IP address contained in the HTTP packet may be the IP address that actually corresponds to the URL of the webpage, or it may be a substituted IP address. In the embodiment of the present invention, such HTTP packets are collected, the domain name and IP address of the webpage are extracted from them, and the correspondence between the domain name and the IP address is recorded.
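As an illustration of the recording step described above, the following is a minimal sketch rather than the implementation of the embodiment. It assumes that the capture component already supplies the raw bytes of an HTTP/1.x request together with the destination IP address of the connection; the function names `extract_host` and `record_mapping` are hypothetical, and the domain name is taken from the `Host` request header.

```python
from typing import List, Optional, Tuple


def extract_host(raw_request: bytes) -> Optional[str]:
    """Return the lower-cased Host header of a raw HTTP/1.x request, or None.

    Assumption: any port suffix is dropped; IPv6 literals are not handled.
    """
    head = raw_request.split(b"\r\n\r\n", 1)[0]
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            host = value.strip().decode("ascii", errors="replace").lower()
            return host.split(":", 1)[0]
    return None


def record_mapping(raw_request: bytes, dst_ip: str,
                   records: List[Tuple[str, str]]) -> None:
    """Append a (domain name, IP address) pair for one captured request."""
    host = extract_host(raw_request)
    if host is not None:
        records.append((host, dst_ip))
```

Requests without a `Host` header are simply skipped in this sketch; how the embodiment treats them is not stated in the description.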
Here, a domain name is the name of a computer or group of computers on the Internet, made up of a string of names separated by dots, and is used to identify the electronic location of the computer during data transmission on the Internet, for example abc.com. Put simply, a domain name is the name under which a computer or group of computers is registered on the Internet, and a user can reach the corresponding computer or group of computers through this registered name. The name may carry some information about the registrant, such as a company or organization name or the service content. Domain names are also hierarchical. In the terminology used here, abc.com above is a top-level domain name; top-level domain names are allocated by a dedicated international organization, and second-level and third-level domain names can exist beneath a top-level domain name, news.abc.com being an example of a second-level domain name. Second-level domain names, especially those registered by an organization, are commonly used to distinguish and highlight different business sections; conversely, different business sections are often reflected by different second-level domain names. For example, news.abc.com above may represent the news section of the website and sports.abc.com its sports section.
For the user, a domain name usually represents a website, and each webpage the user browses is a file downloaded from a particular directory on that website's server. The domain name information can be obtained from the address of the page the user browses. For example, if the address the user accesses is sports.abc.com/football/fifa2010/123.htm, the domain name it contains is sports.abc.com.
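The extraction of a domain name from a page address can be sketched with the Python standard library as follows. This is only an illustrative sketch; the helper name `domain_of` is hypothetical, and the handling of addresses written without a scheme (as in the example above) is an assumption.

```python
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Return the host part of a page address, lower-cased, or '' if absent."""
    # Addresses such as "sports.abc.com/football/..." carry no scheme, so a
    # leading "//" is added to make urlsplit treat the first part as the host.
    if "://" not in url:
        url = "//" + url
    return urlsplit(url).hostname or ""
```

For the address in the text, `domain_of("sports.abc.com/football/fifa2010/123.htm")` yields `sports.abc.com`.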
In the embodiment of the present invention, the capture of HTTP packets in the network can be implemented on the basis of the cloud engine of a browser. The so-called cloud engine is a browser program that runs on the server side and cooperates with the browser program running locally on the user's machine, so that the two jointly complete webpage access tasks for the user. For example, when the cloud engine is used, a web access request initiated by the user is not sent directly to the web page server; it is first sent to the browser's cloud engine, which forwards it to the web page server. In this way, whenever any user of this browser accesses a webpage, the browser's cloud engine obtains the web access request. A large number of HTTP packets can therefore be collected through the cloud engine, and the correspondence between domain names and IP addresses can be extracted from each of them for subsequent processing. Alternatively, in other implementations, the user's local browser may send a copy of each HTTP packet to the browser's cloud engine for information collection, and so on.
It should be noted that HTTP packets are captured for detection because the purpose of a DNS black hole is precisely to hijack a specific webpage and display the operator's own page instead, and this can only take effect when the browser parses HTTP data for display. Data of non-HTTP protocols cannot be displayed directly by the browser in the way HTTP data can, so there is no point in hijacking it. Only the HTTP and HTTPS protocols are worth hijacking, but HTTPS communication is encrypted, so its internal data cannot be obtained or analyzed. For this reason, only HTTP data is captured in the embodiment of the present invention.
S102: performing statistics on the captured results to obtain the number of different domain names corresponding to the same IP address;
Since a large number of correspondences between domain names and IP addresses have been captured, statistics can be performed on this data to find the IP addresses that may be used for DNS black hole hijacking. When a network operator performs DNS black hole hijacking, it generally uses one or a few fixed IP addresses: whenever a domain name fails to resolve, the request is redirected to one of these fixed IP addresses. Because many domain names may fail to resolve normally, the statistics may reveal that many domain names correspond to the same IP address, that is, that many domain names all lead to the same IP. This is likely to be because these domain names could not be resolved normally and were hijacked by a DNS black hole, since under normal circumstances a domain name generally corresponds to its own IP address. Such an IP address can therefore be judged to be a possible IP address used for DNS black hole hijacking. Accordingly, after a large number of correspondences between domain names and IP addresses have been captured, statistics can be performed to obtain the number of domain names corresponding to each IP address. For example, if the domain name extracted from one captured HTTP packet is domain name A, corresponding to a certain IP address, and the domain name extracted from another captured HTTP packet is domain name B, corresponding to the same IP address, then the number of domain names corresponding to this IP address is 2, and so on.
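The statistics of step S102 can be sketched as follows. This is a minimal illustration under the assumption that the captured results are available as an iterable of (domain name, IP address) pairs; the function name `count_domains_per_ip` is hypothetical. A set is used per IP address so that repeated requests to the same domain name are counted only once.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def count_domains_per_ip(records: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Return, for each IP address, the number of distinct domain names seen."""
    domains_by_ip: Dict[str, Set[str]] = defaultdict(set)
    for domain, ip in records:
        # Domain names are case-insensitive, so normalise before counting.
        domains_by_ip[ip].add(domain.lower())
    return {ip: len(domains) for ip, domains in domains_by_ip.items()}
```

With domain names A and B both recorded against one IP address, as in the example above, the count for that address is 2.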
S103: determining, according to the number of different domain names corresponding to the same IP address, the IP addresses used for DNS black hole hijacking, and saving the determined IP addresses used for DNS black hole hijacking;
After the number of different domain names corresponding to each IP address has been determined, the IP addresses can be sorted by that number, and the several IP addresses with the largest numbers can be determined to be IP addresses used for DNS black hole hijacking. Alternatively, the IP addresses whose number of corresponding different domain names reaches a certain preset threshold can be determined to be IP addresses used for DNS black hole hijacking, and so on.
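The two selection strategies just described (a preset threshold, or the several IP addresses with the largest counts) can be sketched as below. The function names and the tie-breaking rule are assumptions made for illustration; the description does not fix a threshold value or the number of addresses to keep.

```python
from typing import Dict, List, Set


def candidates_by_threshold(counts: Dict[str, int], threshold: int) -> Set[str]:
    """IP addresses whose distinct-domain count reaches the preset threshold."""
    return {ip for ip, n in counts.items() if n >= threshold}


def candidates_top_n(counts: Dict[str, int], n: int) -> List[str]:
    """The n IP addresses with the largest distinct-domain counts.

    Assumption: ties are broken by the IP address string, for determinism.
    """
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [ip for ip, _ in ranked[:n]]
```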
In practical applications, the following situation may also arise: because of network restrictions or similar reasons, some particular webpages may not be directly accessible, and the user may then access them through a proxy server. Proxy servers are mostly used to connect the Internet and an intranet (local area network). For example, in China, the so-called Chinese multimedia public information network and the education network are both large, independent, nationwide local area networks isolated from the Internet. To meet various needs, some groups or individuals have set up proxy servers between the two networks; anyone who knows the addresses of these proxy servers can use them to reach external websites. When users inside the local area network access external webpages through a proxy server, they are mapped to a single IP address, so when the HTTP packets are parsed, one IP address corresponding to many domain names can also be observed.
Therefore, to distinguish DNS black hole hijacking from the above situation, in the embodiment of the present invention, after an IP address is found to correspond to many domain names, it can be further verified whether that IP address is in fact used for DNS black hole hijacking. Specifically, the web page server response information corresponding to the IP address to be verified is obtained, and the IP address to be verified is then verified according to this server response information; if the verification passes, the IP address to be verified is determined to be an IP address used for DNS black hole hijacking. The web page server response information may contain the content data of the webpage, so one specific verification approach can be: extract the web page content from the web content data packet corresponding to the IP address to be verified, and compare the extracted web page content with the web page content corresponding to IP addresses already known to be used for DNS black hole hijacking; if the similarity reaches a preset threshold, the verification passes. That is to say, the web page content corresponding to IP addresses confirmed to be used for DNS black hole hijacking (which may be an advertisement page or navigation page of a network operator, for example) can be obtained in advance; if the web page content corresponding to the IP address to be verified is identical to this content, or sufficiently similar to it, the IP address to be verified can be considered an IP address used for DNS black hole hijacking. If the IP address to be verified is instead the IP address of a proxy server, the web page content corresponding to it will not be highly similar to an advertisement page or navigation page of a network operator, so such an IP address can be excluded on this basis. Page content is in essence a piece of text data. When comparing webpage similarity, the comparison can be based on the hash value of the webpage or the like, or a text similarity matching algorithm such as cosine distance can be used; the specific approach is not limited here.
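The content comparison can be sketched as follows, using cosine similarity over word-count vectors as one possible text similarity measure. This is an assumption-laden sketch, not the method of the embodiment: the tokenisation by `\w+`, the default threshold of 0.9, and the function names are all chosen for illustration only.

```python
import math
import re
from collections import Counter
from typing import Iterable


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity of the word-count vectors of two page texts."""
    a = Counter(re.findall(r"\w+", text_a.lower()))
    b = Counter(re.findall(r"\w+", text_b.lower()))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def passes_content_check(candidate_page: str, known_pages: Iterable[str],
                         threshold: float = 0.9) -> bool:
    """True if the candidate page is similar enough to any known hijack page."""
    return any(cosine_similarity(candidate_page, page) >= threshold
               for page in known_pages)
```

A hash comparison, which the description also mentions, would only catch pages that are byte-for-byte identical, whereas a similarity score tolerates small differences such as an embedded query string.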
Alternatively, in another implementation, it is taken into account that the web page code corresponding to an IP address used for hijacking generally contains a special piece of code. This special code is generally JavaScript code corresponding to a redirect instruction, and it has to be written into the webpage whenever DNS black hole hijacking is performed. One such snippet, for example, redirects to the domain name abc.com.cn, which is held by a certain network operator. Therefore, the web page code can be extracted from the server response information corresponding to the IP address to be verified, and it can be judged whether the web page code contains preset key code; if it does, it can be concluded that the IP address to be verified is an IP address used by that network operator for DNS black hole hijacking. Of course, the redirect instruction is only one kind of key code; in a specific implementation, the key code may also be a specified keyword (which may be a character string rather than an executable instruction).
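The key code check can be sketched as a search of the page source for preset patterns. The patterns below are purely hypothetical: the description does not give the exact form of the redirect code, so two common JavaScript redirect forms targeting the abc.com.cn domain mentioned above are assumed here for illustration.

```python
import re
from typing import Sequence

_QUOTE = r"""['"]"""

# Hypothetical preset key code: JavaScript redirects to the operator's domain.
KEY_PATTERNS = [
    re.compile(r"location\s*(\.href)?\s*=\s*" + _QUOTE
               + r"[^'\"]*abc\.com\.cn", re.IGNORECASE),
    re.compile(r"location\.replace\(\s*" + _QUOTE
               + r"[^'\"]*abc\.com\.cn", re.IGNORECASE),
]


def contains_key_code(page_source: str,
                      patterns: Sequence[re.Pattern] = KEY_PATTERNS) -> bool:
    """True if any preset key-code pattern occurs in the page source."""
    return any(p.search(page_source) is not None for p in patterns)
```

A plain keyword, which the description also allows as key code, could be covered by adding a pattern built with `re.escape` from the keyword.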
After the IP addresses used for DNS black hole hijacking have been found, they can be saved in a list or similar form to serve as the basis for detecting DNS black hole hijacking behavior. In practical applications, this list can be kept at the browser's cloud engine.
S104: when a user's web access request produces a current HTTP connection packet, extracting the IP address from said current HTTP connection packet;
Once the IP addresses used for DNS black hole hijacking have been saved, DNS black hole hijacking behavior can be detected on that basis. Specifically, after the user's web access request produces an HTTP packet, the IP address is likewise extracted from it. As before, this IP address may be the one that actually corresponds to the URL of the accessed webpage, or it may be an IP address used for DNS black hole hijacking to which the request has been redirected.
S105: if the extracted IP address appears among the saved IP addresses used for DNS black hole hijacking, determining that the user's web access request has been subjected to DNS black hole hijacking.
After the IP address contained in the HTTP packet has been extracted, it can be compared with each of the previously saved IP addresses used for DNS black hole hijacking. If it appears among them, the user's web access request has been hijacked by a DNS black hole. Once this is found, the HTTP packet can be discarded directly, so that the HTTP request cannot reach the redirected IP address. Alternatively, a prompt can be shown to the user, indicating that the current request may have been subjected to DNS black hole hijacking and asking whether to continue or to end the visit. If the user chooses to continue, the HTTP packet can be let through so that it reaches the redirected IP address, and the corresponding web page content is returned and displayed to the user; if the user chooses to end the visit, the HTTP packet can be discarded. Other handling approaches can of course also be adopted and are not enumerated here.
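The detection and the two handling options described above can be sketched as follows. The names `Action`, `is_hijacked`, and `handle_request`, and the `ask_user` callback standing in for the prompt shown to the user, are hypothetical and introduced only for illustration.

```python
from enum import Enum
from typing import Callable, Optional, Set


class Action(Enum):
    ALLOW = "allow"  # let the HTTP packet through
    DROP = "drop"    # discard the HTTP packet


def is_hijacked(dst_ip: str, blackhole_ips: Set[str]) -> bool:
    """True if the extracted IP address is among the saved hijacking IPs."""
    return dst_ip in blackhole_ips


def handle_request(dst_ip: str, blackhole_ips: Set[str],
                   ask_user: Optional[Callable[[str], bool]] = None) -> Action:
    """Decide what to do with a request, optionally asking the user.

    ask_user receives the IP address and returns True to continue the visit.
    Without a callback, hijacked requests are discarded directly.
    """
    if not is_hijacked(dst_ip, blackhole_ips):
        return Action.ALLOW
    if ask_user is None:
        return Action.DROP
    return Action.ALLOW if ask_user(dst_ip) else Action.DROP
```

Storing the saved addresses in a set keeps the membership test constant-time on average, which matters if the check runs on every web access request.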
In short, in the embodiment of the present invention, a large number of HTTP packets can be collected, the correspondence between domain names and IP addresses can be extracted from them, and statistics can be performed on it to obtain the IP addresses that may be used for DNS black hole hijacking. Then, when a user accesses a webpage, the IP address in the HTTP packet can be extracted and checked against the IP addresses used for DNS black hole hijacking; if it appears among them, it can be concluded that the user's web page access has been subjected to DNS black hole hijacking. As can be seen, DNS black hole hijacking behavior can be detected while the user is accessing a webpage, which spares the user the interference of hijack pages such as advertisement or navigation pages.
Corresponding to the method for detecting DNS black hole hijacking behavior provided by the embodiment of the invention, the embodiment of the invention also provides an apparatus for detecting DNS black hole hijacking behavior. Referring to Fig. 2, this apparatus may comprise:
a capture unit 201, configured to capture the Hypertext Transfer Protocol (HTTP) connection packets corresponding to web access requests in a network, extract from said packets the domain name and IP address corresponding to each webpage, and record the correspondence between the domain name and the IP address;
a statistics unit 202, configured to perform statistics on the captured results and obtain the number of different domain names corresponding to the same IP address;
a hijacking IP address determining unit 203, configured to determine, according to the number of different domain names corresponding to the same IP address, the IP addresses used for DNS black hole hijacking, and to save the determined IP addresses used for DNS black hole hijacking;
an IP address extraction unit 204, configured to extract the IP address from the current HTTP connection packet when a user's web access request produces said current HTTP connection packet;
a detecting unit 205, configured to determine that the user's web access request has been subjected to DNS black hole hijacking if the extracted IP address appears among the saved IP addresses used for DNS black hole hijacking.
Alternatively, said hijacking IP address determining unit 203 may also comprise:
an extraction subunit, configured to extract, as IP addresses to be verified, the IP addresses whose number of corresponding different domain names reaches a preset threshold;
a response information obtaining subunit, configured to obtain the server response information corresponding to said IP address to be verified;
a verification subunit, configured to verify said IP address to be verified according to said server response information and, if the verification passes, to determine the IP address to be verified to be an IP address used for DNS black hole hijacking.
In a specific implementation, said server response information comprises a web content data packet, in which case said verification subunit may comprise:
a first verification subunit, configured to extract the web page content from the web content data packet corresponding to said IP address to be verified, and to compare the extracted web page content with the web page content corresponding to IP addresses already known to be used for DNS black hole hijacking; if the similarity reaches a preset threshold, the verification passes.
Alternatively, under another verification approach, since said server response information also comprises web page code, said verification subunit may comprise:
a second verification subunit, configured to judge whether said web page code contains redirect instruction code; if it does, the verification passes.
Corresponding to the foregoing apparatus for detecting DNS black hole hijacking, the embodiment of the invention also provides a system for detecting DNS black hole hijacking. Referring to Fig. 3, this system may comprise a server end 301 and a client 302, wherein said server end 301 comprises:
a capture unit 3011, configured to capture the Hypertext Transfer Protocol (HTTP) connection packets corresponding to web access requests in a network, extract from said packets the domain name and IP address corresponding to each webpage, and record the correspondence between the domain name and the IP address;
a statistics unit 3012, configured to perform statistics on the captured results and obtain the number of different domain names corresponding to the same IP address;
a hijacking IP address determining unit 3013, configured to determine, according to the number of different domain names corresponding to the same IP address, the IP addresses used for DNS black hole hijacking, and to save the determined IP addresses used for DNS black hole hijacking;
said client 302 comprises:
an IP address extraction unit 3021, configured to extract the IP address from the current HTTP connection packet when a user's web access request produces said current HTTP connection packet;
an uploading unit 3022, configured to upload the extracted IP address to the server end;
said server end 301 further comprises:
a detecting unit 3014, configured to determine that the user's web access request has been subjected to DNS black hole hijacking if the extracted IP address appears among the saved IP addresses used for DNS black hole hijacking.
With the above apparatus and system provided by the embodiment of the invention, a large number of HTTP packets can be collected, the correspondence between domain names and IP addresses can be extracted from them, and statistics can be performed on it to obtain the IP addresses that may be used for DNS black hole hijacking. Then, when a user accesses a webpage, the IP address in the HTTP packet can be extracted and checked against the IP addresses used for DNS black hole hijacking; if it appears among them, it can be concluded that the user's web page access has been subjected to DNS black hole hijacking. As can be seen, DNS black hole hijacking behavior can be detected while the user is accessing a webpage, which spares the user the interference of hijack pages such as advertisement or navigation pages.
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software together with the necessary general-purpose hardware platform. On this understanding, the essence of the technical scheme of the present invention, or the part of it that contributes over the prior art, can be embodied in the form of a software product. This computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and comprises a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to carry out the methods described in each embodiment, or in certain parts of the embodiments, of the present invention.
The embodiments in this specification are described progressively; for parts that are identical or similar between embodiments, the embodiments can be referred to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus and system embodiments are described relatively briefly because they are substantially similar to the method embodiment; for the relevant parts, refer to the description of the method embodiment. The apparatus and system embodiments described above are only schematic: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the present embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The method, apparatus, and system for detecting DNS black hole hijacking provided by the present invention have been described in detail above. Specific examples have been used herein to set forth the principle and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. At the same time, those of ordinary skill in the art may, in accordance with the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this description should not be construed as limiting the present invention.