CN104363251A - Website security detecting method and device - Google Patents

Info

Publication number
CN104363251A
Authority
CN
China
Prior art keywords
website
link
new url
known specific
links
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410769106.8A
Other languages
Chinese (zh)
Other versions
CN104363251B (en)
Inventor
龙专
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qax Technology Group Inc
Secworld Information Technology Beijing Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd, Qizhi Software Beijing Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201410769106.8A priority Critical patent/CN104363251B/en
Publication of CN104363251A publication Critical patent/CN104363251A/en
Application granted granted Critical
Publication of CN104363251B publication Critical patent/CN104363251B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433 Vulnerability analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention relates to a website security detection method comprising the following steps: receiving, through a remote port, collected data containing Hypertext Transfer Protocol request packets; determining, from the links contained in the request packets, associated new links that belong to a known specific website; and performing vulnerability scanning on the web pages corresponding to the new links. The invention also provides a corresponding website security detection device. With this method and device, a known specific website and its new links can be discovered promptly and the new links can be scanned for vulnerabilities in real time, so that missed scans are avoided. Redundant scanning of invalid and duplicate links is also avoided, so website security is maintained efficiently and in a timely manner.

Description

Website security detection method and device
Technical field
The present invention relates to Internet security technology, and in particular to a website security detection method and device.
Background art
Website access carries various security risks, for example cookie poisoning, application buffer overflows, cross-site scripting attacks, and known security vulnerabilities. These website security problems can in turn compromise the security of user data. Visitors therefore want to know how secure a website is and naturally prefer safer websites, while webmasters want to patch vulnerabilities promptly, resolve the security problems of their sites, and offer visitors a safer browsing platform.
Website security detection is usually performed by a scanner that actively crawls web pages and runs security tests on the crawled pages. To avoid the extra load that crawling places on the web server, crawling for security testing is usually triggered on a timer or manually by a user. However, website business logic (code), which carries the information a site serves, is now updated frequently, and the information security staff at each company are too few to support so many and such frequent security tests. This creates a conflict: frequent scanning increases server load and strains scarce staff, while scanning at intervals means that new pages are not tested in time. The shortcomings of prior-art website and web page security testing include, but are not limited to, the following:
For example, an orphan page is a page that a crawler cannot reach. If such a page has a vulnerability and an attacker finds it, it poses a serious security risk. Existing vulnerability scanners run security tests only after crawling all of a site's links with spider technology, so they cannot promptly scan newly launched domain names and cannot detect vulnerabilities in orphan pages.
As another example, large websites (such as news and e-commerce sites) publish a large number of new pages every day, and scheduled scanning cannot test newly published pages in time. For instance, if the webmaster schedules a scan at 0:00 every day, a page published at 1:00 is not tested until 23 hours later. If such newly published pages contain vulnerabilities, the website is exposed during that window.
Summary of the invention
The object of the present invention is to overcome one or more aspects of the above problems by providing a website security detection method and device.
To achieve this object, the present invention adopts the following technical solutions:
A website security detection method provided by the present invention comprises the following steps:
Receiving, through a remote port, collected data containing Hypertext Transfer Protocol request packets;
Determining, from the links contained in the request packets, associated new links that belong to a known specific website;
Performing vulnerability scanning on the web pages corresponding to the new links.
According to one embodiment of the present invention, the source IP address of the collected data is the destination IP address of the request packet. Preferably, the collected data comes from a collection module installed on the device at that source IP address.
According to another embodiment of the present invention, the source IP address of the collected data is the source IP address in the request packet. Preferably, the collected data comes from a browser plug-in installed on the device at that source IP address. Preferably, before the associated new links belonging to the known specific website are determined, the links contained in the request packets are gathered and the duplicate links among them are removed.
According to one embodiment of the present invention, the step of removing duplicate links comprises the following sub-steps:
Identifying as duplicates multiple links, generated by database access, that differ only in their variables;
Retaining only one of the duplicate links, thereby removing the duplicates.
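As an illustration of these two sub-steps, the following is a minimal Python sketch. It assumes that "variables" means URL query parameters and that two links are duplicates when they share scheme, host, path, and parameter names; the patent text does not fix this definition, and the names `link_key` and `remove_duplicates` and the example.com URLs are hypothetical.

```python
from urllib.parse import urlsplit, parse_qsl


def link_key(url: str) -> tuple:
    """Build a comparison key that ignores query-parameter values.

    Assumption: links differing only in variable values
    (e.g. ?id=1 vs ?id=2) are served by the same database-backed code.
    """
    parts = urlsplit(url)
    # Keep parameter names only; sort them so parameter order is irrelevant.
    names = tuple(sorted({name for name, _ in
                          parse_qsl(parts.query, keep_blank_values=True)}))
    return (parts.scheme.lower(), parts.netloc.lower(), parts.path, names)


def remove_duplicates(links):
    """Keep the first link seen for each key, preserving input order."""
    seen = set()
    kept = []
    for link in links:
        key = link_key(link)
        if key not in seen:
            seen.add(key)
            kept.append(link)
    return kept


links = [
    "http://example.com/news.php?id=1",
    "http://example.com/news.php?id=2",
    "http://example.com/forum.php?tid=7&page=2",
    "http://example.com/forum.php?page=9&tid=3",
    "http://example.com/about.html",
]
print(remove_duplicates(links))
```

Keeping the first occurrence is an arbitrary choice; the text only requires that one of the duplicates be retained.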
According to another embodiment of the present invention, the step of removing duplicate links comprises the following sub-steps:
Identifying multiple links with the same signature as duplicates;
Retaining only one of the duplicate links, thereby removing the duplicates.
According to one embodiment of the present invention, the known specific website and/or its new links are specified in advance through user settings received via a graphical user interface. Preferably, the settings received by the graphical user interface include a domain name or IP address pointing to the website.
According to one embodiment of the present invention, a link in a request packet is determined to be an associated new link of the known specific website when the IP address the link points to is the IP address the known specific website points to, or falls within the IP address range to which that address belongs.
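The IP-based membership test can be sketched with Python's standard `ipaddress` module. This is only an illustration under stated assumptions: the link's host is taken to be already resolved to an IP address (DNS resolution is omitted), and the addresses, the range, and the name `belongs_to_known_site` are hypothetical values drawn from the documentation address blocks.

```python
import ipaddress

# Hypothetical records for one known specific website:
# individual addresses and the address ranges they belong to.
KNOWN_SITE_ADDRESSES = {ipaddress.ip_address("203.0.113.10")}
KNOWN_SITE_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]


def belongs_to_known_site(link_ip: str) -> bool:
    """Return True if the link's IP matches a known address or range."""
    addr = ipaddress.ip_address(link_ip)
    if addr in KNOWN_SITE_ADDRESSES:
        return True
    # Membership test against each configured address range.
    return any(addr in net for net in KNOWN_SITE_NETWORKS)


print(belongs_to_known_site("198.51.100.77"))
```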
According to one embodiment of the present invention, a link in a request packet is determined to be an associated new link of the known specific website when the registration information of the link's domain name is found, by comparison, to be identical to the registration information of the known specific website's domain name.
Preferably, a list of known specific websites is provided for recording the domain names and/or corresponding IP addresses of one or more known specific websites.
Further, the step of determining, from the links contained in the request packets, the associated new links belonging to the known specific website comprises the following sub-steps:
Extracting the links from all obtained request packets;
Removing, from the extracted links, duplicate links that point to web pages with the same code;
Determining the new links among them and adding these new links to a to-be-scanned queue.
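The three sub-steps above can be sketched as a small pipeline. The sketch is hypothetical throughout: it assumes the links have already been extracted from the request packets as URL strings, it simplifies duplicate removal to exact string matching, and it treats a link as "new" when it has not been queued before. The names `find_new_links`, `known_hosts`, and `already_seen` do not come from the patent.

```python
from collections import deque
from urllib.parse import urlsplit


def find_new_links(request_urls, known_hosts, already_seen, scan_queue):
    """Filter extracted links and queue the ones not seen before.

    request_urls: links extracted from the collected request packets.
    known_hosts:  host names of the known specific websites (assumed input).
    already_seen: set of links queued or scanned earlier; updated in place.
    scan_queue:   to-be-scanned queue; new links are appended to it.
    """
    for url in request_urls:
        host = urlsplit(url).hostname
        if host not in known_hosts:
            continue  # the link does not belong to a known specific website
        if url in already_seen:
            continue  # duplicate within this batch or scanned in the past
        already_seen.add(url)
        scan_queue.append(url)
    return scan_queue


queue = deque()
seen = {"http://www.example.com/old.php"}
find_new_links(
    [
        "http://www.example.com/old.php",
        "http://www.example.com/new.php",
        "http://www.example.com/new.php",
        "http://other.test/index.php",
    ],
    known_hosts={"www.example.com"},
    already_seen=seen,
    scan_queue=queue,
)
print(list(queue))
```

A fuller version would replace the exact-match check with one of the duplicate-removal embodiments described earlier.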
According to one embodiment of the present invention, the step of performing vulnerability scanning on the web pages pointed to by the new links comprises the following sub-steps:
Obtaining the new links from the to-be-scanned queue in which they are recorded;
Performing vulnerability scanning on the web pages to which the new links map directly.
According to another embodiment of the present invention, the step of performing vulnerability scanning on the web pages corresponding to the new links comprises the following sub-steps:
Obtaining the new links from the to-be-scanned queue in which they are recorded;
Fetching the web pages to which the new links in the queue map and adding them to a local page store;
Performing vulnerability scanning on the web pages in the page store that were downloaded according to the new links.
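The second variant (queue, local page store, then scan) might look like the sketch below. The patent does not describe the downloader or the scanner, so both are passed in as caller-supplied functions; `scan_new_links`, `fetch`, `scan`, and the toy check used in the usage lines are hypothetical placeholders, not a real vulnerability test.

```python
from collections import deque


def scan_new_links(scan_queue, fetch, scan, page_store):
    """Drain the to-be-scanned queue through a local page store.

    fetch(url) -> page content; scan(content) -> list of findings.
    Both callables are assumptions supplied by the caller.
    """
    results = {}
    while scan_queue:
        url = scan_queue.popleft()          # take the next new link
        page_store[url] = fetch(url)        # download into the local store
        results[url] = scan(page_store[url])  # scan the stored copy
    return results


# Usage with stand-in data instead of real network access.
fake_site = {
    "http://www.example.com/a.php": "<html><script>var x;</script></html>",
    "http://www.example.com/b.php": "<html>plain page</html>",
}
store = {}
report = scan_new_links(
    deque(fake_site),
    fetch=fake_site.get,
    # Placeholder check only; a real scanner would run actual tests here.
    scan=lambda html: ["inline-script"] if "<script" in html else [],
    page_store=store,
)
print(report)
```

Scanning the stored copy rather than the live page means each page is requested from the server only once per scan.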
Further, the method comprises a subsequent step of displaying a graphical user interface to output the results of the vulnerability scanning.
A website security detection device provided by the present invention comprises:
A packet capture unit, configured to receive, through a remote port, collected data containing Hypertext Transfer Protocol request packets;
A new-link discovery unit, adapted to determine, from the links contained in the request packets, associated new links that belong to a known specific website;
A detection unit, configured to perform vulnerability scanning on the web pages corresponding to the new links.
According to one embodiment of the present invention, the source IP address of the collected data received by the packet capture unit is the destination IP address of the request packet. Preferably, the collected data comes from a collection module installed on the device at that source IP address.
According to another embodiment of the present invention, the source IP address of the collected data received by the packet capture unit is the source IP address in the request packet. Preferably, the collected data comes from a browser plug-in installed on the device at that source IP address.
Preferably, the new-link discovery unit is configured to gather the links contained in the request packets and remove the duplicate links among them before determining the associated new links belonging to the known specific website.
According to one embodiment of the present invention, the new-link discovery unit comprises:
A duplicate-check submodule, configured to identify as duplicates multiple links, generated by database access, that differ only in their variables;
A removal submodule, adapted to retain only one of the duplicate links, thereby removing the duplicates.
According to another embodiment of the present invention, the new-link discovery unit comprises:
A duplicate-check submodule, configured to identify multiple links with the same signature as duplicates;
A removal submodule, adapted to retain only one of the duplicate links, thereby removing the duplicates.
According to one embodiment of the present invention, the device further comprises a setting unit, configured to display a graphical user interface to receive user settings through which the known specific website and/or its new links are specified in advance. Preferably, the settings received by the graphical user interface include a domain name or IP address pointing to the specific website.
According to one embodiment of the present invention, the device further comprises a setting unit, configured to determine that a link in a request packet is an associated new link of the known specific website when the IP address the link points to is the IP address the known specific website points to, or falls within the IP address range to which that address belongs.
According to one embodiment of the present invention, the device further comprises a setting unit, configured to determine that a link in a request packet is an associated new link of the known specific website when the registration information of the link's domain name is found, by comparison, to be identical to the registration information of the known specific website's domain name.
Preferably, the device further comprises a list of known specific websites for recording the domain names and/or corresponding IP addresses of one or more known specific websites.
Further, the new-link discovery unit comprises:
An extraction module, configured to extract the links from all obtained request packets;
A deduplication module, configured to remove, from the links extracted by the extraction module, duplicate links that point to web pages with the same code;
An adding module, configured to determine the new links among them and add these new links to a to-be-scanned queue.
According to one embodiment of the present invention, the detection unit comprises:
An acquisition unit, configured to obtain the new links from the to-be-scanned queue in which they are recorded;
An execution unit, configured to perform vulnerability scanning on the web pages to which the new links map.
According to another embodiment of the present invention, the detection unit comprises:
An acquisition unit, configured to obtain the new links from the to-be-scanned queue in which they are recorded;
A download unit, configured to download the web pages to which the new links in the queue map and add them to a local page store;
An execution unit, configured to perform vulnerability scanning on the web pages in the page store that were downloaded according to the new links.
Further, the device comprises a display unit, configured to display a graphical user interface to output the results of the vulnerability scanning.
Compared with the prior art, the present invention has at least the following advantages:
1. The invention receives collected data containing Hypertext Transfer Protocol request packets through a remote port, which enables a remote, distributed architecture similar to client/server (C/S). It can therefore receive the request packets generated by requests sent to the server of a known specific website, and use them to run targeted vulnerability scans on the web pages of the known specific website and its associated links. First, the invention screens new links specifically for known specific websites and scans those. Second, it can receive collected data in real time through the remote port. The number of new links obtained and confirmed in real time is very small relative to the total number of links of all known specific websites; links that are not new have usually been scanned in past use and need not be scanned again. The computation required to scan these new links is therefore low, and the response load placed on the server is small. The invention thus provides the technical conditions for scanning, in real time, the pages pointed to by links newly published on a specific website. It avoids the missed scans and potential security incidents that arise in the time windows between scheduled or ad hoc scans, and gives network administrators a more efficient vulnerability detection tool.
2. The invention further controls the origin of the collected data by requiring the source IP address of the collected data to equal either the source IP address or the destination IP address contained in the request. The former suits installation on a web developer's machine, where capturing the requests the developer issues during page debugging enables more timely vulnerability scanning. The latter suits installation on the server that hosts the pages, where every request packet arriving from outside can likewise be captured at the earliest moment. For a website, the first access request for a newly published link is usually issued by the site administrator for debugging. Even if the administrator does not debug it, once the page is uploaded to the server, its first access necessarily arrives as a request packet sent to that server for the page the new link points to. With the above restriction, the request packets obtained come precisely from the server or from the administrator's debugging client, both of which lie on the path that any access to the page must take. The invention can therefore obtain request packets for the newest pages in the vast majority of cases and can in theory cover all pages, including orphan pages. Yet only the subset of these request packets that corresponds to new links is ultimately scanned. The invention thus avoids the prior-art drawback of needing a full scan every time to prevent missed scans, and achieves comprehensive scanning coverage in a more lightweight way.
3. The invention further removes duplicate links among the new links, which reduces repeated scanning of pages that are in substance generated by the same code. For links such as those of news pages and forum pages this is a substantial optimization with a very high deduplication rate. It further reduces wasted computation during vulnerability scanning and improves the overall efficiency of the machine.
4. The request packets can be obtained either by adding a browser plug-in to the browser of the party issuing the request, or by installing a client on the server hosting the known specific website. The overall architecture is flexible and open, which facilitates secondary development.
5. The invention allows users to add known specific websites through a graphical user interface and also provides a way for the program itself to determine known specific websites dynamically. It can further issue corresponding warnings after vulnerability scanning, which gives it strong interactivity and a good human-computer interaction experience.
In summary, the present invention achieves a more comprehensive and efficient technical solution for website security detection.
Additional aspects and advantages of the invention are given in part in the following description; they will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic diagram of the website security detection equipment of the present invention connected to an existing network topology;
Fig. 2 is a schematic diagram of an existing network topology that is a variation of Fig. 1;
Fig. 3 is a flowchart of one embodiment of a website security detection method of the present invention;
Fig. 4 is a detailed flowchart of step S12 of a website security detection method of the present invention;
Fig. 5 is a flowchart of another embodiment of a website security detection method of the present invention;
Fig. 6 is a schematic block diagram of one embodiment of a website security detection device of the present invention;
Fig. 7 is a schematic block diagram of another embodiment of a website security detection device of the present invention;
Fig. 8 is a structural diagram of the new-link discovery unit in a website security detection device of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below, with examples shown in the drawings, in which identical or similar reference numerals throughout denote identical or similar elements, or elements with identical or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and are not to be construed as limiting it.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "the", and "said" used herein may also include the plural. It should be further understood that the word "comprising" used in this specification indicates the presence of the stated features, integers, steps, operations, elements, and/or components, but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. When an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. In addition, "connected" or "coupled" as used herein may include wireless connection or wireless coupling. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by a person of ordinary skill in the field to which the invention belongs. It should also be understood that terms such as those defined in general dictionaries are to be read as having a meaning consistent with their meaning in the context of the prior art and, unless specifically defined herein, are not to be interpreted in an idealized or overly formal sense.
Those skilled in the art will understand that "terminal" and "terminal device" as used herein include both devices that have only a wireless signal receiver and no transmitting capability, and devices that have receiving and transmitting hardware capable of two-way communication over a bidirectional communication link. Such devices may include: cellular or other communication devices, with a single-line display, a multi-line display, or no multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax, and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include a radio frequency receiver, pager, Internet/intranet access, web browser, notepad, calendar, and/or GPS (Global Positioning System) receiver; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio frequency receiver. A "terminal" or "terminal device" as used herein may be portable, transportable, installed in a vehicle (air, sea, and/or land), or suited and/or configured to run locally and/or in distributed form at any other location on Earth and/or in space. A "terminal" or "terminal device" as used herein may also be a communication terminal, an Internet access terminal, or a music/video playback terminal, for example a PDA, a MID (Mobile Internet Device), and/or a mobile phone with music/video playback, or a device such as a smart TV or set-top box.
Those skilled in the art will understand that the terms server, cloud, and remote network device as used herein are equivalent and include, but are not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud formed of multiple servers. Here, a cloud consists of a large number of computers or network servers based on cloud computing, where cloud computing is a form of distributed computing: a virtual supercomputer composed of a group of loosely coupled computers. In embodiments of the invention, the remote network device, the terminal device, and the WNS server may communicate by any means of communication, including but not limited to mobile communication based on 3GPP, LTE, or WiMAX; computer network communication based on the TCP/IP or UDP protocols; and short-range wireless transmission based on Bluetooth or infrared transmission standards.
Those skilled in the art will understand that "application", "application program", "application software", and similar expressions used in the present invention denote the same concept known to those skilled in the art: computer software, organically composed of a series of computer instructions and related data resources, that is suitable for electronic execution. Unless otherwise specified, this naming is not restricted by the kind or level of programming language, nor by the operating system or platform on which the software runs. Naturally, the concept is likewise not restricted to any type of terminal.
The method and device of the present invention can be implemented in software, installed and run on a computer device, which then forms a website detection device. To further explain the embodiments, it helps first to understand how enterprise web servers are typically deployed. Each enterprise may have one or more websites, and each enterprise website may be deployed across one or more servers. In the simple case shown in Fig. 1, the servers 81 and 82 of an enterprise website are connected directly to one switch 80 to provide service; in the more complex topology shown in Fig. 2, multiple servers 81 and 82 may be connected to different switches 80. A device with the software of the invention installed does not need to capture request packets directly at the switch 80. Instead it receives, through a remote port, collected data that other functional modules have gathered and encapsulated, containing the Hypertext Transfer Protocol request packets sent to the server hosting the known specific website. The device therefore does not depend on the machine room and can connect directly to the Internet, opening a pre-agreed remote port to receive the data collected by clients. As described below, the client can take two forms, a client program and a browser plug-in, suited respectively to installation on the server hosting the known specific website and in the browser of a computer used for debugging web pages.
Fig. 3 shows one embodiment of the present invention as a flow of steps. This embodiment is a concrete implementation of the core technique of the website security detection method of the invention and comprises the following steps:
Step S11: receive, through a remote port, collected data containing Hypertext Transfer Protocol request packets.
As stated above, the invention can collect the request packets through a distributed, C/S-like architecture. Specifically, the remote port is the communication port agreed between the software implementing the website security detection method (installed on a service host, which is the detection device of the invention) and its clients. Those skilled in the art can readily implement this communication with the well-known TCP/IP protocols. After a client has collected Hypertext Transfer Protocol request packets, it sends them to the software over the public network (or, equally, over a local area network).
By the basic principles of computer communication, data exchanged between the client and the detection device is expressed in messages, so the collected data containing the request packets is also transmitted in units of messages. Each message carries its own source IP address and destination IP address: the source is the IP address of the computer on which the client runs, and the destination is the IP address of the detection device. Each request packet likewise carries the source IP address from which the request was issued and the destination IP address to which it is sent: the source is the IP address of the computer issuing the request, and the destination is the IP address of the server hosting the website being accessed. Thus, by comparing the source IP address of a received message with the source and destination IP addresses in the request packet that the message contains, the detection device can determine whether the collected data comes from the requesting client that issued the request to the server of the known specific website, or from the server that provides the known specific website to that requester. The origin of the request packets can be identified in this way, and the links in those request packets can then be put to further, purposeful use.
Identifying the origin of request packets as described here neither requires nor excludes that a programmed implementation actually extracts and compares, at run time, the source IP address of the collected data, the source IP address of the request packet, and the destination IP address of the request packet. The description here is intended only to indicate the logical relationship between the source IP address of the collected data and the source or destination IP address of the request packet. Those skilled in the art will fully understand this and may choose among these technical means flexibly.
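The address comparison described above can be illustrated with a short Python sketch. It is not the patent's implementation: the record type `CapturedRequest`, its field names, the function `capture_origin`, and the sample addresses are all hypothetical, and the sketch assumes the three addresses have already been extracted as strings.

```python
from dataclasses import dataclass


@dataclass
class CapturedRequest:
    message_src_ip: str  # sender of the collected-data message (the client's host)
    request_src_ip: str  # source IP inside the HTTP request packet
    request_dst_ip: str  # destination IP inside the HTTP request packet


def capture_origin(rec: CapturedRequest) -> str:
    """Classify where the request packet was captured."""
    if rec.message_src_ip == rec.request_dst_ip:
        # Collected on the server hosting the website (collection module).
        return "server"
    if rec.message_src_ip == rec.request_src_ip:
        # Collected on the requesting machine (e.g. a browser plug-in).
        return "client"
    return "unknown"


on_server = CapturedRequest("203.0.113.10", "192.0.2.7", "203.0.113.10")
on_client = CapturedRequest("192.0.2.7", "192.0.2.7", "203.0.113.10")
print(capture_origin(on_server), capture_origin(on_client))
```

In practice address translation between the client and the detection device could make the addresses differ, so the "unknown" branch is kept rather than assuming one of the two cases always holds.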
The collected data received by the detection device may contain a single request packet or multiple request packets encapsulated with the communication protocol. This can be configured flexibly by those skilled in the art and is particularly suited to being set by time interval, so the number of request packets in each transmission need not be the same. For example, the client may use 10 minutes as the time unit: it continuously collects request packets and, at the end of each unit, packs them into the collected data and transmits it to the detection device. Whether many or few requests are initiated within a period does not affect the implementation of the present invention.
The Hypertext Transfer Protocol (HTTP) request packets used for website access take two forms, GET requests and POST requests. Although the two differ, both are objects processed by the present invention. Typically, the fields of an HTTP request packet mainly include: protocol, server domain name, port number, request path, GET parameter names, POST parameter names, extension, destination server network segment, and so on. Both GET and POST request packets contain the URL of a web page. The URL of a web page, i.e. a hyperlink, follows an agreed format from its domain name down to its page. The end of the link describes the resource it points to, and the portion before it is its path. Take the address http://www.360.cn/test/admin.php as an example: http:// denotes the protocol, www.360.cn is the domain name, test is a directory on the website, admin.php is the resource page pointed to, and http://www.360.cn/test/ is the path of this link relative to the admin.php page. By the same token, http://www.360.cn/test/admin/admin.php is a deeper-level link than http://www.360.cn/test/admin.php.
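The decomposition of a link into protocol, domain name, path and resource can be sketched with Python's standard `urllib.parse` module. The function name `split_link` and the returned dictionary keys are assumptions chosen for this illustration.

```python
import posixpath
from urllib.parse import urlsplit


def split_link(url: str) -> dict:
    """Break a link into the parts named in the description."""
    parts = urlsplit(url)
    directory, resource = posixpath.split(parts.path)
    base = f"{parts.scheme}://{parts.netloc}{directory}"
    if not base.endswith("/"):
        base += "/"
    return {
        "protocol": parts.scheme,
        "domain": parts.hostname,
        "path": base,          # everything before the resource
        "resource": resource,  # the page or file the link points to
        "depth": len([seg for seg in directory.split("/") if seg]),
    }
```

Comparing the `depth` values of two links under the same domain gives a simple way to tell which one lies deeper in the site.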
To suit different implementations, the collected data containing the HTTP request packets gathered by the client can be obtained in any one of the following ways, or in a combination of them:
One, install, on the server that provides the website access service, a client program (acquisition module) that collects the request packets of web access requests sent to that server.
Following the analysis above, an acquisition module can be developed. The acquisition module is a client program process that runs on a server: the client program is installed on a server that provides website access services, in particular on a server hosting one of the known specific websites referred to in the present invention. Once the client program is running, the request packets of all access requests that target this server are collected by it. At regular intervals (or, of course, in real time), the client program assembles these request packets into collected data in the format of the software protocol agreed with the detection device and sends it to the remote port. After receiving the collected data through the remote port, the present invention parses it to obtain the HTTP request packets it contains. In this case, the destination IP address of the externally initiated request packets is the IP address of this server, and the source IP address of the message carrying the collected data is also the IP address of this server. Using this correspondence, the detection device can recognise that the collected data it receives comes from the server responsible for responding to the request packets it contains.
Two, install a browser plug-in in the browser of a device that initiates requests to the server providing the known specific website access service.
Similarly, a browser plug-in can be developed and installed on a computer terminal used for online debugging of the web pages of the aforementioned known specific website. Once the browser is running and a page is accessed through some link, the plug-in obtains the request packet generated by that access and, as in the previous way, assembles the request packets into collected data and sends it to the detection device through the remote port. After obtaining the collected data from the browser plug-in, the detection device parses it to obtain the HTTP request packets it contains. In this case, the source IP address of the request packets initiated by the browser is the IP address of the computer on which the browser runs, and the source IP address of the message carrying the collected data is also the IP address of that computer. Using this correspondence, the detection device can recognise that the collected data it receives comes from the client that initiated the requests.
Those skilled in the art will appreciate that the acquisition module and the browser plug-in both, in essence, perform the function of obtaining request packets. Both are computer programs and differ only in form and in application details. How to obtain request packets programmatically is known in the prior art; for brevity it is not described in detail here, and those skilled in the art can obtain the relevant knowledge from the prior art and put it into practice. It will therefore also be understood that the acquisition module can equally be implemented on the client computer that initiates the requests.
The two ways of obtaining request packets described above are proposed to meet different application needs. Whichever way is adopted, the HTTP request packets in the collected data can be extracted using the prior art for further processing.
Step S12, determine, using the links contained in the request packets, the associated new links belonging to known specific websites.
The websites targeted by the present invention are specific ones, generally one or more known websites of the enterprise applying the method. These websites share common features: their links resolve to a relatively specific range of IP addresses, their domain names are owned by the enterprise or by its customers, or they are target websites that the enterprise helps to manage. More specifically, this particular relationship refers to the websites that the software implementing this method needs to monitor. At the technical level, whether a website is one the software needs to monitor can be decided either manually through a provided interface or by a combined judgment based on the link and/or the IP address and/or the domain name registration information. Therefore, the basis for recognising a known specific website should not be understood as limited to a particular domain name or its IP address. It also covers websites that have not been explicitly set by hand but that the enterprise does in fact intend to include as detection targets, including any link with a newly added domain name that resolves to an IP address already occupied by some of the known specific websites.
It follows that, unlike crawler technology, the present invention does not need carefully chosen seed URLs, but it does need some basic settings about specific websites in order to define the known specific websites. Corresponding to the explanation above, there are several ways of setting these known specific websites. In specifying a known specific website, whether what is provided is an IP address or a URL such as a domain name, what is provided is in essence a link to the website, so this process is in essence also the process of determining the new links of the present invention. Several concrete methods for determining known specific websites and/or their new links are disclosed below:
One, use a graphical user interface to set the known specific websites and/or their associated new links.
Specifically, when the software implementing the present invention runs for the first time, it presents a graphical user interface through which the user sets part of the known specific websites. The user completes the setting by entering content related to these websites, thereby specifying one or more known specific websites in advance. The content may be one or more domain names, such as so.com or 360.cn, or the IP addresses of servers, or a segment of contiguous IP addresses, or a range made up of discrete IP address segments. As noted above, each such setting can be understood in essence as an associated new link and can be stored in a known specific website list for later use by the method. It should be pointed out that this known specific website list is in effect also a link library, so it can be used directly as the link library or be treated as a data source for the link library. As in crawler technology, the link library referred to here can serve directly as the subsequent queue to be scanned, or merely supply base data for that queue. On this basis, the domain names or IP addresses and related information used to determine part of the known specific websites constitute the new links of the present invention, or can at least be used to construct them, and become the objects of the first scan performed by the software. During later maintenance, new links can continue to be added in this way; when the domain name of a new link differs from the domain names of the other known specific websites, a new known specific website is in effect added by extending the set of domain names.
Two, use domain name registration information to determine the associated new links of known specific websites.
The associated new links of known specific websites include all links under websites whose domain names are already registered (identifiable by the registered domain name they contain) and/or all links under websites whose domain names are not yet registered. The latter refers to the case where a link obtained from a request packet in this step contains a new domain name and falls outside the link range of the currently existing known specific websites, so that it cannot be determined directly whether the link belongs to one of the enterprise's own websites and should be treated as an associated new link of a known specific website. Further technical means are then needed to make this determination. To this end, the interface provided by a domain name registration website can be called to query the new domain name in the link and determine its registration information, such as the owner of the domain name and its filing number, and to check whether this registration information is identical to that of the domain names of the currently existing known specific websites. If it is identical, the new link is treated as an associated new link of a known specific website and is used in the method; otherwise the request packet is discarded. The new domain name and/or the new links beneath it can then be added directly to the known specific website list described above for later use. The query of the registration information of a new domain name can obviously be performed either manually or by software. In the former case, it is in effect follow-up maintenance of the first way described above. In the latter case, it enables the present invention to extend and maintain the known specific website list dynamically. If the known specific website list is the link library or the queue to be scanned, this is in essence maintenance of a new link list, which naturally serves as the data basis for the processing stages required later in the present invention.
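The registration comparison can be sketched as follows. This is a minimal illustration only: the `lookup` callable stands in for whatever query interface a domain name registration website actually provides, and the `(owner, filing number)` tuple, the function name and the sample records are all assumptions made here.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical registration record: (domain owner, filing number).
RegistrationInfo = Tuple[str, str]


def is_associated_by_registration(
    new_domain: str,
    known_domains: Iterable[str],
    lookup: Callable[[str], Optional[RegistrationInfo]],
) -> bool:
    """True if the new domain shares registration info with a known domain."""
    new_info = lookup(new_domain)
    if new_info is None:
        # No record found: the link cannot be attributed to the enterprise.
        return False
    return any(lookup(known) == new_info for known in known_domains)
```

In use, a link whose domain passes this check would be added to the known specific website list, while a link that fails it would be discarded.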
Three, use IP addresses to dynamically determine the associated new links of known specific websites.
As is well known, there is a mapping between domain names and IP addresses, so the corresponding IP address can be determined from a known domain name. The same website may be served by servers at several IP addresses, so one-to-many and many-to-many mappings may exist between websites and IP addresses. In practice, an enterprise usually sets up the servers for its websites on a segment of contiguous IP addresses. Accordingly, the IP address segments occupied by the currently existing known specific websites can be determined. When the link in a request packet contains a new domain name that does not belong to any currently existing known specific website, the IP address to which the new domain name resolves can be checked against the IP addresses occupied by the existing known specific websites. If it is one of them, the link in the request packet is likewise treated as an associated new link of a new known specific website and added to the known specific website list described above. Similarly, if the known specific website list is the link library or the queue to be scanned, this processing is in essence maintenance of a new link list, which naturally serves as the data basis for the processing stages required later in the present invention.
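The IP-based check can be sketched with Python's standard `ipaddress` and `socket` modules. The function name, the CIDR notation used for the address segments and the injectable `resolve` parameter are assumptions made for this illustration; the original text does not prescribe how segments are stored or how names are resolved.

```python
import ipaddress
import socket
from typing import Callable, Iterable


def is_associated_by_ip(
    domain: str,
    known_segments: Iterable[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """True if the domain resolves into a segment used by known websites."""
    try:
        address = ipaddress.ip_address(resolve(domain))
    except (OSError, ValueError):
        # Resolution failed or returned something that is not an IP address.
        return False
    return any(
        address in ipaddress.ip_network(segment) for segment in known_segments
    )
```

Passing the resolver as a parameter keeps the check testable without network access; by default it falls back to an ordinary DNS lookup.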
It follows that one key difference between the present invention and crawler technology is that the present invention has definite known specific websites, which can either be given manually at initialisation or be recognised and added dynamically by the software implementing this method, without depending strictly on seed URLs as crawler technology does. These known specific websites are in essence a series of links. They can be maintained separately in a list, the list can be used as the link library, or the list can even be used directly as the queue to be scanned. How exactly the list is used is simply a matter of applying database techniques flexibly within this method and will be apparent to those skilled in the art. For example, in one arrangement the known specific website list is itself the queue to be scanned: each new link is appended to the list in order with a flag marking it as not yet scanned, and the flag is changed to scanned once the scan completes. In another arrangement the list is independent and mainly records each domain name and its corresponding IP addresses, while a separate queue to be scanned is provided. When an associated new link is recognised, its domain name is added to the list and the new link itself is added to the queue to be scanned; any later link containing this domain name need not be resolved again and is added directly to the queue to be scanned. In yet another arrangement, the known specific website list, the link library and the queue to be scanned are all independent of one another: the list stores only the domain names related to the known specific websites, the link library stores all recognised links related to the known specific websites, and the queue to be scanned stores only the new links obtained from the link library. This arrangement keeps each type of data independent and can serve more complex purposes.
As noted above, any one of the three ways above can be used not only to determine the known specific websites of the present invention but is also, in essence, the process by which the present invention decides whether a link is an associated new link of a known specific website. To simplify the explanation that follows, note that in the description below, following one of the arrangements above, the known specific website list is treated as identical to the queue to be scanned disclosed later. This simplification should nevertheless be sufficient for those skilled in the art to extend it to application scenarios in which a link library is used to store valid links.
With the concept of a known specific website understood from the foregoing disclosure, those skilled in the art should be able to implement this step. Moreover, the methods given above for determining known specific websites and for deciding which links are associated new links of known specific websites further help those skilled in the art understand more detailed embodiments of this step. These two levels in effect give two variants of this step at different levels, so the technical means of determining, from the links contained in the request packets, the associated new links belonging to known specific websites has been fully disclosed.
To further demonstrate the advantages of the invention, the sub-steps of this step are disclosed below, representing another embodiment of this step. Referring to Fig. 4, the sub-steps of this step include:
Step S121, extract the links of all request packets that have been obtained.
After collecting all the request packets obtained, the software implementing this method extracts the links from them. Since an HTTP request packet contains the URL of a web page, the corresponding link, i.e. the URL of the web page, can be recovered from the packet. Some known technical analysis can be performed on these links beforehand, for example to check whether they are valid links.
A valid link is one that can normally open a web page or download a file. An invalid link is one whose page is invalid and cannot provide the user with any useful information. A link is judged invalid when it has no domain name, an incomplete domain name, an incomplete link, or a POST packet with no content. Taking a link under the domain name abcd.com as an example, if the link does not contain the domain name abcd.com, or contains only part of it, such as ad.com, the link is invalid.
The links obtained from the request packets are analysed to decide whether each is a valid link. If a link has no domain name, an incomplete domain name, an incomplete link, or a POST packet with no content, it is judged invalid and takes no part in subsequent processing; otherwise it is a valid link and is processed further.
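The validity rules above can be illustrated with a small filter. This is a sketch under assumptions: the function name `is_valid_link`, its parameters and the decision to accept subdomains of the expected domain are choices made here, not details given in the original text.

```python
from urllib.parse import urlsplit


def is_valid_link(
    url: str, expected_domain: str, method: str = "GET", body: bytes = b""
) -> bool:
    """Apply the invalid-link rules: missing or partial domain, empty POST."""
    host = urlsplit(url).hostname
    if not host:
        # No domain name at all, e.g. a bare path.
        return False
    if host != expected_domain and not host.endswith("." + expected_domain):
        # Only part of the domain is present, e.g. ad.com instead of abcd.com.
        return False
    if method.upper() == "POST" and not body:
        # A POST packet with no content carries nothing to analyse.
        return False
    return True
```

Links rejected by such a filter would simply be dropped before the deduplication step.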
Step S122, remove, from the extracted links, duplicate links that point to web pages having the same code.
Each extracted link, meaning chiefly each valid link, points in essence to a web page of the corresponding known specific website, but these valid links may contain a large number of duplicates. Duplicate links are links whose target pages have the same code: the same underlying page is merely supplied with different database access variables, so the pages differ in displayed content while their vulnerability points are identical.
For example, two valid links share the same beginning and end in /a.php?=1 and /a.php?=2 respectively. The two links differ only in the data retrieved from the database; "1" and "2" can be regarded as variables, so the only difference between the two links is the value of the variable. In this case either link can stand in for the page pointed to by the other, so only one of them needs to be kept. Alternatively, the trailing variable can be removed so that the link ends in /a.php, and all equivalent links carrying variables can be deleted, with the same effect. Duplicate links of this kind are especially common in forums.
As another example, pages on news websites commonly end in forms such as /data/2011201 and /data/2011202, where 2011201 and 2011202 can likewise be regarded as variables. Apart from these two variables, the rest of the two links is identical, so they are in essence also two duplicate links pointing to web pages with the same code.
To improve the operating efficiency of the present invention, those skilled in the art should remove the duplicate links from the extracted links by any suitable means, including known techniques. To assist those skilled in the art in implementing the present invention, two optional methods of removing duplicate links, including ones devised by the present invention, are listed below for reference:
Method one: first sort the links, then take adjacent links and compare them. When links are found to be identical except for their variables, they are determined to be multiple links that differ only in the variables produced by database access, and hence duplicate links. In this case only one of the duplicate links is kept and the rest are deleted.
Method two: first sort the links, then compare the signatures of the web pages pointed to by adjacent links. When the signatures are identical, the links are determined to be duplicates; only one of them is kept and the others are deleted.
The sorting and the use of adjacent links in the two methods above are not essential; those skilled in the art can substitute any known algorithm that helps improve the comparison, so this is not elaborated further. By removing duplicate links, each remaining link points to a distinct web page, which clearly helps improve the efficiency of the subsequent steps.
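Both deduplication methods can be sketched briefly. As the text notes that sorting and adjacent comparison are not essential, this sketch swaps them for a dictionary keyed on a normalised form of each link. Everything named here is an assumption for illustration: the normalisation rule (drop the query string, replace purely numeric path segments), the SHA-256 page signature and the `fetch` callable that returns page bytes.

```python
import hashlib
import re
from typing import Callable, Iterable, List
from urllib.parse import urlsplit


def template_key(url: str) -> str:
    """Normalise a link by discarding the parts treated as variables."""
    parts = urlsplit(url)
    # Drop the query string and replace all-digit path segments.
    path = re.sub(r"/\d+(?=/|$)", "/{n}", parts.path)
    return f"{parts.scheme}://{parts.netloc}{path}"


def dedupe_by_template(links: Iterable[str]) -> List[str]:
    """Method one: keep the first link for each variable-free template."""
    kept = {}
    for link in links:
        kept.setdefault(template_key(link), link)
    return list(kept.values())


def dedupe_by_signature(
    links: Iterable[str], fetch: Callable[[str], bytes]
) -> List[str]:
    """Method two: keep the first link for each distinct page signature."""
    kept = {}
    for link in links:
        signature = hashlib.sha256(fetch(link)).hexdigest()
        kept.setdefault(signature, link)
    return list(kept.values())
```

The template approach needs no network access, while the signature approach requires fetching each page and only merges pages whose fetched content is byte-for-byte identical, so the two are not interchangeable in practice.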
Step S123, determine the associated new links among the links processed in the previous step, and add these new links to the queue to be scanned.
As noted above, determining a new link is in fact also determining whether the link is associated with a currently existing known specific website. The associated new links belonging to known specific websites therefore include not only links whose domain names, IP addresses or more specific forms are recorded in the known specific website list (queue to be scanned), but also links whose domain names do not appear in the list while the IP addresses they map to are recorded in the list or fall within an IP address segment or range formed by the IP addresses recorded there. Determining the associated new links in this step is thus a flexible application of the three methods disclosed above for determining what belongs to a known specific website or to its associated new links. These three methods can clearly be used flexibly: one alone may be selected, or any several at once. With the first, manual registration, a website domain name is registered, after which every not-yet-scanned link under that domain name (identifiable, as noted above, by a status flag in the link library or in the queue to be scanned) is treated as an associated new link of that website. The second, registration based on domain name registration information, has the same effect as the first whether it is carried out by manual query or by program; the programmatic form, however, is a key option for this step, since it makes the program more intelligent and more automated. The third compares the IP address pointed to by the link in a request packet with the IP addresses pointed to by the links in the currently existing known specific website list, or with the contiguous IP address segments they form, to decide whether the link in the request packet should be treated as an associated new link of a known specific website. This way can extend the known specific website list automatically. If the known specific website list is a standalone list, the domain name of the new link is added to it and the new link is added to the link library (if any) and to the queue to be scanned; if the known specific website list also serves as the queue to be scanned, adding the new link to the list is itself the process of adding it to the queue to be scanned.
After the valid links have been screened for new links by any of the ways disclosed above, all the new links are obtained (if necessary, crawler technology can be applied on the basis of these new links, treating them as seed URLs to expand the set of new links further). To facilitate the subsequent steps, these new links are added to the queue to be scanned described above. Whether the queue to be scanned shares one table with the known specific website list, further shares one table with the link library, or is a separate table, those skilled in the art can, as noted above, use ordinary knowledge to register all the determined new links in the queue to be scanned and subsequently perform vulnerability scanning only on those new links.
Step S13, perform vulnerability scanning and detection on the web pages corresponding to the new links.
After the flexible variants of the steps above have been carried out and all new links have been determined from the links of all request packets, vulnerability scanning and detection can be performed collectively on the web pages corresponding to these new links. "Collectively" here can generally mean periodically. Because user requests are generated continuously, the method continuously obtains and analyses request packets, and scanning cannot wait until users stop sending requests. This step is therefore related to the other steps only logically, and that logical relationship should not be taken to exclude interleaving in time. For example, new links can be determined while previously determined new links are being scanned. One process can continuously receive request packets, determine new links and store them in the queue to be scanned, while another process continuously scans the new links in the queue. However flexibly the other steps are implemented, this step needs to attend only to the new links in the queue to be scanned; likewise, however flexibly this step is implemented, the interface ultimately provided by the preceding steps is a queue to be scanned that stores new links. The queue to be scanned is thus the interface between this step and the steps before it, as those skilled in the art will appreciate.
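The two cooperating processes, with the queue to be scanned as their only interface, can be sketched with threads and a standard `queue.Queue`. This is an illustration under assumptions: the text speaks of processes, whereas threads are used here for brevity, and `run_pipeline`, `is_new_link` and `scan` are hypothetical names for the link-screening and scanning logic.

```python
import queue
import threading
from typing import Callable, Iterable, List, Tuple


def run_pipeline(
    incoming_links: Iterable[str],
    is_new_link: Callable[[str], bool],
    scan: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Determine new links and scan them concurrently via a shared queue."""
    to_scan: "queue.Queue[object]" = queue.Queue()
    results: List[Tuple[str, str]] = []
    done = object()  # sentinel telling the scanner that no more links follow

    def producer() -> None:
        seen = set()
        for link in incoming_links:
            if link not in seen and is_new_link(link):
                seen.add(link)
                to_scan.put(link)
        to_scan.put(done)

    def consumer() -> None:
        while True:
            item = to_scan.get()
            if item is done:
                break
            results.append((item, scan(item)))

    workers = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

Because the scanner only ever reads from the queue, the screening logic can change freely without affecting it, which mirrors the role the description gives to the queue to be scanned.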
The correspondence between a new link and its web page, as referred to in the present invention, can mean either the direct relationship in which the new link maps, through the domain-name-to-IP-address relationship, to the corresponding web page on the website server, or the indirect one-to-one relationship formed after the corresponding web page has been downloaded and stored in a local page library. To suit these two kinds of correspondence, either of the following two ways can be used to perform vulnerability scanning and detection on the web pages pointed to by the new links determined by the present invention.
Mode one, obtain the new links recorded in the queue to be scanned, then use the online web page that each new link maps to directly, by sending a request to its website server and performing vulnerability scanning and detection on the web page the server returns. This way increases the load and processing time of the server hosting the new link, but can somewhat reduce the computation required of the software implementing this method.
Mode two, first use the new links recorded in the queue to be scanned to download the web pages they map to directly (the download method can be the same as in mode one), add these web pages to a local page library, and then perform vulnerability scanning and detection on each web page in the local page library. Alternatively, as described earlier, two processes can be started: one continuously downloads the online web pages mapped by the new links into the local page library, and the other continuously performs vulnerability scanning and detection on the newly downloaded web pages in the library.
Whichever of the ways above is used to scan the new links in the queue to be scanned, the vulnerability scanning and detection results that the present invention is intended to achieve are clearly not affected.
Vulnerability scanning and detection are carried out using website security detection data together with website security detection rules. The website security detection data include at least one of the following: Trojan-implant data, fraud data, search-blocking data, side-injection data, tampering data, and vulnerability data. Security detection is performed on the website according to the website security detection data and the detection rules corresponding to that data, where the website security detection rules include at least one of the following: Trojan-implant rules, fraud rules, blocking rules, side-injection rules, tampering rules, and vulnerability rules. The present invention mainly uses the vulnerability rules to scan web pages. The vulnerability rules are used to determine, from the vulnerability data, the vulnerabilities present on the website.
Performing security detection on a website according to the vulnerability data and the vulnerability rules includes: obtaining the vulnerability features from a pre-stored vulnerability feature database and judging whether the vulnerability data match a vulnerability feature. If they match, a vulnerability is determined; if not, no vulnerability is determined. The vulnerabilities present on the website are determined from the result of this judgment. A vulnerability feature can be a vulnerability keyword. For example, the web page status code 404 can be used as a vulnerability keyword; or the content of a 404 page can be used as a vulnerability keyword; or a normal page of the website is accessed and its page content, status code and HTTP headers are extracted, a non-existent page of the website is accessed and the page content, status code and HTTP headers of the returned page are extracted, and the two sets are compared to obtain 404 keywords as vulnerability keywords; or a non-existent page is accessed and the page content, status code and HTTP headers of the returned page are used as vulnerability keywords, and so on. The present invention places no restriction on this.
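The comparison of a normal page with a non-existent page to obtain 404 keywords, followed by feature matching, can be sketched as below. This is a simplified illustration under assumptions: the `Response` type, the word-level difference used to pick keywords and the two function names are introduced here, and HTTP headers are left out for brevity even though the text also mentions them.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Response:
    """Hypothetical, simplified view of a fetched page."""
    status: int
    body: str


def derive_404_features(normal: Response, missing: Response) -> Dict[str, object]:
    """Compare a normal page with a non-existent one to obtain keywords."""
    keywords = sorted(set(missing.body.split()) - set(normal.body.split()))
    status = missing.status if missing.status != normal.status else None
    return {"status": status, "keywords": keywords}


def matches_features(response: Response, features: Dict[str, object]) -> bool:
    """Judge whether a response matches any of the stored features."""
    if features["status"] is not None and response.status == features["status"]:
        return True
    return any(word in response.body for word in features["keywords"])
```

Matching on body keywords as well as on the status code lets the check also catch servers that answer a missing page with status 200.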
Through the steps above, the method of the present invention completes the task of performing security detection on a website, and the results of the vulnerability scan are stored in a corresponding file or database for later use. Further, to achieve better human-computer interaction, the present invention can optionally also perform the following step, with reference to the embodiment disclosed in Fig. 5:
Step S14, display a graphical user interface to output the result information of the vulnerability scanning and detection.
Since this method is suited to implementation as a program, the program can provide a graphical user interface. After the preceding steps have been executed and the vulnerability scanning and detection completed, the detection results are analysed and aggregated, and the processed result information is output to this graphical user interface so that the network administrator can see it at a glance and repair the web page vulnerabilities more easily.
Having disclosed several implementations of the above method in detail, the following, applying a modular approach, further discloses embodiments of the corresponding device realized with the method of the present invention, so that those skilled in the art may understand the present invention more thoroughly. It should be noted that the concepts and principles adopted by the method apply equally to the corresponding device, so part of the following description is simplified.
Refer to Fig. 6. The website security detection device of the present invention is configured in a computer device serving as the security detection device and comprises a packet capture unit 11, a new-link checking unit 12 and a detection unit 13; in the embodiment shown in Fig. 7 it optionally further comprises a display unit 14.
The packet capture unit 11 is configured to receive, through a remote port, collected data containing Hypertext Transfer Protocol request packets.
As mentioned above, the present invention can collect the request packets through a distributed structure similar to a client/server (C/S) architecture. Specifically, the remote port is the communication port agreed between the software implementing the website security detection method of the present invention (installed on a service host, that is, the detection device of the present invention) and its client; those skilled in the art can readily implement this communication using the well-known TCP/IP protocols. After the client has collected Hypertext Transfer Protocol request packets, it needs to send them to this software over the public network (or, equally, over a local area network). By the basic principles of computer communication, data exchanged between the client and the detection device are expressed in units of messages; the collected data containing the request packets gathered by the client are therefore also transmitted as messages. Each message contains its source IP address and the destination IP address to which it is to be sent: the source IP address of the message is the IP address of the computer device on which the client resides, and its destination IP address is the IP address of the detection device. A request packet likewise contains the source IP address that initiated the request and the destination IP address to which it is sent: the source IP address of the request packet is the IP address of the computer that initiated the request, and its destination IP address is the IP address of the server hosting the website to be accessed. Thus, simply by comparing the source IP address of the message carrying the collected data with the source and destination IP addresses in the request packets contained in that data, the detection device can determine whether the collected data come from the requesting client that initiated the request to the known specific website server, or from the server that provides the known specific website to that requester. The source of the request packets can thereby be identified, so that the links in those packets can be put to further, purposeful use. Identifying the source of a request packet, as referred to here, neither requires nor excludes that the present invention, when programmed, extracts and compares at run time the source IP address of the collected data and the source and destination IP addresses of the request packet; the explanation given here merely points out the logical relationship between the source IP address of the collected data and the source or destination IP address of the request packet. Those skilled in the art will fully understand this and may choose among these technical means flexibly.
The collected data received by the detection device may contain a single request packet or multiple request packets encapsulated using the communication protocol. This can be set flexibly by those skilled in the art and is especially suited to being set by time interval, so the number of request packets contained in each transmission of collected data need not be the same. For example, the client may be set to use 10 minutes as one time unit: it collects request packets continuously and, at the end of each unit, packs them into collected data and transmits them to the detection device. Whether more or fewer requests are initiated within the period does not affect the implementation of the present invention. The Hypertext Transfer Protocol (HTTP) request packets, used for website access, take two forms, namely GET and POST requests. Although the two differ, both are objects processed by the present invention. Typically, an HTTP request packet mainly comprises: protocol, server domain name, port number, request path, GET parameter names, POST parameter names, extension, destination server network segment, and so on. Both GET and POST request packets contain the URL of a webpage. The URL of a webpage, that is, a hyperlink, follows an agreed format from its domain name down to its page: the end of the link describes the resource it points to, and the part before that is its path. Take the address http://www.360.cn/test/admin.php as an example: http:// denotes the protocol, www.360.cn is the domain name, test is a directory on the website, and admin.php is the resource page pointed to; relative to the page admin.php, http://www.360.cn/test/ is the path of the link. And http://www.360.cn/test/admin/admin.php is evidently a deeper-level link than http://www.360.cn/test/admin.php.
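As an illustration of the link structure just described, the following Python sketch splits a link into protocol, domain name, path and resource using the standard `urllib.parse` module. The function names `split_link` and `is_deeper` and the dictionary keys are hypothetical and chosen only for this example.

```python
import posixpath
from urllib.parse import urlsplit


def split_link(url: str) -> dict:
    """Split a link into the parts named in the text."""
    parts = urlsplit(url)
    directory, resource = posixpath.split(parts.path)
    # The "path" of the link is everything before the resource it points to.
    path = f"{parts.scheme}://{parts.netloc}{directory.rstrip('/')}/"
    return {
        "protocol": parts.scheme,
        "domain": parts.hostname,
        "port": parts.port,
        "path": path,
        "resource": resource,
    }


def is_deeper(link: str, other: str) -> bool:
    """True when `link` lies in a directory nested below that of `other`."""
    a, b = split_link(link)["path"], split_link(other)["path"]
    return a != b and a.startswith(b)
```

Applied to the example address, `split_link("http://www.360.cn/test/admin.php")` yields the protocol `http`, the domain `www.360.cn`, the path `http://www.360.cn/test/` and the resource `admin.php`.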
To suit different implementations, the collected data containing HTTP request packets gathered by a client may be obtained in any one of the following ways, or in a combination of them:
One: a client-program acquisition module is installed on the server providing the website access service, to collect the request packets that initiate web access requests to that server.
Following the above analysis, an acquisition module can be developed. The acquisition module is a client program process running on a server; the client program is installed on the server providing the website access service, in particular on a server hosting a known specific website as referred to in the present invention. Once this client program is running, the request packets of all access requests that target this server are collected by it. At regular intervals (or, of course, in real time), the client program assembles these request packets into collected data, in the format agreed with the software implementing the present invention on the detection device, and sends them to the remote port. After receiving the collected data through the remote port, the present invention parses them to obtain the corresponding HTTP request packets. In this case the destination IP address of the externally initiated request packets is the IP address of this server, and the source IP address of the message carrying the collected data is also the IP address of this server; using this correspondence, the detection device can recognize that the collected data it received originate from the server responsible for responding to the request packets they contain.
Two: a browser plug-in is installed in the browser of a device that initiates requests to the server providing the known specific website access service.
Similarly, a browser plug-in can be developed and installed on a computer terminal used for online debugging of the webpages of the aforementioned known specific website. Whenever the browser runs and a link is used to access a webpage, the plug-in obtains the request packet produced by that access and, in the same manner as above, assembles the request packets into collected data and sends them to the detection device through the remote port. After obtaining the collected data from the browser plug-in, the detection device parses them to obtain the corresponding HTTP request packets. In this case the source IP address of the request packets initiated by the browser is the IP address of the computer on which the browser runs, and the source IP address of the message carrying the collected data is also the IP address of that computer; using this correspondence, the detection device can recognize that the collected data it received originate from the client that initiated the requests.
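The two IP address correspondences described for the server-side module and the browser plug-in can be summarized in a short Python sketch. This is only an illustration of the logical relationship; the function name and the returned labels are assumptions, and the text itself notes that an implementation is not required to perform this comparison explicitly.

```python
from typing import Optional


def identify_origin(message_src_ip: str,
                    request_src_ip: str,
                    request_dst_ip: str) -> Optional[str]:
    """Classify who sent the collected data (labels are illustrative)."""
    if message_src_ip == request_dst_ip:
        # The sender is the machine the request was addressed to:
        # the acquisition module installed on the website server.
        return "server"
    if message_src_ip == request_src_ip:
        # The sender is the machine that issued the request:
        # the browser plug-in on the requesting client.
        return "client"
    return None  # Neither relation holds; the origin stays undetermined.
```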
Those skilled in the art will appreciate that the acquisition module and the browser plug-in both essentially implement the function of obtaining request packets; both are computer programs and differ only in form and in application details. How to obtain request packets programmatically is known in the prior art; for brevity the present invention does not describe it in detail, and those skilled in the art can obtain the relevant knowledge from the prior art and put it into practice. It will therefore also be understood that the acquisition module may equally be implemented on the client computer that initiates requests. The two ways of obtaining request packets described above are proposed to meet different application needs. Whichever is adopted, the HTTP request packets in the collected data can be extracted by means of the prior art for further processing.
The new-link checking unit 12 is adapted to use the links contained in the request packets to determine the associated new links belonging to a known specific website.
The websites addressed by the present invention are specific ones, generally one or more known websites of the enterprise that applies the device of the present invention. These websites share some common features: their links all resolve to relatively specific IP address segments; the owner of their domain names is the enterprise or a client of the enterprise; or they are target websites that the enterprise takes part in managing. More specifically, this particular relationship refers to the websites that the software implementing the device needs to pay attention to. Whether a website is one the software needs to pay attention to is, at the technical level, judged by the device of the present invention: an interface may be provided for manual setting, or a comprehensive judgment may be made on the basis of links and/or IP addresses and/or domain name registration feature information. The basis for identifying a known specific website of the present invention should therefore not be understood as a particular domain name or its IP address alone; it should also cover detection objects that have not been set explicitly by hand but that the enterprise in fact intends to include, among them any link containing a newly added domain name that resolves to an IP address actually occupied by some of the known specific websites.
It follows that, in contrast to crawler technology, although the present invention does not need carefully chosen seed URLs, a setup unit 120 (see Fig. 8) may be needed to provide basic settings for some specific websites, so as to define the known specific websites of the present invention. In line with the preceding explanation, the ways of setting these known specific websites are also varied. In providing known specific websites, whether what is provided is an IP address or a URL such as a domain name, what is provided is in essence a link to a website; this process is therefore in essence also the process of determining new links of the present invention. Several specific embodiments of the setup unit 120 for determining known specific websites and/or their new links are further disclosed below:
One: the setup unit 120 may be configured to use a graphical user interface to set known specific websites and/or their associated new links.
Specifically, when the software implementing the present invention runs for the first time, the setup unit 120 provides a graphical user interface through which the user sets some of the known specific websites. The user completes the setting by entering content relating to these known specific websites into the graphical user interface, thereby specifying one or more known specific websites in advance. The content specified in advance may be one or more domain names, such as so.com or 360.cn, or IP addresses corresponding to servers, or a contiguous IP address segment or discrete IP address segment intervals made up of IP addresses. As stated above, each such setting can in essence be understood as an associated new link and can be stored in a known specific website list so that other functional modules of the device can call on it. It should be pointed out that this known specific website list is in fact equivalent to a link library; it can therefore be treated as a link library for subsequent use, or be regarded as a data source for the link library. The link library referred to here, as in crawler technology, may be used directly as the subsequent queue to be scanned, or may merely supply base data for that queue. On this basis, the domain names or IP addresses and related information used to determine some of the known specific websites constitute new links of the present invention, or can at least be used to construct them, and become the objects processed in the first scan carried out by the software of the present invention. New links may continue to be added in this way during later maintenance; when the domain name of such a new link differs from the domain names of the other known specific websites, a new known specific website has in effect been added by extending the set of domain names.
Two: the setup unit 120 may be configured to use domain name registration information to determine the associated new links of known specific websites.
The associated new links of known specific websites comprise all links under websites that are already registered (identifiable by the registered domain name they contain) and/or all links of websites whose domain names are not yet registered. The latter refers to the case where a link obtained from a request packet contains a new domain name and does not fall within the link range of the currently existing known specific websites, so that it cannot be determined directly whether the link belongs to one of the enterprise's own websites and whether it should be regarded as an associated new link belonging to a known specific website; technical means are then needed to decide this. To that end, the interface provided by a domain name registration website can be called to query the new domain name in the link and determine its registration feature information, such as the owner of the domain name and the domain name filing number, and to check whether this registration feature information is identical to that of the domain names of the currently existing known specific websites. When the two are identical, the new link is regarded as an associated new link of a known specific website and is used in the device; otherwise the request packet is discarded and ignored. The new domain name and/or the new links beneath it can then be added directly to the aforementioned known specific website list for later use. Clearly, the query of the registration feature information of a new domain name may be performed manually or implemented in software. In the former case it is in effect follow-up maintenance of the first way described above. In the latter case it allows the present invention to extend and maintain the known specific website list dynamically. If the known specific website list is the link library or the queue to be scanned, what is being maintained is in essence a new-link list, which can naturally serve as the data basis for the processing steps required later in the present invention.
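The registration-based judgment described above might be sketched in Python as follows. The `Registration` record and the `lookup` callable are hypothetical stand-ins for whatever interface a domain name registration website actually provides; no real registry API is assumed.

```python
from typing import Callable, Iterable, NamedTuple, Optional


class Registration(NamedTuple):
    """Illustrative registration feature information."""
    owner: str
    record_number: str


def is_associated_domain(
    new_domain: str,
    known_domains: Iterable[str],
    lookup: Callable[[str], Optional[Registration]],
) -> bool:
    """True when the new domain shares registration features with a known one."""
    new_info = lookup(new_domain)
    if new_info is None:
        return False  # Nothing to compare; the request packet would be discarded.
    return any(lookup(domain) == new_info for domain in known_domains)
```

Passing the lookup in as a parameter keeps the comparison logic independent of how the registration information is obtained, whether by a manual query or by a program.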
Three: the setup unit 120 is configured to use IP addresses to dynamically determine the associated new links of known specific websites.
As is well known, there is a mapping between domain names and IP addresses, so the corresponding IP address can be determined from a known domain name. The same website may be served by servers at multiple IP addresses, so one-to-many and many-to-many mappings may exist between websites and IP addresses. In practice, an enterprise usually sets up the servers for its websites on an IP address segment made up of contiguous IP addresses. In view of this, the IP address segments occupied by the currently existing known specific websites can be determined. When the link in a request packet contains a new domain name that does not belong to any currently existing known specific website, the IP address to which the new domain name points can be compared to see whether it is one of the IP addresses occupied by the existing known specific websites; if so, the link in the request packet is likewise regarded as an associated new link of a new known specific website and added to the aforementioned known specific website list. Likewise, if the known specific website list is the link library or the queue to be scanned, this processing in essence maintains a new-link list, which can naturally serve as the data basis for the processing steps required later in the present invention.
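A minimal Python sketch of the IP address comparison, using the standard `ipaddress` module, is given below. It assumes that the new domain name has already been resolved to an IP address by the caller and that the occupied addresses are supplied as single addresses or CIDR segments; the function name and this representation are illustrative choices, not taken from the source.

```python
import ipaddress
from typing import Iterable


def belongs_to_known_ips(candidate_ip: str, known_segments: Iterable[str]) -> bool:
    """True when the resolved address falls inside any occupied address or segment."""
    address = ipaddress.ip_address(candidate_ip)
    return any(
        # strict=False also accepts a plain host address inside a segment.
        address in ipaddress.ip_network(segment, strict=False)
        for segment in known_segments
    )
```

For instance, with the occupied segment `203.0.113.0/28`, the address `203.0.113.7` would be accepted and `203.0.113.77` rejected.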
It follows that one key difference between the present invention and crawler technology is that the present invention has definite known specific websites, and these may either be specified manually at initialization or be identified and added dynamically by the software equipped with the device, without depending strictly on seed URLs as crawler technology does. These known specific websites are in essence a series of links: a list may be used to maintain them separately, the list may be used as the link library, or the list may even be used directly as the queue to be scanned. How exactly the list is used is simply a matter of applying database techniques flexibly within the device and will be apparent to those skilled in the art. For example, in one approach the known specific website list is in essence the queue to be scanned of the present invention: new links are appended to the list in order together with a flag marking them as not scanned, and after scanning the flag is changed to mark them as scanned. In another approach the list is independent and is mainly used to record each domain name and its corresponding IP addresses, with the queue to be scanned set up separately: when an associated new link is identified, its domain name is added to the list and the new link itself is added to the queue to be scanned; thereafter any link containing that domain name need not be resolved again and is added directly to the queue to be scanned. In yet another approach the known specific website list, the link library and the queue to be scanned are all independent of one another: the known specific website list stores only the domain names relating to known specific websites, the link library stores all identified links relating to known specific websites, and the queue to be scanned stores only the new links obtained from the link library. This approach keeps each type of data independent and can serve more complex purposes.
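The first approach above, in which the list itself serves as the queue to be scanned and each entry carries a scanned flag, could look roughly like the following Python sketch. The class name `ScanQueue` and its methods are hypothetical; a real implementation would more likely sit on top of a database table.

```python
class ScanQueue:
    """Ordered list of links, each flagged as scanned or not scanned."""

    def __init__(self) -> None:
        self._scanned = {}  # link -> True once scanned; keeps insertion order

    def add(self, link: str) -> bool:
        """Append a link flagged as not scanned; return False if already known."""
        if link in self._scanned:
            return False
        self._scanned[link] = False
        return True

    def pending(self) -> list:
        """Links still waiting to be scanned, in the order they were added."""
        return [link for link, done in self._scanned.items() if not done]

    def mark_scanned(self, link: str) -> None:
        """Change the flag of a link to 'scanned'."""
        self._scanned[link] = True
```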
As stated above, the three embodiments of the setup unit 120 can all be used not only to determine the known specific websites of the present invention but, in essence, also to determine the associated new links belonging to known specific websites. To simplify the explanation that follows, it should be noted that the description below follows the first approach above and treats the known specific website list as identical to the queue to be scanned disclosed hereinafter. This simplification should nevertheless be sufficient for those skilled in the art to extend it to application scenarios in which a link library is used to store effective links.
With the foregoing disclosure and an understanding of the concept of known specific websites, those skilled in the art should be able to implement the new-link checking unit 12. Further, the several setup units 120 given above, for determining known specific websites and the associated new links belonging to them, will help those skilled in the art to understand more detailed embodiments of the new-link checking unit 12. The two levels above in fact give variants of the new-link checking unit 12 at two different levels; the technical means of using the links contained in the request packets to determine the associated new links belonging to known specific websites has therefore been fully disclosed.
To further show the advantages of the invention, the internal structure of the new-link checking unit 12 in another embodiment is disclosed below. Refer to Fig. 8: the new-link checking unit 12 further comprises an extraction module 121, a deduplication module 122 and an adding module 123:
The extraction module 121 is configured to extract the links from all request packets obtained.
In the software implementing the device, after all obtained request packets have been gathered, the extraction module 121 extracts the links from them. Since an HTTP request packet contains the URL of a webpage, the corresponding link, that is, the URL of the webpage, can be recovered from the HTTP request packet. Some known technical analysis may be carried out on these links in advance, for example analyzing whether they are effective links.
An effective link is a link that can normally open a webpage or download a file. An invalid link is one whose page is invalid and cannot provide the user with any valuable information. When a link has no domain name, has an incomplete domain name, is itself incomplete, or belongs to a POST packet with no content, and the like, the link is judged to be invalid. Taking a link under the domain name abcd.com as an example: if the domain name abcd.com does not appear in the link, or only part of it appears, such as ad.com, the link is invalid.
The links obtained from the request packets are analyzed to judge whether each is an effective link. If a link has no domain name, has an incomplete domain name, is itself incomplete, or belongs to a POST packet with no content, and the like, it is judged to be an invalid link and takes no part in subsequent processing; otherwise it is an effective link and is processed further.
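The validity checks listed above can be sketched in Python as follows. The function signature, the `expected_domain` parameter and the subdomain rule are assumptions introduced for illustration; the source only names the symptoms of an invalid link and does not prescribe how they are tested.

```python
from urllib.parse import urlsplit


def is_effective_link(url: str, expected_domain: str,
                      method: str = "GET", body: bytes = b"") -> bool:
    """Apply the invalid-link checks named in the text (simplified)."""
    parts = urlsplit(url)
    host = parts.hostname
    if not parts.scheme or not host:
        return False  # No domain name, or the link itself is incomplete.
    if host != expected_domain and not host.endswith("." + expected_domain):
        return False  # Only part of the domain name appears, e.g. ad.com.
    if method.upper() == "POST" and not body:
        return False  # POST packet without content.
    return True
```

With `expected_domain="abcd.com"`, a link on `www.abcd.com` passes, while a link on `ad.com` or a bare path such as `/index.php` is rejected.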
The deduplication module 122 is configured to remove, from the extracted links, repeated links that point to webpages having the same code.
Each extracted link, meaning mainly the effective links among them, in essence points to a webpage of the corresponding known specific website, but a large number of repeated links may exist among these effective links. Repeated links are links whose target webpages have the same code; the original webpage is merely supplied with different database access variables, so that the pages differ in displayed content, while the vulnerabilities of these pages are the same.
For example, two effective links share the same beginning and end in /a.php?=1 and /a.php?=2 respectively. These two links differ only in the data fetched from the database, where "1" and "2" can be regarded as variables, so the links in fact differ only in the variable. In this case either link can reach the webpage pointed to by the other, so only one of them needs to be kept. Further, the trailing variable can be removed so that the link ends directly in /a.php, and all equivalent links carrying variables can be deleted, with the same effect. Repeated links of this kind are common in forums.
As another example, news websites commonly have links ending in /data/2011201 and /data/2011202, where 2011201 and 2011202 can likewise be regarded as variables. Apart from these two variables the rest of the two links is identical, so they too are in essence two repeated links pointing to webpages with the same code.
To improve the operating efficiency of the present invention, those skilled in the art should remove the repeated links from the extracted links by means that include known techniques. The deduplication module 122 of the present invention further comprises a duplicate-checking submodule and a removal submodule; the former determines the repeated links and the latter carries out the removal. To better assist those skilled in the art in implementing the present invention, two alternative structures of the deduplication module 122 for removing repeated links are listed below for reference:
Structure one: the duplicate-checking submodule first sorts the links and takes adjacent links for comparison. When links are found to differ only in their variables and to be otherwise identical, they are determined to be multiple links that differ only in the variables produced by database access, and hence to be repeated links. In this case the removal submodule keeps only one of the repeated links and deletes the rest, thereby removing the repeated links.
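One way to approximate structure one is to reduce each link to a template with its variables stripped and keep one link per template, as in the Python sketch below. Treating the query string and purely numeric path segments as the "variables" is an assumption made for this example; the source does not define how variables are recognized. The sketch also groups by template rather than comparing adjacent links, which the text permits as an alternative.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def link_template(url: str) -> str:
    """Reduce a link to its variable-free form (simplified heuristic)."""
    parts = urlsplit(url)
    # Assumption: purely numeric path segments are database-driven variables.
    path = re.sub(r"/\d+(?=/|$)", "/{n}", parts.path)
    # Assumption: the whole query string carries only variables.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def remove_repeated(links: list) -> list:
    """Keep the first link (in sorted order) for every template."""
    kept = {}
    for link in sorted(links):
        kept.setdefault(link_template(link), link)
    return list(kept.values())
```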
Structure two: the duplicate-checking submodule first sorts the links and compares the signatures of the webpages pointed to by adjacent links. When the signatures are found to be identical, the links are determined to be repeated links; the removal submodule then keeps only one of them and deletes the others, thereby removing the repeated links.
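Structure two can be sketched in Python along the following lines. The source does not say how a webpage signature is computed; the sketch assumes, purely for illustration, a SHA-256 digest over whatever representation of the page code the caller supplies through a hypothetical `fetch_code` callable.

```python
import hashlib
from typing import Callable


def page_signature(page_code: str) -> str:
    """Assumed signature: SHA-256 digest of the page code."""
    return hashlib.sha256(page_code.encode("utf-8")).hexdigest()


def remove_by_signature(links: list, fetch_code: Callable[[str], str]) -> list:
    """Keep one link per distinct page signature."""
    seen = set()
    kept = []
    for link in sorted(links):
        signature = page_signature(fetch_code(link))
        if signature not in seen:
            seen.add(signature)
            kept.append(link)
    return kept
```

Because pages driven by different database variables render different content, the signature would in practice need to be taken over something stable, such as the page template, for this comparison to group them.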
The sorting in the above two structures, and the taking of adjacent links, are not essential; those skilled in the art may substitute any known algorithm that helps to speed up the comparison, which is not elaborated here. It can be seen that, after deduplication, each remaining link points to a distinct webpage, which clearly helps to improve the execution efficiency of the other functional modules of the device.
The adding module 123 is configured to determine the associated new links among the links that have undergone the preceding processing in the new-link checking unit 12 and to add these new links to the queue to be scanned.
As stated above, determining a new link is in fact also determining whether the link is associated with a currently existing known specific website. Determining the associated new links belonging to known specific websites therefore covers not only the domain names, IP addresses or more specific links already recorded in the known specific website list (the queue to be scanned), but also links whose domain names do not appear in the list while the IP addresses they map to are recorded in the list or fall within an IP address segment, or IP address segment interval, formed by the IP addresses recorded in the list. Determining associated new links is therefore the process by which the adding module 123 flexibly applies (calls) the several setup unit 120 examples disclosed above. Clearly, the three structure examples of the setup unit 120 can be used flexibly: only one may be selected, or any several at once. In the first, manual registration, a website domain name is registered, after which all specific links under that domain name that have not been scanned (identifiable, as stated above, by a status flag in the link library or in the queue to be scanned) are regarded as new links of that website. In the second, registration using domain name registration feature information, the same effect as the first is achieved whether the query is made manually or by program, but the programmatic implementation is a key option available to the adding module 123, since it improves the intelligence and degree of automation of the program. In the third, whether the link in a request packet is regarded as an associated new link belonging to a known specific website is determined by comparing whether the IP address to which that link points falls within the IP addresses pointed to by the links in the current known specific website list, or within the contiguous IP address segments formed by them. This approach extends the known specific website list automatically: if the known specific website list is a separate list, the domain name of the new link can be added to the list and the new link added to the link library (if any) and to the queue to be scanned; if the known specific website list also serves as the queue to be scanned, adding the new link directly to the known specific website list is itself the process of adding it to the queue to be scanned.
After the effective links of the present invention have been screened for new links by the several setup unit 120 examples disclosed above, all the new links are obtained (if necessary, crawler technology may be applied on the basis of these new links, treating them as seed URLs to discover further new links). To facilitate execution by the other functional modules of the present invention, these new links are added to the aforementioned queue to be scanned. Whether the queue to be scanned shares a table with the known specific website list, further shares a table with the link library, or is a separate table, and so on, those skilled in the art can, as stated above, use ordinary knowledge to register all determined new links in the queue to be scanned, and subsequently carry out vulnerability scanning only on those new links.
The detection unit 13 is configured to carry out vulnerability scanning on the webpages corresponding to the new links.
After all new links have been determined from the links of all request packets through the flexible variants described above, the detection unit 13 can carry out vulnerability scanning collectively on the webpages corresponding to these new links. "Collectively" here may generally mean periodically in time. Because user requests arise continuously, the device can keep obtaining and analyzing request packets, but it cannot wait until users stop sending requests before it starts scanning. The detection unit 13 therefore has only a connection relationship with the other functional modules, and this connection should not be taken to exclude their interleaving in time. For example, new links may be determined while previously determined new links are being scanned. One process may continuously receive request packets and determine new links, storing them in the queue to be scanned, while another process continuously scans the new links in that queue. However flexibly the other functional modules are implemented, the detection unit 13 need only attend to the new links in the queue to be scanned; likewise, however flexibly the detection unit 13 is implemented, the interface finally presented by the preceding functional modules is a queue to be scanned that stores new links. The queue to be scanned is thus the interface between the detection unit 13 and the preceding functional modules, a principle those skilled in the art will recognize.
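The interleaving described above, with one worker determining new links while another scans them and the queue to be scanned as the only interface between the two, can be sketched in Python with threads standing in for the two processes. The function `run_pipeline`, the `scan` callable and the use of a sentinel to end the run are assumptions made so that the example terminates; a deployed system would run continuously.

```python
import queue
import threading
from typing import Callable, Iterable


def run_pipeline(incoming_links: Iterable[str], scan: Callable[[str], object]) -> list:
    """Determine new links and scan them concurrently via a shared queue."""
    to_scan = queue.Queue()   # the queue to be scanned
    done = object()           # sentinel marking the end of this illustrative run
    results = []

    def determine_new_links() -> None:
        seen = set()
        for link in incoming_links:
            if link not in seen:  # only links not met before count as new
                seen.add(link)
                to_scan.put(link)
        to_scan.put(done)

    def scan_new_links() -> None:
        while True:
            link = to_scan.get()
            if link is done:
                break
            results.append((link, scan(link)))

    workers = [threading.Thread(target=determine_new_links),
               threading.Thread(target=scan_new_links)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

Because the scanner only ever reads from the queue, either side can be reimplemented without affecting the other, which is the point the paragraph above makes about the queue acting as the interface.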
The correspondence in "the webpage corresponding to a new URL" as used in the present invention may refer either to the direct mapping from the new URL, via the domain-name-to-IP-address relationship, to the corresponding webpage on the website server, or to the indirect one-to-one correspondence with a copy of that webpage that has been downloaded and stored in a local page library. To suit these two kinds of correspondence, two structural examples can be provided for detecting unit 13 of the device; either of the following structures can perform vulnerability scanning on the webpages pointed to by the new URLs determined by the present invention.
Structure example one: an acquiring unit obtains the new URLs recorded in the queue to be scanned; a request is then sent to the website server for the online webpage that each new URL maps to directly, and an implementation unit performs vulnerability scanning on the webpage returned by the server. This approach increases the load on, and the processing time of, the server hosting the new URL, but can somewhat reduce the computation required of the software that implements the device.
Structure example two: after an acquiring unit obtains the new URLs from the queue to be scanned, a download unit uses the new URLs to download the webpages they map to directly (the download method may be the same as in structure example one) and adds these webpages to a local page library; an implementation unit then performs vulnerability scanning on each webpage in the local page library. Alternatively, as described earlier, two processes can be started: one continuously downloads the online webpages mapped by each new URL into the local page library, and the other continuously performs vulnerability scanning on the newly downloaded webpages in that library.
Whichever way the new URLs in the queue to be scanned are used for vulnerability scanning, the vulnerability-scanning effect that the present invention is intended to achieve is plainly unaffected.
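The two structure examples can be contrasted in a short Python sketch. The functions `fetch` (returns a page body for a URL) and `scan` (returns True when a page looks vulnerable) are hypothetical stand-ins supplied by the caller, and a dictionary stands in for the local page library; none of these names comes from the disclosure.

```python
def detect_online(urls, fetch, scan):
    """Structure example one: request each online page and scan the
    returned page immediately."""
    return [(url, scan(fetch(url))) for url in urls]


def detect_via_local_library(urls, fetch, scan):
    """Structure example two: first download every page into a local page
    library, then scan the stored copies."""
    library = {url: fetch(url) for url in urls}                    # download step
    return [(url, scan(body)) for url, body in library.items()]   # scan step
```

With the same inputs both functions are expected to report the same findings; they differ only in when the website server is contacted and where the page content is held while it is scanned.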
When vulnerability scanning is actually carried out, it is implemented using website security detection data together with website security detection rules. The website security detection data comprise at least one of the following: trojan-implantation data, fraud data, search-blocking data, side-note (neighbouring-site) data, tampering data and vulnerability data. The website is checked according to the website security detection rule that corresponds to the website security detection data, where the website security detection rules comprise at least one of the following: a trojan-implantation rule, a fraud rule, a blocking rule, a side-note rule, a tampering rule and a vulnerability rule. The present invention mainly uses the vulnerability rule to scan webpages. The vulnerability rule is used to determine, from the vulnerability data, the vulnerabilities that exist on the website.
Performing security detection on the website according to the vulnerability data and the vulnerability rule comprises: obtaining the vulnerability features in a pre-stored vulnerability feature database and judging whether the vulnerability data match a vulnerability feature; if the vulnerability data match a vulnerability feature, a vulnerability is determined; if they do not, no vulnerability is determined. The vulnerabilities present on the website are determined from the judgment result. A vulnerability feature may be a vulnerability keyword. For example, the webpage status code 404 is used as the vulnerability keyword; or the content of the 404 page is used as the vulnerability keyword; or a normal webpage of the website is accessed and its page content, status code and HTTP headers are extracted, a nonexistent webpage of the website is accessed and the page content, status code and HTTP headers of the returned page are extracted, and the two sets are compared to obtain 404 keywords that serve as the vulnerability keyword; or a nonexistent webpage is accessed and the page content, status code and HTTP headers of the returned page are used as the vulnerability keyword; and so on. The present invention places no restriction on this.
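One possible reading of the comparison-based keyword extraction is sketched below in Python. The `Response` tuple, the trait encoding (`status:…`, `header:…`, `body:…`) and the function names are assumptions made for illustration; the disclosure does not specify how the comparison is performed.

```python
from typing import Dict, NamedTuple, Set


class Response(NamedTuple):
    status: int
    headers: Dict[str, str]
    body: str


def traits(resp: Response) -> Set[str]:
    """Flatten status code, HTTP headers and body words into comparable traits."""
    found = {f"status:{resp.status}"}
    found.update(f"header:{name}={value}" for name, value in resp.headers.items())
    found.update(f"body:{word}" for word in resp.body.lower().split())
    return found


def derive_keywords(normal: Response, missing: Response) -> Set[str]:
    """Keywords are the traits seen for the nonexistent page but not for the normal page."""
    return traits(missing) - traits(normal)


def matches(resp: Response, keywords: Set[str]) -> bool:
    """Judge whether a response exhibits any of the stored keywords."""
    return bool(traits(resp) & keywords)
```

The keyword set would be stored in the feature database once per website and then reused for every page of that site; matching on any single trait is a deliberately loose choice in this sketch.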
Through the above steps, the device of the present invention completes the task of performing security detection on the website; the results of the vulnerability scan are stored in a corresponding file or database for later use. Further, to obtain better human-computer interaction, the present invention may optionally also comprise a display unit 14:
The display unit 14 is configured to display a graphical user interface to output the result information of the vulnerability scanning and detection.
The display unit 14 is configured to provide a graphical user interface. After detecting unit 13 completes the vulnerability scan, the detection results are analysed and tallied, and the processed result information is output to this graphical user interface, so that the network administrator can see it at a glance and can conveniently repair webpage vulnerabilities.
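The tallying step that precedes display could, for instance, look like the following Python sketch. Plain text stands in for the graphical user interface, and the result format (pairs of URL and finding label, with `None` for a clean page) as well as the labels used in the test are hypothetical.

```python
from collections import Counter


def summarize(results):
    """results: list of (url, finding) pairs; finding is None when the
    scan found nothing on that page."""
    findings = [finding for _, finding in results if finding is not None]
    return {
        "scanned": len(results),
        "vulnerable": len(findings),
        "by_type": dict(Counter(findings)),
    }


def render(summary):
    """Produce the text that a user interface could show to the administrator."""
    lines = [
        f"Scanned pages: {summary['scanned']}",
        f"Pages with findings: {summary['vulnerable']}",
    ]
    for kind, count in sorted(summary["by_type"].items()):
        lines.append(f"  {kind}: {count}")
    return "\n".join(lines)
```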
In summary, the present invention can discover known specific websites and their new URLs in a timely manner and can perform vulnerability detection on these new URLs in real time, which avoids missed detections. It also avoids unnecessary detection of invalid links and duplicate links, and therefore has the advantage of maintaining website security efficiently and promptly.
Embodiments of the invention disclose:
A1. A website security detection method, characterized by comprising the following steps:
Receiving, through a remote port, collected data containing a Hypertext Transfer Protocol (HTTP) request packet;
Determining, from the links contained in the request packet, the associated new URLs belonging to a known specific website;
Performing vulnerability scanning and detection on the webpages corresponding to the new URLs.
A2. The website security detection method according to claim A1, characterized in that the source IP address of the collected data is the destination IP address of the request packet.
A3. The website security detection method according to claim A2, characterized in that the collected data originate from an acquisition module installed on the device at the source IP address.
A4. The website security detection method according to claim A1, characterized in that the source IP address of the collected data is the source IP address in the request packet.
A5. The website security detection method according to claim A4, characterized in that the collected data originate from a browser plug-in installed on the device at the source IP address.
A6. The website security detection method according to claim A1, characterized in that, before the associated new URLs belonging to the known specific website are determined, the links contained in the request packets are collected and the duplicate links among them are removed.
A7. The website security detection method according to claim A6, characterized in that the step of removing duplicate links comprises the following sub-steps:
Determining as duplicate links multiple links that are formed by accessing a database and differ only in their variables;
Removing the duplicate links by retaining only one of them.
A8. The website security detection method according to claim A6, characterized in that the step of removing duplicate links comprises the following sub-steps:
Determining multiple links having the same signature as duplicate links;
Removing the duplicate links by retaining only one of them.
A9. The website security detection method according to claim A1, characterized in that the known specific website and/or its new URLs are given in advance by receiving user settings through a graphical user interface.
A10. The website security detection method according to claim A8, characterized in that the content of the settings received by the graphical user interface comprises a domain name or an IP address pointing to a website.
A11. The website security detection method according to claim A1, characterized in that a link in the request packet is determined to be an associated new URL belonging to the known specific website by determining that the IP address pointed to by the link belongs to the IP address pointed to by the known specific website or to the IP address segment to which that address belongs.
A12. The website security detection method according to claim A1, characterized in that a link in the request packet is determined to be an associated new URL belonging to the known specific website by comparing the registration feature information of the domain name of the link with the registration feature information of the domain name of the known specific website and finding them identical.
A13. The website security detection method according to claim A1, characterized in that a known specific website list is provided for recording the domain names and/or corresponding IP addresses of one or more of the known specific websites.
A14. The website security detection method according to claim A1, characterized in that the step of determining, from the links contained in the request packet, the associated new URLs belonging to the known specific website comprises the following sub-steps:
Extracting the links of all obtained request packets;
Removing, from the extracted links, the duplicate links that point to webpages having the same code;
Determining the new URLs among them and adding these new URLs to the queue to be scanned.
A15. The website security detection method according to claim A1, characterized in that the step of performing vulnerability scanning on the webpages corresponding to the new URLs comprises the following sub-steps:
Obtaining the new URLs from the queue to be scanned that records the new URLs;
Performing vulnerability scanning and detection on the webpages to which the new URLs map.
A16. The website security detection method according to claim A1, characterized in that the step of performing vulnerability scanning on the webpages corresponding to the new URLs comprises the following sub-steps:
Obtaining the new URLs from the queue to be scanned that records the new URLs;
Obtaining the webpages to which the new URLs in the queue to be scanned map and adding them to a local page library;
Performing vulnerability scanning and detection on the webpages in the page library downloaded according to the new URLs.
A17. The website security detection method according to claim A1, characterized in that the method comprises a subsequent step of displaying a graphical user interface to output the result information of the vulnerability scanning and detection.
B18. A website security detection device, characterized by comprising:
A packet capturing unit, configured to receive, through a remote port, collected data containing a Hypertext Transfer Protocol (HTTP) request packet;
A new-URL checking unit, adapted to determine, from the links contained in the request packet, the associated new URLs belonging to a known specific website;
A detecting unit, configured to perform vulnerability scanning and detection on the webpages corresponding to the new URLs.
B19. The website security detection device according to claim B18, characterized in that the source IP address of the collected data obtained by the packet capturing unit is the destination IP address of the request packet.
B20. The website security detection device according to claim B19, characterized in that the collected data originate from an acquisition module installed on the device at the source IP address.
B21. The website security detection device according to claim B18, characterized in that the source IP address of the data collected by the packet capturing unit is the source IP address in the request packet.
B22. The website security detection device according to claim B21, characterized in that the collected data originate from a browser plug-in installed on the device at the source IP address.
B23. The website security detection device according to claim B18, characterized in that the new-URL checking unit is configured to collect the links contained in the request packets and remove the duplicate links among them before the associated new URLs belonging to the known specific website are determined.
B24. The website security detection device according to claim B23, characterized in that the new-URL checking unit comprises:
A duplicate-checking submodule, configured to determine as duplicate links multiple links that are formed by accessing a database and differ only in their variables;
A removal submodule, adapted to remove the duplicate links by retaining only one of them.
B25. The website security detection device according to claim B23, characterized in that the new-URL checking unit comprises:
A duplicate-checking submodule, configured to determine multiple links having the same signature as duplicate links;
A removal submodule, adapted to remove the duplicate links by retaining only one of them.
B26. The website security detection device according to claim B18, characterized in that the device further comprises a setup unit, configured to display a graphical user interface to receive user settings, from which the known specific website and/or its new URLs are given in advance.
B27. The website security detection device according to claim B26, characterized in that the content of the settings received by the graphical user interface comprises a domain name or an IP address pointing to a website.
B28. The website security detection device according to claim B18, characterized in that the device further comprises a setup unit, configured to determine a link in the request packet to be an associated new URL belonging to the known specific website by determining that the IP address pointed to by the link belongs to the IP address pointed to by the known specific website or to the IP address segment to which that address belongs.
B29. The website security detection device according to claim B18, characterized in that the device further comprises a setup unit, configured to determine a link in the request packet to be an associated new URL belonging to the known specific website by comparing the registration feature information of the domain name of the link with the registration feature information of the domain name of the known specific website and finding them identical.
B30. The website security detection device according to claim B18, characterized in that the device further comprises a known specific website list for recording the domain names and/or corresponding IP addresses of one or more of the known specific websites.
B31. The website security detection device according to claim B18, characterized in that the new-URL checking unit comprises:
An extraction module, configured to extract the links of all obtained request packets;
A duplicate removal module, configured to remove, from the links extracted by the extraction module, the duplicate links that point to webpages having the same code;
An adding module, configured to determine the new URLs among them and add these new URLs to the queue to be scanned.
B32. The website security detection device according to claim B18, characterized in that the detecting unit comprises:
An acquiring unit, configured to obtain the new URLs from the queue to be scanned that records the new URLs;
An implementation unit, configured to perform vulnerability scanning and detection on the webpages to which the new URLs map.
B33. The website security detection device according to claim B18, characterized in that the detecting unit comprises:
An acquiring unit, configured to obtain the new URLs from the queue to be scanned that records the new URLs;
A download unit, configured to download the webpages to which the new URLs in the queue to be scanned map and add them to a local page library;
An implementation unit, configured to perform vulnerability scanning and detection on the webpages in the page library downloaded according to the new URLs.
B34. The website security detection device according to claim B18, characterized in that the device comprises a display unit, configured to display a graphical user interface to output the result information of the vulnerability scanning and detection.
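The duplicate removal of embodiments A7 and A8 (and B24, B25) can be illustrated with a Python sketch in which the signature of a link is computed from its scheme, host, path and the sorted names of its query parameters, so that links differing only in parameter values share one signature. This particular signature construction, the use of SHA-256 and the function names are assumptions for illustration; the embodiments do not prescribe how the signature is formed.

```python
import hashlib
from urllib.parse import parse_qsl, urlsplit


def link_signature(link):
    """Signature that ignores parameter VALUES: two links that differ only
    in the values of their query variables get the same signature."""
    parts = urlsplit(link)
    names = sorted({name for name, _ in parse_qsl(parts.query, keep_blank_values=True)})
    key = "|".join([parts.scheme, parts.netloc.lower(), parts.path, ",".join(names)])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def remove_duplicates(links):
    """Keep only the first link seen for each signature."""
    seen = set()
    kept = []
    for link in links:
        signature = link_signature(link)
        if signature not in seen:
            seen.add(signature)
            kept.append(link)
    return kept
```

For example, `http://a.test/item?id=1` and `http://a.test/item?id=2` would be treated as one page template and scanned once, while `http://a.test/list?page=1` is kept as a separate link.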
It should be noted that the algorithms and formulas provided herein are not inherently related to any particular computer, virtual system or other apparatus. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in a variety of programming languages, and that the above description of a specific language is given in order to disclose the best mode of the invention.
Numerous specific details are set forth in the specification provided herein. It will be understood, however, that embodiments of the invention may be practised without these specific details. In some instances, well-known methods, structures and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and to aid understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure or description thereof in the above description of exemplary embodiments. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the claims reflect, the inventive aspects lie in fewer than all the features of a single embodiment disclosed above. The claims that follow the detailed description are therefore expressly incorporated into that detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules of the device in an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units or components of the embodiments may be combined into a single module, unit or component, and may furthermore be divided into multiple sub-modules, sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent or a similar purpose.
Furthermore, those skilled in the art will understand that, although some embodiments described herein include certain features that are included in other embodiments and not other features, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments.
The component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the website security detection device according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the methods described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
The above are only some embodiments of the present invention. It should be pointed out that those of ordinary skill in the art may make a number of improvements and refinements without departing from the principles of the invention, and that such improvements and refinements should also be regarded as falling within the scope of protection of the invention.

Claims (10)

1. A website security detection method, characterized by comprising the following steps:
Receiving, through a remote port, collected data containing a Hypertext Transfer Protocol (HTTP) request packet;
Determining, from the links contained in the request packet, the associated new URLs belonging to a known specific website;
Performing vulnerability scanning and detection on the webpages corresponding to the new URLs.
2. The website security detection method according to claim 1, characterized in that the source IP address of the collected data is the destination IP address of the request packet.
3. The website security detection method according to claim 2, characterized in that the collected data originate from an acquisition module installed on the device at the source IP address.
4. The website security detection method according to claim 1, characterized in that the source IP address of the collected data is the source IP address in the request packet.
5. The website security detection method according to claim 4, characterized in that the collected data originate from a browser plug-in installed on the device at the source IP address.
6. The website security detection method according to claim 1, characterized in that, before the associated new URLs belonging to the known specific website are determined, the links contained in the request packets are collected and the duplicate links among them are removed.
7. The website security detection method according to claim 6, characterized in that the step of removing duplicate links comprises the following sub-steps:
Determining as duplicate links multiple links that are formed by accessing a database and differ only in their variables;
Removing the duplicate links by retaining only one of them.
8. The website security detection method according to claim 6, characterized in that the step of removing duplicate links comprises the following sub-steps:
Determining multiple links having the same signature as duplicate links;
Removing the duplicate links by retaining only one of them.
9. The website security detection method according to claim 1, characterized in that the known specific website and/or its new URLs are given in advance by receiving user settings through a graphical user interface.
10. A website security detection device, characterized by comprising:
A packet capturing unit, configured to receive, through a remote port, collected data containing a Hypertext Transfer Protocol (HTTP) request packet;
A new-URL checking unit, adapted to determine, from the links contained in the request packet, the associated new URLs belonging to a known specific website;
A detecting unit, configured to perform vulnerability scanning and detection on the webpages corresponding to the new URLs.
CN201410769106.8A 2014-12-12 2014-12-12 Website security detection method and device Active CN104363251B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410769106.8A CN104363251B (en) 2014-12-12 2014-12-12 Website security detection method and device

Publications (2)

Publication Number Publication Date
CN104363251A true CN104363251A (en) 2015-02-18
CN104363251B CN104363251B (en) 2016-09-28

Family

ID=52530477

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410769106.8A Active CN104363251B (en) 2014-12-12 2014-12-12 Website security detection method and device

Country Status (1)

Country Link
CN (1) CN104363251B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101060444A (en) * 2007-05-23 2007-10-24 西安交大捷普网络科技有限公司 Bayesian statistical model based network anomaly detection method
JP2009200993A (en) * 2008-02-25 2009-09-03 Kddi Corp Failure detecting apparatus, failure detection method, and computer program
CN103023905A (en) * 2012-12-20 2013-04-03 北京奇虎科技有限公司 Device, method and system for detecting spamming links
US20130276126A1 (en) * 2010-10-22 2013-10-17 NSFOCUS Information Technology Co., Ltd. Website scanning device and method

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106921537A (en) * 2015-12-28 2017-07-04 中国电信股份有限公司 Website visiting quality detecting method, server and system
CN106657443A (en) * 2017-02-13 2017-05-10 杭州迪普科技股份有限公司 IP address duplication eliminating method and device
CN106657443B (en) * 2017-02-13 2020-01-03 杭州迪普科技股份有限公司 IP address duplication eliminating method and device
CN107566388A (en) * 2017-09-18 2018-01-09 杭州安恒信息技术有限公司 Industry control vulnerability detection method, apparatus and system
CN108063759A (en) * 2017-12-05 2018-05-22 西安交大捷普网络科技有限公司 Web vulnerability scanning methods
CN108848115A (en) * 2018-09-03 2018-11-20 杭州安恒信息技术股份有限公司 A kind of method, apparatus of web site scan, equipment and computer readable storage medium
CN109194670A (en) * 2018-09-19 2019-01-11 杭州安恒信息技术股份有限公司 A kind of any file download leak detection method in website
CN109818928A (en) * 2018-12-25 2019-05-28 北京奇安信科技有限公司 A kind of network security detection method, system, electronic equipment and medium
CN111327588A (en) * 2020-01-16 2020-06-23 深圳开源互联网安全技术有限公司 Network access security detection method, system, terminal and readable storage medium
CN116823162A (en) * 2023-06-27 2023-09-29 上海螣龙科技有限公司 Network asset scanning task management method, system and computer equipment
CN116823162B (en) * 2023-06-27 2024-04-09 上海螣龙科技有限公司 Network asset scanning task management method, system and computer equipment

Also Published As

Publication number Publication date
CN104363251B (en) 2016-09-28

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20161123

Address after: 100015 Chaoyang District Road, Jiuxianqiao, No. 10, building No. 3, floor 15, floor 17, 1701-26,

Patentee after: BEIJING QIANXIN TECHNOLOGY Co.,Ltd.

Address before: 100088 Beijing city Xicheng District xinjiekouwai Street 28, block D room 112 (Desheng Park)

Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee before: Qizhi software (Beijing) Co.,Ltd.

CB03 Change of inventor or designer information

Inventor after: Long Zhuan

Inventor after: Meng Jun

Inventor after: Liu Xuezhong

Inventor before: Long Zhuan

CB03 Change of inventor or designer information
CP03 Change of name, title or address

Address after: Room 332, 3 / F, Building 102, 28 xinjiekouwei street, Xicheng District, Beijing 100088

Patentee after: QAX Technology Group Inc.

Address before: 100015 15, 17 floor 1701-26, 3 building, 10 Jiuxianqiao Road, Chaoyang District, Beijing.

Patentee before: BEIJING QIANXIN TECHNOLOGY Co.,Ltd.

CP03 Change of name, title or address
TR01 Transfer of patent right

Effective date of registration: 20201230

Address after: 100044 2nd floor, building 1, yard 26, Xizhimenwai South Road, Xicheng District, Beijing

Patentee after: LEGENDSEC INFORMATION TECHNOLOGY (BEIJING) Inc.

Patentee after: QAX Technology Group Inc.

Address before: Room 332, 3 / F, Building 102, 28 xinjiekouwei street, Xicheng District, Beijing 100088

Patentee before: QAX Technology Group Inc.

TR01 Transfer of patent right
CP03 Change of name, title or address

Address after: 2nd Floor, Building 1, Yard 26, Xizhimenwai South Road, Xicheng District, Beijing, 100032

Patentee after: Qianxin Wangshen information technology (Beijing) Co.,Ltd.

Patentee after: QAX Technology Group Inc.

Address before: 100044 2nd floor, building 1, yard 26, Xizhimenwai South Road, Xicheng District, Beijing

Patentee before: LEGENDSEC INFORMATION TECHNOLOGY (BEIJING) Inc.

Patentee before: QAX Technology Group Inc.

CP03 Change of name, title or address