CN110765333A - Method and device for collecting website information, storage medium and electronic device - Google Patents

Publication number
CN110765333A
Authority
CN
China
Prior art keywords
website
web
information
directory structure
request
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910750206.9A
Other languages
Chinese (zh)
Inventor
田跃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Netshen Information Technology (beijing) Co Ltd
Qianxin Technology Group Co Ltd
Secworld Information Technology Beijing Co Ltd
Original Assignee
Netshen Information Technology (beijing) Co Ltd
Qianxin Technology Group Co Ltd
Application filed by Netshen Information Technology (beijing) Co Ltd and Qianxin Technology Group Co Ltd
Priority to CN201910750206.9A
Publication of CN110765333A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433 Vulnerability analysis

Abstract

The invention provides a method and a device for collecting website information, a storage medium, and an electronic device. The method comprises: determining a website domain name of a web website of a penetration target, wherein the penetration target is a network system connected through a network; performing directory scanning on the website domain name to obtain a directory structure of the web website; collecting website information of the web website according to the directory structure; and detecting an external vulnerability of the penetration target by using the website information. The invention solves the technical problem in the related art that website information cannot be collected, and vulnerabilities cannot be detected, through a directory structure.

Description

Method and device for collecting website information, storage medium and electronic device
Technical Field
The invention relates to the field of network security, in particular to a method and a device for acquiring website information, a storage medium and an electronic device.
Background
A network attack is an attack launched against electronic equipment by a hacker, a virus, a Trojan, or the like, and can cause users heavy losses, for example through stolen files. A penetration test simulates such an attack so that problems can be found in advance and remedied in time, before they cause harm.
In the prior art, website data collection does not scan the website directory; leaked information is only searched for through a search engine.
In view of the above problems in the related art, no effective solution has been found at present.
Disclosure of Invention
The embodiment of the invention provides a method and a device for acquiring website information, a storage medium and an electronic device.
According to an embodiment of the present invention, there is provided a method for collecting website information, including: determining a website domain name of a web website of a penetration target, wherein the penetration target is a network system connected through a network; performing directory scanning on the website domain name to obtain a directory structure of the web website; acquiring website information of the web website according to the directory structure; and detecting the external vulnerability of the penetration target by using the website information.
Optionally, the acquiring the website information of the web website according to the directory structure includes: collecting at least one of the following information of the web site according to the directory structure: webpage files, webpage interfaces and background page addresses.
Optionally, performing directory scanning on the website domain name to obtain a directory structure of the web website, including: taking the website domain name as a root node, and scanning the website sub domain names of each level of sub nodes downwards in sequence until reaching leaf nodes; and setting corresponding domain name addresses in the root node and each level of nodes to obtain a directory structure of the web website.
Optionally, the acquiring the website information of the web website according to the directory structure includes: sending an access request to the web site according to the directory structure; when the web website returns a status code indicating that the page does not exist, collecting website information of the web website through a hypertext transfer protocol (HTTP) request; and when the web site normally responds to the access request, determining webpage content fed back by the web site as the website information of the web site.
Optionally, the collecting website information of the web website through the HTTP request includes: sending a HEAD request to the web site according to the directory structure; and receiving HTTP header information of the web site fed back by the web site based on the HEAD request.
Optionally, the acquiring the website information of the web website according to the directory structure includes: sending a GET request to the web website according to the directory structure; and receiving HTTP header information and presentation data of the web website fed back by the web website based on the GET request.
Optionally, after detecting the external vulnerability of the penetration target by using the website information, the method further includes: acquiring the operation authority of the penetration target by exploiting the external vulnerability; and executing a penetration operation on the network system by using the operation authority.
According to another embodiment of the present invention, there is provided an apparatus for collecting website information, including: a determining module, configured to determine a website domain name of a web website of a penetration target, wherein the penetration target is a network system connected through a network; a scanning module, configured to perform directory scanning on the website domain name to obtain a directory structure of the web website; a collecting module, configured to collect the website information of the web website according to the directory structure; and a detection module, configured to detect the external vulnerability of the penetration target by using the website information.
Optionally, the collecting module includes: the acquisition unit is used for acquiring at least one of the following information of the web website according to the directory structure: webpage files, webpage interfaces and background page addresses.
Optionally, the scanning module includes: the scanning unit is used for scanning the website sub domain names of each level of sub nodes downwards in sequence by taking the website domain names as root nodes until the website sub domain names reach leaf nodes; and the setting unit is used for setting corresponding domain name addresses in the root node and each level of nodes to obtain the directory structure of the web website.
Optionally, the scanning module includes: a sending unit, configured to send an access request to the web site according to the directory structure; the processing unit is used for collecting website information of the web website through a hypertext transfer protocol (HTTP) request when the web website returns a status code for indicating that a page does not exist; and when the web site normally responds to the access request, determining the web page content fed back by the web site as the site information of the web site.
Optionally, the processing unit includes: the first sending subunit is used for sending a HEAD request to the web website according to the directory structure; and the first receiving subunit is used for receiving HTTP header information of the web website fed back by the web website based on the HEAD request.
Optionally, the processing unit includes: the second sending subunit is configured to send a GET request to the web site according to the directory structure; and the second receiving subunit is used for receiving HTTP header information and presentation data of the web website fed back by the web website based on the GET request.
Optionally, the apparatus further comprises: an acquiring module, configured to acquire the operation authority of the penetration target by exploiting the external vulnerability after the detection module has detected the external vulnerability of the penetration target by using the website information; and a penetration module, configured to execute a penetration operation on the network system by using the operation authority.
According to a further embodiment of the present invention, there is also provided a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
According to yet another embodiment of the present invention, there is also provided an electronic device, including a memory and a processor, the memory having a computer program stored therein, the processor being configured to execute the computer program to perform the steps in any of the above method embodiments.
According to the invention, the website domain name of the web website of the penetration target is determined, directory scanning is performed on the website domain name to obtain the directory structure of the web website, the website information of the web website is collected according to the directory structure, and finally the website information is used to detect the external vulnerability of the penetration target. This solves the technical problem in the related art that website information cannot be collected, and vulnerabilities cannot be detected, through a directory structure, and provides more detailed parameter information for penetration testing.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention; they do not unduly limit the invention. In the drawings:
FIG. 1 is a block diagram of a hardware configuration of a computer device for collecting website information according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method of collecting website information according to an embodiment of the invention;
FIG. 3 is a logical illustration of an embodiment of the present invention for obtaining website information via a GET request and a HEAD request;
FIG. 4 is an attack route diagram of task nodes for a penetration target according to an embodiment of the present invention;
FIG. 5 is a block diagram of an apparatus for collecting website information according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present application better understood by those skilled in the art, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art without any inventive work based on the embodiments in the present application shall fall within the scope of protection of the present application. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
The method provided by the first embodiment of the present application may be executed in a computer device or a similar computing device. Taking the example of running on a computer device, fig. 1 is a hardware structure block diagram of a computer device for collecting website information according to an embodiment of the present invention. As shown in fig. 1, computer device 10 may include one or more (only one shown in fig. 1) processors 102 (processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA) and a memory 104 for storing data, and optionally may also include a transmission device 106 for communication functions and an input-output device 108. It will be understood by those skilled in the art that the configuration shown in fig. 1 is merely illustrative and is not intended to limit the configuration of the computer device described above. For example, computer device 10 may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 may be used to store a computer program, for example, a software program and a module of an application software, such as a computer program corresponding to a method for collecting website information according to an embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer program stored in the memory 104, so as to implement the method described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, memory 104 may further include memory located remotely from processor 102, which may be connected to computer device 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of such a network include a wireless network provided by the communications provider of the computer device 10. In one example, the transmission device 106 includes a network interface controller (NIC), which can connect to other network devices through a base station so as to communicate with the internet. In another example, the transmission device 106 may be a radio frequency (RF) module, which communicates with the internet wirelessly.
In this embodiment, a method for collecting website information is provided, and fig. 2 is a flowchart of a method for collecting website information according to an embodiment of the present invention, as shown in fig. 2, the flowchart includes the following steps:
Step S202, determining a website domain name of a web website of a penetration target, wherein the penetration target is a network system connected through a network;
The penetration target of this embodiment is a network system composed of hardware, software, and a network. It runs in a local area network or a wide area network and can be isolated from the wide area network by a switch, a firewall, or the like. The network system includes electronic devices and data programs, for example servers, databases, service systems, electronic devices connected to the local area network, and the operating systems installed on those devices. It is used in various scenarios, such as organizations with strict confidentiality or security requirements.
Step S204, performing directory scanning on the website domain name to obtain a directory structure of the web website;
Step S206, collecting website information of the web website according to the directory structure;
Step S208, detecting the external vulnerability of the penetration target by using the website information.
The external vulnerability of the embodiment is a defect of a network system which can be utilized by a third-party device, so that an attacker can access or destroy the system without authorization.
Through the above steps, the website domain name of the web website of the penetration target is determined, directory scanning is performed on the website domain name to obtain the directory structure of the web website, the website information of the web website is collected according to the directory structure, and finally the website information is used to detect the external vulnerability of the penetration target. This solves the technical problem in the related art that website information cannot be collected, and vulnerabilities cannot be detected, through a directory structure, and provides more detailed parameter information for penetration testing.
The execution main body of the embodiment may be an electronic device such as a computer or a tablet, and the electronic device is connected to a local area network where the penetration target is located, or connected to a wide area network.
In this embodiment, acquiring website information of a web website according to a directory structure includes: collecting at least one of the following information of the web site according to the directory structure: webpage files, webpage interfaces and background page addresses.
In an optional implementation manner of this embodiment, performing directory scanning on a domain name of a website to obtain a directory structure of a web website includes: taking the website domain name as a root node, and scanning the website sub domain names of each level of sub nodes downwards in sequence until reaching leaf nodes; and setting corresponding domain name addresses in the root node and each level of nodes to obtain a directory structure of the web website.
In one example, the penetration target is the network system of Sina, whose website domain name is www.sina.com.cn; www.sina.com.cn/a is then a website sub-domain name, that is, the address of a child node under the root.
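The tree described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `DirNode`, `build_directory_tree`, and `leaf_addresses` are hypothetical, and the list of discovered addresses is assumed to come from an earlier scanning step.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class DirNode:
    """One node of the directory structure (hypothetical type)."""
    address: str                      # full address stored in this node
    children: dict = field(default_factory=dict)


def build_directory_tree(root_domain, discovered_urls):
    """Build a tree rooted at the website domain name from discovered addresses."""
    root = DirNode(address=root_domain)
    for url in discovered_urls:
        # Accept addresses with or without a scheme.
        parts = urlsplit(url if "//" in url else "//" + url)
        if parts.netloc != root_domain:
            continue  # ignore addresses outside the target domain
        node = root
        for segment in filter(None, parts.path.split("/")):
            child = node.children.get(segment)
            if child is None:
                child = DirNode(address=f"{node.address}/{segment}")
                node.children[segment] = child
            node = child
    return root


def leaf_addresses(node):
    """Return the addresses of all leaf nodes, in sorted order."""
    if not node.children:
        return [node.address]
    leaves = []
    for name in sorted(node.children):
        leaves.extend(leaf_addresses(node.children[name]))
    return leaves
```

Each node stores its own full address, which mirrors the step of setting the corresponding domain name address in the root node and in every level of child node.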
In one embodiment of this embodiment, collecting website information of a web website according to a directory structure includes:
S11, sending an access request to the web website according to the directory structure;
S12, when the web website returns a status code indicating that the page does not exist, collecting the website information of the web website through an HTTP request; and when the web website responds normally to the access request, determining the webpage content fed back by the web website as the website information of the web website.
The status code of this embodiment is 200. A server returns this code when it has processed the request successfully, but a website may also serve a custom "page does not exist" (404) page with status code 200, so a 200 response alone does not prove that the page exists. The HTTP request of this embodiment may be of several types, such as a HEAD request, a GET request, a POST request, a PUT request, a DELETE request, and an OPTIONS request. GET, POST, PUT, and DELETE in HTTP correspond to querying, modifying, adding, and deleting a resource, respectively. GET is used for information acquisition: a request is sent to obtain a resource on the server, and the page information specified by the GET request is returned in the entity body. HEAD requests only the header of the page.
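One common way to recognize a custom "page does not exist" page that is served with status code 200 is to compare each response with a baseline response obtained for a path that certainly does not exist. The source does not state how the identification is done, so the following Python sketch is an assumption; the function name `looks_like_missing_page` and the similarity threshold are hypothetical.

```python
from difflib import SequenceMatcher


def looks_like_missing_page(status, body, baseline_body, threshold=0.9):
    """Return True when a response most likely represents a nonexistent page.

    baseline_body is assumed to be the body the site returned for a random,
    certainly nonexistent path (hypothetical earlier step).
    """
    if status == 404:
        return True
    if status != 200:
        return False  # other codes (403, 500, redirects) are left to the caller
    # A 200 response that is nearly identical to the baseline is treated as a
    # custom "not found" page.
    similarity = SequenceMatcher(None, body, baseline_body).ratio()
    return similarity >= threshold
```

The threshold trades false positives against false negatives; pages that embed the requested path in the error text may need a lower value.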
In one example, collecting website information for a web site via an HTTP request includes: sending a HEAD request to a web website according to a directory structure; and receiving HTTP header information of the web website fed back by the web website based on the HEAD request.
In another example, collecting website information for a web site according to a directory structure includes: sending a GET request to a web website according to a directory structure; and receiving HTTP header information and presentation data of the web website fed back by the web website based on the GET request.
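The difference between the two request types can be sketched with Python's standard `urllib.request` module. The helper names `build_probe` and `send_probe` and the User-Agent string are hypothetical; the sketch assumes the target is a system the tester is authorized to examine.

```python
import urllib.error
import urllib.request


def build_probe(url, method="HEAD"):
    """Build a HEAD or GET request for one address of the directory structure."""
    if method not in ("HEAD", "GET"):
        raise ValueError("only HEAD and GET probes are sketched here")
    return urllib.request.Request(
        url, method=method, headers={"User-Agent": "dir-scan-sketch"}
    )


def send_probe(request, timeout=5.0):
    """Send the probe and return (status, headers, body).

    For a HEAD request the body is empty and only the HTTP header
    information is available; for a GET request the body carries the page data.
    """
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status, dict(resp.headers), resp.read()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry headers and sometimes a body.
        return err.code, dict(err.headers), err.read()
```

A HEAD probe is cheaper because no page body is transferred, while a GET probe is needed when the body itself must be inspected.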
FIG. 3 is a logic diagram of acquiring website information through a GET request and a HEAD request according to an embodiment of the present invention. Web directory scanning is used to discover the directory structure of a website, sensitive files in the website directory, website information leakage, background page addresses, and the like. Multithreaded detection is supported by default, directory scanning may use either HEAD or GET requests, and the dictionaries of directories and script pages are optimized and classified. A self-optimization technique ensures that the best-performing dictionary is used first in each scan, and the status codes used for identification can be specified. Nonexistent pages are identified accurately, which ensures the accuracy of the result; in particular, this handles custom "page does not exist" (404) pages that are returned with status code 200. To discover the website directory more completely, recursive scanning is supported, proceeding from the domain name of the root node to the domain names of the leaf nodes.
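The recursive, dictionary-driven scan can be sketched as below. This is a simplified single-threaded illustration, whereas the text describes multithreaded scanning; `recursive_scan`, `probe`, and `max_depth` are hypothetical names, and `probe` stands for any check (for example a HEAD or GET request combined with nonexistent-page detection) that reports whether an address exists.

```python
def recursive_scan(base, wordlist, probe, max_depth=3):
    """Depth-limited recursive directory scan.

    probe(address) is assumed to return True when the address exists.
    Returns the discovered addresses in discovery order.
    """
    found = []

    def visit(address, depth):
        if depth > max_depth:
            return
        for word in wordlist:
            candidate = f"{address}/{word}"
            if probe(candidate):
                found.append(candidate)
                # Descend into every discovered directory until leaf nodes
                # (addresses with no further hits) are reached.
                visit(candidate, depth + 1)

    visit(base, 1)
    return found
```

Passing the probe in as a function keeps the traversal independent of the request type, so the same loop works for HEAD-based and GET-based scanning.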
In this embodiment, after detecting an external vulnerability of the penetration target using the website information, the method further includes: acquiring the operation authority of the penetration target by exploiting the external vulnerability; and performing a penetration operation on the network system using the operation authority. The penetration operation includes at least one of: accessing a business system of the penetration target, accessing local data of the penetration target, and moving laterally within the intranet of the penetration target. The business system includes a website server, a database, and the like. For example, when the website server is accessed frequently, or the same instruction is sent to it repeatedly, so that the business system exceeds its processing limit, downtime or a crash may result. The local data in this embodiment includes data shared within the local area network, data stored on each device connected through the local area network, and the like.
This embodiment packages the exploitation of detected, exploitable vulnerabilities: the complex exploitation processes are integrated into the plug-in library so that, when a vulnerability needs to be exploited (for example, to execute a system command), the tester enters the corresponding input and obtains the echoed result with one click. For example, after the weblogic deserialization vulnerability is found, operations such as command execution, file upload, and a reverse interactive shell can be performed directly through the advanced exploitation functions. The penetration tester only needs to enter the target address, and vulnerability discovery and exploitation proceed with one click. For vulnerabilities that cannot be discovered fully automatically, a standalone exploitation function is provided: the penetration tester only needs to enter the corresponding parameters, as with the exploitation of a fastjson vulnerability. The method can also make use of known conditions, such as entering an Oracle account and password for one-click privilege escalation and system command execution. This function greatly simplifies the vulnerability exploitation process.
This embodiment instructs the execution of the penetration operation by sending a penetration instruction to the penetration target. Before reaching the target server of the penetration target, the instruction must pass through the gateway and protection systems of the penetration target, including a WAF (Web Application Firewall), an IDS (Intrusion Detection System), an IPS (Intrusion Prevention System), monitoring devices, routers, and switches. Several WAF-bypass techniques are built into the underlying packet-sending program and are selected automatically according to the target: 1. bypassing a resource-limit detection type WAF by filling the head of the data packet with a large amount of useless data; 2. bypassing a rule detection type WAF by encoding, deformation, replacement with functions of the same type, comment symbol processing, word segmentation, and database syntax features; 3. bypassing a protocol-layer detection type WAF by protocol conversion, protocol format change, and protocol replacement; 4. bypassing the WAF with an independently discovered packet fragmentation transmission technique. Fragmented transmission splits the data to be sent into packets of three bytes each and sends each packet to the target server separately, which avoids detection based on packet content matching. The fragmentation technique of this embodiment is embedded in the underlying program that sends HTTP packets.
In an implementation of this embodiment, the method further includes: determining the external vulnerability as a dangerous entrance to the local area network, determining the operation authority as an illegal authority of the network system, and generating a penetration test report for the penetration target.
This embodiment may customize a detection scheme according to the operating environment of the penetration target, for example a scenario for detecting newly disclosed vulnerabilities, a scenario for detecting weak mail passwords, or a scenario for detecting industrial control vulnerabilities. Scenario-based detection is supported: scenarios including at least routine tests, attack and defense exercises, cyber range exercises, and security capability assessments can be customized quickly as required, so that vulnerabilities can be discovered in the customized scenario. A single penetration task places no limit on the number of targets added, and tasks can be executed in a distributed and concurrent manner, which keeps vulnerability discovery efficient.
FIG. 4 is an attack route diagram of task nodes for a penetration target according to an embodiment of the present invention. It illustrates the flow from information collection to post-penetration attack, and each task node can execute penetration testing. In this embodiment, each function may be implemented by a functional module disposed in the penetration device, including:
An information collection module: before the penetration test, various online means are used to collect information about the penetration target. The information collection module is mainly used to complete information collection for the penetration target.
A vulnerability detection module: this module performs automated vulnerability detection on a penetration target. Vulnerability detection has two modes, a website URL detection mode and an IP address detection mode. The website URL detection mode performs fingerprint identification on the target, collecting fingerprint information such as middleware, common website framework, development language, and operating system, and then looks up the vulnerability plug-ins related to the target in the plug-in library to find existing vulnerabilities. The IP address detection mode scans the ports of the target, finds the services open to the outside, identifies the corresponding service types, and looks up the vulnerability plug-ins related to each service type to judge whether a vulnerability exists.
The vulnerability plug-in library comprises a plurality of vulnerability plug-ins, and its coverage includes Web, middleware, databases, network equipment, operating systems, intelligent equipment, mobile terminals, industrial control equipment, and other systems. It can discover vulnerabilities of types including, but not limited to, SQL (structured query language) injection, XXE (XML external entity injection), XSS (cross-site scripting), arbitrary file upload, arbitrary file download, arbitrary file operation, information leakage, weak passwords, local file inclusion, directory traversal, command execution, and misconfiguration. Some plug-ins also provide advanced functions for one-click exploitation. The advanced functions include: executing commands, executing SQL, uploading files, opening a reverse shell, uploading GTWebShell, downloading files, etc. The vulnerability plug-in library is maintained by 360-year penetration experience personnel.
The Web fingerprint library can identify a variety of CMSs (content management systems), with a large total number of rules. The system service fingerprints integrate the NMAP tool fingerprint library and can identify the type and version of conventional system services. Scenario-based detection is supported: scenarios including at least routine tests, attack and defense exercises, cyber range exercises, and security capability assessments can be customized quickly as required, so that vulnerabilities can be discovered in the customized scenario. A single task places no limit on the number of targets added, and tasks can be executed in a distributed and concurrent manner, which keeps vulnerability discovery efficient.
A vulnerability exploitation module: this module addresses two problems. First, it provides a standalone exploitation function for vulnerabilities that cannot be discovered fully automatically; for example, when some target addresses cannot be obtained automatically by a crawler or other means, the penetration tester only needs to fill in the corresponding parameters manually to exploit the vulnerability with one click. Second, it can directly check whether a specified vulnerability exists and then exploit it. This simplifies complex exploitation processes, such as entering an Oracle account and password for one-click privilege escalation and system command execution. In addition, the module provides advanced exploitation functions, including command execution, SQL execution, file upload, reverse shell, GTWebshell upload, and file download.
Post-penetration module: lateral movement against the target is performed through the post-penetration module, for example discovering the intranet network topology, discovering intranet database vulnerabilities, locating the mail server, and even obtaining authority over an office network segment, an operation and maintenance host or a domain controller. The post-penetration module includes a remote control system that can control 16 platforms, such as Windows, Linux, Unix, Android, iOS, AIX, BSD, Cisco and OS X, and supports more than 30 architectures, such as x86, x64, ARM, SPARC and PPC. Controlled-end programs can be generated in various formats, including more than 20 executable file formats such as exe, elf, PowerShell, vbs and dll, as well as raw shellcode. By connecting the post-penetration module through the external-network entry point obtained from other vulnerabilities, and by using the post-penetration plug-ins, functions such as host information collection, host privilege escalation, intranet topology discovery, host forensics, password acquisition, system screenshots and keystroke logging can be realized.
Plug-in management module: plug-ins can be written quickly according to the related documentation, and the tool also provides automatic code generation to make writing plug-ins easier. Plug-in library management supports submitting and importing new plug-ins at any time, and dynamic import and loading technology allows a new plug-in to be loaded without delay. To ensure that plug-ins are effective and accurate, plug-ins can be enabled and disabled, so that the plug-in library rules can be configured at any time. A review mechanism for plug-ins is added to better maintain the plug-in library and to ensure that the plug-ins in it are of high quality. The plug-in library management functions comprise submitting plug-ins, viewing the plug-in list and reviewing plug-ins.
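As an illustration only, the dynamic loading and the enable/disable switch described for the plug-in library might be sketched as follows. The class `PluginRegistry`, the convention that each plug-in file exposes a `check(target)` function, and the file name `banner_check.py` are assumptions made for this sketch and do not appear in the embodiment.

```python
import importlib.util
import tempfile
from pathlib import Path


class PluginRegistry:
    """Hypothetical registry: loads plug-in files at runtime and tracks enabled state."""

    def __init__(self):
        self._plugins = {}   # plug-in name -> loaded module
        self._enabled = {}   # plug-in name -> bool

    def load(self, path):
        """Import a plug-in from a .py file without restarting the host program."""
        path = Path(path)
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        self._plugins[path.stem] = module
        self._enabled[path.stem] = True
        return path.stem

    def set_enabled(self, name, enabled):
        """Enable or disable a loaded plug-in by name."""
        self._enabled[name] = enabled

    def run_enabled(self, target):
        """Call check(target) on every enabled plug-in and collect results by name."""
        return {
            name: module.check(target)
            for name, module in self._plugins.items()
            if self._enabled[name]
        }


def demo():
    """Write a trivial plug-in to a temporary file, load it, run it, then disable it."""
    with tempfile.TemporaryDirectory() as tmp:
        plugin_file = Path(tmp) / "banner_check.py"
        plugin_file.write_text("def check(target):\n    return 'checked ' + target\n")
        registry = PluginRegistry()
        name = registry.load(plugin_file)
        first = registry.run_enabled("example.test")
        registry.set_enabled(name, False)
        second = registry.run_enabled("example.test")
        return first, second
```

In this sketch a disabled plug-in stays loaded but is skipped, which is one simple way to let the library rules be reconfigured at any time.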
Fingerprint management module: fingerprint management is designed mainly for maintaining the fingerprint library, and all rule information in the fingerprint library can be viewed on the page. The module provides a function for submitting fingerprints, so that penetration testers can add fingerprint information at any time. Dynamic import technology allows a new fingerprint to be loaded into the fingerprint library without delay. Fingerprint rules can be added for general Web frameworks, middleware, development languages, third-party frameworks and the like. The supported identification means include character strings, MD5, data packet headers and special page status codes. To better maintain the fingerprint library, a fingerprint review mechanism is added, which ensures that the rules in the fingerprint library are high-quality fingerprint rules. Fingerprint management comprises the functions of submitting fingerprints, listing fingerprints and reviewing fingerprints.
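The four identification means named above (character strings, MD5, data packet headers and page status codes) might be applied to a response roughly as in the following sketch. The `Fingerprint` record, its `kind` and `value` encoding, and the product names used in the usage example are hypothetical; the embodiment does not specify a rule format.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Fingerprint:
    name: str   # product the rule identifies (hypothetical)
    kind: str   # "string", "md5", "header" or "status"
    value: str  # substring, hex digest, "Header-Name: substring", or status code


def matches(rule, status, headers, body):
    """Return True if one rule matches the response; headers must have lowercase keys."""
    if rule.kind == "string":
        return rule.value.encode() in body
    if rule.kind == "md5":
        return hashlib.md5(body).hexdigest() == rule.value
    if rule.kind == "header":
        key, _, expected = rule.value.partition(":")
        actual = headers.get(key.strip().lower(), "")
        return expected.strip().lower() in actual.lower()
    if rule.kind == "status":
        return str(status) == rule.value
    return False


def identify(rules, status, headers, body):
    """Return the names of all rules that match, in rule order."""
    lowered = {key.lower(): value for key, value in headers.items()}
    return [rule.name for rule in rules if matches(rule, status, lowered, body)]
```

For example, with a rule `Fingerprint("ExampleCMS", "string", "Powered by ExampleCMS")`, calling `identify` on a page body containing that string returns a list containing `"ExampleCMS"`.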
WAF bypass technology module: many WAF (Web application firewall, a Web application level intrusion prevention system) protection devices are deployed at network nodes, and this module is used to bypass such protection devices.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
Example 2
In this embodiment, an apparatus for collecting website information is further provided. The apparatus is used to implement the foregoing embodiments and preferred implementations, and what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 5 is a structural block diagram of an apparatus for collecting website information according to an embodiment of the present invention. The apparatus may be applied to a server and, as shown in Fig. 5, includes a determining module 50, a scanning module 52, an acquisition module 54 and a detection module 56, wherein:
the determining module 50 is configured to determine a website domain name of a web website of a penetration target, where the penetration target is a network system connected through a network;
the scanning module 52 is configured to perform directory scanning on the website domain name to obtain a directory structure of the web website;
the acquisition module 54 is configured to collect website information of the web website according to the directory structure; and
the detection module 56 is configured to detect an external vulnerability of the penetration target by using the website information.
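The cooperation of the four modules can be pictured as a simple chain in which each module consumes the output of the previous one. The following is a minimal sketch under that reading; the class `CollectionPipeline` and the callable signatures are assumptions for illustration, and each callable stands in for a module whose internals the embodiment leaves open.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CollectionPipeline:
    """Hypothetical wiring of modules 50, 52, 54 and 56 as plain callables."""

    determine: Callable[[str], str]        # penetration target -> website domain name
    scan: Callable[[str], list]            # domain name -> directory structure
    collect: Callable[[str, list], dict]   # domain name, structure -> website information
    detect: Callable[[dict], list]         # website information -> external vulnerabilities

    def run(self, target):
        """Run the four stages in order and return the detected vulnerabilities."""
        domain = self.determine(target)
        structure = self.scan(domain)
        info = self.collect(domain, structure)
        return self.detect(info)
```

Keeping each stage behind a callable mirrors the statement that the modules may live in one processor or be distributed over several, since the chain does not care where a stage runs.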
Optionally, the acquisition module includes: a collecting unit, configured to collect at least one of the following information of the web website according to the directory structure: webpage files, webpage interfaces and background page addresses.
Optionally, the scanning module includes: a scanning unit, configured to take the website domain name as a root node and scan the website sub-domain names of each level of child nodes downwards in sequence until leaf nodes are reached; and a setting unit, configured to set the corresponding domain name addresses in the root node and in the nodes at each level to obtain the directory structure of the web website.
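The level-by-level scan from the root node down to the leaf nodes might be sketched as a breadth-first traversal that stores an address in every node. In the sketch below, `Node`, `build_directory_tree` and the injected `list_children` callable (which returns the names found directly under an address) are hypothetical; how children are actually discovered is not specified here and is left to the caller.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    address: str                                  # address set in this node
    children: list = field(default_factory=list)  # child nodes one level down


def build_directory_tree(root_address, list_children):
    """Expand the tree level by level until no node yields further children."""
    root = Node(root_address)
    queue = deque([root])
    seen = {root_address}
    while queue:
        node = queue.popleft()
        for name in list_children(node.address):
            child_address = node.address.rstrip("/") + "/" + name
            if child_address in seen:
                continue  # avoid revisiting an address reached twice
            seen.add(child_address)
            child = Node(child_address)
            node.children.append(child)
            queue.append(child)
    return root


def flatten(node):
    """Return all addresses in the tree in depth-first, parent-before-child order."""
    addresses = [node.address]
    for child in node.children:
        addresses.extend(flatten(child))
    return addresses
```

A node whose `list_children` result is empty is a leaf, so the loop ends on its own once every branch has reached its leaf nodes.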
Optionally, the scanning module includes: a sending unit, configured to send an access request to the web website according to the directory structure; and a processing unit, configured to collect website information of the web website through a Hypertext Transfer Protocol (HTTP) request when the web website returns a status code indicating that a page does not exist, and to determine the web page content fed back by the web website as the website information of the web website when the web website responds normally to the access request.
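The branching performed by the processing unit can be sketched as follows. This is an illustration under stated assumptions: status code 404 is taken as the code indicating that a page does not exist, any 2xx code is taken as a normal response, other codes are simply skipped, and the `fetch` and `probe` callables are hypothetical stand-ins for the access request and the follow-up HTTP request.

```python
from dataclasses import dataclass


@dataclass
class Response:
    status: int     # HTTP status code returned for the access request
    headers: dict   # response headers
    body: bytes     # page content


def collect_site_info(paths, fetch, probe):
    """Collect information per path, choosing the source by the returned status code.

    fetch(path) -> Response   sends the ordinary access request (hypothetical).
    probe(path) -> dict       sends the follow-up HTTP request (hypothetical).
    """
    info = {}
    for path in paths:
        response = fetch(path)
        if response.status == 404:
            # Page reported as missing: fall back to the HTTP request probe.
            info[path] = {"source": "http_probe", "data": probe(path)}
        elif 200 <= response.status < 300:
            # Normal response: the returned page content is the website information.
            info[path] = {"source": "page_content", "data": response.body}
    return info
```

Injecting `fetch` and `probe` keeps the decision logic testable without network access and leaves the choice of HTTP client open.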
Optionally, the processing unit includes: a first sending subunit, configured to send a HEAD request to the web website according to the directory structure; and a first receiving subunit, configured to receive HTTP header information of the web website fed back by the web website based on the HEAD request.
Optionally, the processing unit includes: a second sending subunit, configured to send a GET request to the web website according to the directory structure; and a second receiving subunit, configured to receive HTTP header information and presentation data of the web website fed back by the web website based on the GET request.
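The difference between the two request types can be illustrated with Python's standard `urllib.request` module: a HEAD request yields only the header information, while a GET request yields the headers together with the body (the presentation data). The helper names `build_request` and `send`, the timeout value and the example host are assumptions for this sketch.

```python
import urllib.request
from urllib.parse import urljoin


def build_request(base_url, path, method):
    """Build a HEAD or GET request for one path taken from the directory structure."""
    if method not in ("HEAD", "GET"):
        raise ValueError("only HEAD and GET are sketched here")
    return urllib.request.Request(urljoin(base_url, path), method=method)


def send(request, timeout=5.0):
    """Send the request and return (header dict, body bytes).

    For a HEAD request the body is empty, so only the headers carry information.
    Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items()), response.read()
```

A caller would typically build the request with `build_request("http://example.test/", "admin/login.php", "HEAD")` and pass it to `send`; the host here is a placeholder, and such requests should only be sent to systems one is authorized to test.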
Optionally, the apparatus further includes: an obtaining module, configured to obtain the operation authority of the penetration target by using the external vulnerability after the detection module detects the external vulnerability of the penetration target by using the website information; and a penetration module, configured to perform a penetration operation on the network system by using the operation authority.
It should be noted that the above modules may be implemented by software or hardware. For the latter, this may be achieved in, but is not limited to, the following ways: the modules are all located in the same processor; or the modules are located in different processors in any combination.
Example 3
Embodiments of the present invention also provide a storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Optionally, in this embodiment, the storage medium may be configured to store a computer program for executing the following steps:
S1, determining a website domain name of a web website of a penetration target, wherein the penetration target is a network system connected through a network;
S2, performing directory scanning on the website domain name to obtain a directory structure of the web website;
S3, collecting the website information of the web website according to the directory structure.
Optionally, in this embodiment, the storage medium may include, but is not limited to, various media capable of storing computer programs, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk or an optical disk.
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
S1, determining a website domain name of a web website of a penetration target, wherein the penetration target is a network system connected through a network;
S2, performing directory scanning on the website domain name to obtain a directory structure of the web website;
S3, collecting the website information of the web website according to the directory structure.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present application, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implementing, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a hardware form, and can also be realized in a software functional unit form.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program code.
The foregoing is only a preferred embodiment of the present application. It should be noted that a person skilled in the art can make several modifications and improvements without departing from the principle of the present application, and these modifications and improvements should also be regarded as falling within the protection scope of the present application.

Claims (10)

1. A method for collecting website information, comprising:
determining a website domain name of a web website of a penetration target, wherein the penetration target is a network system connected through a network;
performing directory scanning on the website domain name to obtain a directory structure of the web website;
acquiring website information of the web website according to the directory structure;
and detecting an external vulnerability of the penetration target by using the website information.
2. The method of claim 1, wherein collecting website information for the web site according to the directory structure comprises:
collecting at least one of the following information of the web site according to the directory structure: webpage files, webpage interfaces and background page addresses.
3. The method of claim 1, wherein performing a directory scan on the website domain name to obtain a directory structure of the web website comprises:
taking the website domain name as a root node, and scanning the website sub domain names of each level of sub nodes downwards in sequence until reaching leaf nodes;
and setting corresponding domain name addresses in the root node and each level of nodes to obtain a directory structure of the web website.
4. The method of claim 1, wherein collecting website information for the web site according to the directory structure comprises:
sending an access request to the web website according to the directory structure;
when the web website returns a status code indicating that the page does not exist, collecting website information of the web website through a hypertext transfer protocol (HTTP) request; and when the web site normally responds to the access request, determining the web page content fed back by the web site as the site information of the web site.
5. The method of claim 4, wherein collecting website information for the web site via the HTTP request comprises:
sending a HEAD request to the web site according to the directory structure;
and receiving HTTP header information of the web website fed back by the web website based on the HEAD request.
6. The method of claim 4, wherein collecting website information for the web website according to the directory structure comprises:
sending a GET request to the web website according to the directory structure;
and receiving HTTP header information and presentation data of the web website fed back by the web website based on the GET request.
7. The method of claim 1, wherein after detecting an external vulnerability of the penetration target using the website information, the method further comprises:
obtaining the operation authority of the penetration target by using the external vulnerability;
and executing the penetration operation on the network system by using the operation authority.
8. An apparatus for collecting website information, comprising:
a determining module, configured to determine a website domain name of a web website of a penetration target, wherein the penetration target is a network system connected through a network;
a scanning module, configured to perform directory scanning on the website domain name to obtain a directory structure of the web website;
an acquisition module, configured to collect the website information of the web website according to the directory structure;
and a detection module, configured to detect an external vulnerability of the penetration target by using the website information.
9. A storage medium, in which a computer program is stored, wherein the computer program is arranged to perform the method of any of claims 1 to 7 when executed.
10. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, and wherein the processor is arranged to execute the computer program to perform the method of any of claims 1 to 7.
CN201910750206.9A 2019-08-14 2019-08-14 Method and device for collecting website information, storage medium and electronic device Pending CN110765333A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910750206.9A CN110765333A (en) 2019-08-14 2019-08-14 Method and device for collecting website information, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN110765333A true CN110765333A (en) 2020-02-07

Family

ID=69329748

Country Status (1)

Country Link
CN (1) CN110765333A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
Inventor after: Gong Yushan
Inventor after: Tian Yue
Inventor before: Tian Yue
RJ01 Rejection of invention patent application after publication
Application publication date: 20200207