WO2018107784A1 - Method and apparatus for detecting a webpage backdoor - Google Patents
- Publication number
- WO2018107784A1 (application PCT/CN2017/096502)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- url
- access request
- webpage
- address
- entry
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
- H04L63/1483—Countermeasures against malicious traffic service impersonation, e.g. phishing, pharming or web spoofing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0227—Filtering policies
- H04L63/0236—Filtering by address, protocol, port number or service, e.g. IP-address or URL
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/20—Network architectures or network communication protocols for network security for managing network security; network security policies in general
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/22—Parsing or analysis of headers
Definitions
- The present invention relates to the field of network security technologies, and in particular to a method and an apparatus for detecting a webpage backdoor.
- A webshell is a backdoor tool that exists in the form of a webpage file. Through a webshell, an attacker can obtain operation rights on a website, such as uploading and downloading files, viewing the database, and executing script commands.
- A webshell file can be a webpage file written as an Active Server Page (ASP), a webpage file written in the Hypertext Preprocessor (PHP) language, or a Common Gateway Interface (CGI) program file.
- A host in the network that provides web services and opens the ports related to those web services is referred to as a website server or web server.
- Web servers tend to be the target of webshell attacks. After an attacker exploits a vulnerability of an open port and invades the web server, the attacker stores a webshell file in the webpage directory of the web server, where it is mixed with normal webpage files. Thereafter, the attacker can access the webshell file stored on the web server through a browser to obtain operation authority over the web server, and thereby control the web server and steal information. The data exchanged between the attacker and the attacked website server is usually transmitted through port 80, the default port of the web service. So as not to affect the normal web access behavior of network users, a firewall does not normally block HyperText Transfer Protocol (HTTP) traffic on port 80, so simple packet filtering cannot prevent the above attacks.
- The prior art builds a webshell feature library by manually analyzing the code of webshell files, or by analyzing the traffic generated when an attacker accesses a webshell file, to obtain the features of the webshell. After a security device obtains web traffic, it matches the web traffic against the features in the webshell feature library in order to detect webshells.
- With this approach, the security device consumes a large amount of processing resources, and the detection efficiency is low.
- The embodiments of the present application provide a method for detecting a webpage backdoor, which is used to alleviate the low detection efficiency of the prior art.
- The first aspect provides a method for detecting a webpage backdoor, including: acquiring first web traffic of a protected host, where the first web traffic refers to the traffic that occurs when a webpage provided by the protected host is accessed in a first time period; generating a webpage access record of the protected host according to the first web traffic, where the webpage access record is used to save at least one Uniform Resource Locator (URL), the IP addresses that accessed each URL in the at least one URL, and the total number of times each URL was accessed, and where each URL identifies a webpage provided by the protected host; determining, according to the webpage access record, a suspicious URL from the at least one URL, where the total number of times the suspicious URL was accessed is less than a first threshold, and the ratio of the number of mutually different IP addresses that accessed the suspicious URL to the total number of times the suspicious URL was accessed is less than a second threshold; and determining whether the webpage identified by the suspicious URL contains a backdoor feature from a webpage backdoor feature library, thereby detecting whether that webpage has a webpage backdoor.
- The embodiments of the present application build, from web traffic of the protected host that has already occurred, a webpage access record that reflects the number of times each webpage on the protected host was accessed, the IP distribution of its visitors, and the like. Based on the webpage access record, URLs with a higher degree of suspiciousness are identified among the URLs of the webpages provided by the protected host, and subsequent detection focuses on the webpages identified by the suspicious URLs, without performing webpage backdoor detection on all webpages.
- The above method reduces the number of webpages that need webpage backdoor detection, thereby improving detection performance.
- The application also provides a first specific structure of the webpage access record, and detailed steps for constructing it.
- With a webpage access record of this structure, suspicious URLs can be determined quickly and easily.
- The webpage access record includes at least one entry, and each of the at least one entry corresponds to one of the at least one URL. Each entry stores a total number of accesses and an IP address list;
- generating the webpage access record of the protected host according to the first web traffic includes:
- if an entry corresponding to the URL carried in the selected access request packet is found, the total number of accesses of the found entry is increased by one, and the source IP address is recorded in the IP address list of the found entry;
- if no such entry is found, an entry corresponding to the URL carried in the selected access request packet is created in the webpage access record, the total number of accesses of the created entry is set to 1, and the source IP address is recorded in the IP address list of the created entry.
- Determining, according to the webpage access record, a suspicious URL from the at least one URL includes:
- if the total number of accesses of a selected entry is less than the first threshold, and the ratio of the number of mutually different IP addresses in its IP address list to its total number of accesses is less than the second threshold, the URL corresponding to the selected entry is determined to be a suspicious URL.
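The first structure described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `Entry`, `build_record` and `suspicious_urls` are hypothetical, access requests are assumed to be already parsed into `(url, source_ip)` pairs, and the IP address list is assumed to keep one item per request, so distinct addresses are counted only at determination time.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One entry of the first structure: access total plus IP address list."""
    count_visit: int = 0
    ip_list: list = field(default_factory=list)


def build_record(requests):
    """Build the access record from (url, source_ip) pairs (assumed input shape)."""
    record = {}
    for url, src_ip in requests:
        entry = record.get(url)
        if entry is None:
            # No entry found: create one with a total of 1 and the source IP.
            record[url] = Entry(count_visit=1, ip_list=[src_ip])
        else:
            # Entry found: increase the total by one and record the source IP.
            entry.count_visit += 1
            entry.ip_list.append(src_ip)
    return record


def suspicious_urls(record, first_threshold, second_threshold):
    """Return URLs that are rarely visited and visited from few distinct IPs."""
    result = []
    for url, entry in record.items():
        distinct_ips = len(set(entry.ip_list))
        if (entry.count_visit < first_threshold
                and distinct_ips / entry.count_visit < second_threshold):
            result.append(url)
    return result
```

For instance, a page requested four times from a single address has a ratio of 1/4 and is flagged under a second threshold of 0.5, while a page requested three times from three addresses has a ratio of 1 and is not.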
- The application also provides a second specific structure of the webpage access record, and detailed steps for constructing it.
- The second specific structure adds an IP address count value to the entry of the first specific structure, and suspicious URLs can be determined quickly with a webpage access record of this structure.
- The webpage access record includes at least one entry, and each of the at least one entry corresponds to one of the at least one URL. Each entry contains a total number of accesses, an IP address count value, and an IP address list;
- generating the webpage access record of the protected host according to the first web traffic includes:
- if an entry corresponding to the URL carried in the selected access request packet is found, the total number of accesses of the found entry is increased by one, and it is determined whether the source IP address is already saved in the IP address list of the found entry. If the source IP address is saved in the IP address list of the found entry, processing of the selected access request packet ends; if the source IP address is not saved in the IP address list of the found entry, the IP address count value of the found entry is increased by one, and the source IP address is recorded in the IP address list of the found entry;
- if no such entry is found, an entry corresponding to the URL carried in the selected access request packet is created in the webpage access record, the total number of accesses of the created entry is set to 1, the IP address count value of the created entry is set to 1, and the source IP address is recorded in the IP address list of the created entry.
- Determining, according to the webpage access record, a suspicious URL from the at least one URL includes:
- if the total number of accesses of a selected entry is less than the first threshold, and the ratio of its IP address count value to its total number of accesses is less than the second threshold, the URL corresponding to the selected entry is determined to be a suspicious URL.
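A short sketch of the second structure may make the difference from the first one concrete. The dictionary keys `count_visit`, `count_ip` and `ip_list` and the function names are assumptions made for illustration; the point shown is that the IP address count value is maintained on every update, so the later determination needs no de-duplication.

```python
def update_entry(record, url, src_ip):
    """Apply one access request (url, source IP) to the second structure."""
    entry = record.get(url)
    if entry is None:
        # No entry found: both counters start at 1.
        record[url] = {"count_visit": 1, "count_ip": 1, "ip_list": [src_ip]}
        return
    entry["count_visit"] += 1
    if src_ip not in entry["ip_list"]:
        # First request from this address: count it and remember it.
        entry["count_ip"] += 1
        entry["ip_list"].append(src_ip)


def is_suspicious(entry, first_threshold, second_threshold):
    """Check both conditions using the stored counters only."""
    return (entry["count_visit"] < first_threshold
            and entry["count_ip"] / entry["count_visit"] < second_threshold)
```

Compared with the first structure, the list here holds each address at most once, which trades a membership check per request for a cheaper determination step.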
- When a terminal accesses a webpage through a browser, the access may not succeed. Recording entries for pages whose access failed occupies storage space, and subsequently detecting those pages wastes processing resources. To save storage and processing resources, a possible implementation records only the entries corresponding to successfully accessed pages, as follows.
- Obtaining the at least one access request message from the first web traffic includes:
- A terminal accesses the webpage provided by the protected host through an installed browser. Because browser vendors and browser versions differ, access requests sent by different browsers for the same webpage provided by the web server may carry different URLs. If the security device generated separate entries for these different URLs, the record would, on the one hand, not match the fact that the access request messages actually access the same webpage, which would skew subsequent suspicious URL identification, and, on the other hand, would make the webpage access record too large. To improve the accuracy of suspicious URL identification and save the storage space that the webpage access record occupies in memory, the security device may first normalize the URL in the access request packet when generating an entry in the webpage access record, and generate the entry according to the normalized URL. The details are as follows.
- Searching the webpage access record for an entry corresponding to the URL carried in the selected access request packet includes:
- performing at least one normalization process on the URL carried in the selected access request message to obtain a normalized URL, where the normalization process includes one or more of the following (1) to (3): (1) converting the URL carried in the selected access request message into a predetermined encoding format; (2) converting the characters in the URL carried in the selected access request message into a predetermined letter case; and (3) removing the parameters from the URL carried in the selected access request message;
- Creating, in the webpage access record, the entry corresponding to the URL carried in the access request packet is specifically:
- creating, in the webpage access record, an entry corresponding to the normalized URL.
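The three normalization processes can be illustrated with Python's standard `urllib.parse` module. The concrete choices here are assumptions, since the text only says "predetermined": percent-decoding to UTF-8 stands in for the predetermined encoding format, lower case for the predetermined letter case, and dropping the query string and fragment for parameter removal. The function name `normalize_url` is hypothetical.

```python
from urllib.parse import unquote, urlsplit


def normalize_url(url):
    """Normalize a request URL before it is used as an entry key."""
    # (3) Remove parameters: keep only the path, dropping query and fragment.
    # Splitting happens before decoding so an encoded '?' is not mistaken
    # for the start of the query string.
    path = urlsplit(url).path
    # (1) Convert to one encoding: decode percent-escapes as UTF-8 (assumed).
    path = unquote(path, encoding="utf-8", errors="replace")
    # (2) Convert to one letter case: lower case (assumed).
    return path.lower()
```

With these choices, `/Admin/Login.PHP?id=1` and `/admin/%6Cogin.php` both map to the same key, `/admin/login.php`, so they update a single entry.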
- The information recorded in the webpage access record may be further streamlined, and some information that is not needed for identifying suspicious URLs may be deleted.
- For example, a normal URL can be deleted, and the total number of accesses and the IP addresses in the entry corresponding to the normal URL are no longer maintained, which saves storage resources and the processing resources otherwise spent on subsequently updating the entry. That is, in a seventh possible implementation manner of the first aspect, the method further includes:
- determining a normal URL from the at least one URL, where the normal URL is a URL of the at least one URL whose total number of accesses is greater than the first threshold, or a suspicious URL for which the webpage backdoor detection result indicates that the identified webpage has no webpage backdoor.
- The method further includes:
- acquiring second web traffic of the protected host, where the second web traffic refers to the traffic that occurs when a webpage provided by the protected host is accessed in a second time period after the first time period;
- parsing a first access request packet to obtain the source IP address of the first access request packet and the URL it carries; if the URL carried in the first access request packet is different from the normal URL and is already saved in the webpage access record, the total number of accesses of that URL is increased by one, and the source IP address of the first access request packet is added to the IP addresses that accessed that URL;
- parsing a second access request packet to obtain the source IP address of the second access request packet and the URL it carries; if the URL carried in the second access request packet is different from the normal URL and is not saved in the webpage access record, the URL carried in the second access request packet is saved in the webpage access record, and the IP address that accessed that URL is recorded as the source IP address of the second access request packet.
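The streamlining step can be sketched as two small functions. This is an illustrative reading under stated assumptions: entries are plain dictionaries with hypothetical keys `count_visit` and `ip_list`, the set `cleared_urls` stands for suspicious URLs whose detection found no backdoor, and normal URLs are kept in a separate set so that later requests for them are skipped.

```python
def prune_normal_urls(record, first_threshold, cleared_urls):
    """Delete entries of normal URLs and return the set of normal URLs."""
    normal = {
        url for url, entry in record.items()
        if entry["count_visit"] > first_threshold or url in cleared_urls
    }
    for url in normal:
        del record[url]  # counters and IP list are no longer maintained
    return normal


def update_record(record, normal_urls, url, src_ip):
    """Apply one request from the second web traffic to the record."""
    if url in normal_urls:
        return  # same as a normal URL: nothing is updated
    entry = record.get(url)
    if entry is not None:
        # URL already saved: increase the total and add the source IP.
        entry["count_visit"] += 1
        entry["ip_list"].append(src_ip)
    else:
        # URL not saved yet: save it together with the source IP.
        record[url] = {"count_visit": 1, "ip_list": [src_ip]}
```

Keeping only a set of normal URLs rather than their full entries is one way to realize the storage saving described above; the source does not specify how normal URLs are remembered.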
- A second aspect provides an apparatus for detecting a webpage backdoor, which has the function of implementing the method of the first aspect or any one of the possible implementations of the above aspect.
- The functions may be implemented by hardware, or by hardware executing corresponding software.
- The hardware or software includes one or more modules corresponding to the functions described above.
- The embodiments of the present application also provide a computer storage medium for storing computer software instructions used by the packet forwarding device, including instructions for performing the first aspect or any possible implementation of the foregoing aspects.
- FIG. 1 is a schematic diagram of an application scenario of a method for detecting a back door of a webpage according to an embodiment of the present application.
- FIG. 2 is a schematic structural diagram of a security device according to an embodiment of the present application.
- FIG. 3 is a flowchart of a method for detecting a back door of a webpage according to an embodiment of the present application.
- FIG. 4 is a schematic structural diagram of a hash table according to an embodiment of the present application.
- FIG. 5 is a flowchart of a method for constructing a webpage access record according to a first web traffic according to an embodiment of the present application.
- FIG. 6 is a diagram showing an example of an entry provided by an embodiment of the present application.
- FIG. 7 is a schematic structural diagram of another hash table according to an embodiment of the present application.
- FIG. 8 is another flowchart of a method for detecting a back door of a webpage according to an embodiment of the present application.
- FIG. 9 is a schematic diagram of a webpage access record before a security device processes three access request messages according to an embodiment of the present application.
- FIG. 10 is a schematic diagram of a webpage access record after a security device processes three access request messages according to an embodiment of the present application.
- FIG. 11 is a schematic structural diagram of an apparatus for detecting a back door of a webpage according to an embodiment of the present application.
- The series of messages exchanged between the browser and the web server when a terminal uses a browser to access a webpage is called web traffic.
- On the one hand, web servers often store millions of webpage files; on the other hand, end users frequently perform webpage access activities, so web traffic grows rapidly.
- Existing security devices, such as firewalls and deep packet inspection (DPI) devices, are subject to performance constraints and have difficulty detecting, one by one, all the webpage data carried by the received web traffic.
- The embodiments of the present application provide a method for detecting a webpage backdoor.
- The method builds, from web traffic of the protected host that has already occurred, a webpage access record that reflects the number of times each webpage on the protected host was accessed, the IP distribution of its visitors, and the like. Based on the webpage access record, URLs with a higher degree of suspiciousness are identified among the Uniform Resource Locators (URLs) of all webpages provided by the protected host, and subsequent detection focuses on the webpages identified by the suspicious URLs, without performing webpage backdoor detection on all webpages.
- The above method reduces the number of webpages to be detected, thereby improving detection performance.
- FIG. 1 is a schematic diagram of an application scenario of an embodiment of the present application.
- The network system includes a website server 11, a security device 12, and a plurality of terminals 13.
- The website server 11 is an example of a protected host.
- A protected host is a host capable of providing a webpage service.
- IIS Internet Information Services
- The terminal 13 is a terminal device with a webpage access function, such as a personal computer, smartphone, or portable computer with a browser installed.
- A browser is an application for retrieving and presenting Internet information resources. Commonly used browsers include Internet Explorer, Mozilla Firefox, and Google Chrome.
- The terminal 13 can be located in a local area network and access the website server 11 in the Internet through a Network Address Translation (NAT) device.
- The terminal 13 can also access the website server 11 in the Internet directly through a public IP address.
- The security device 12 acquires the web traffic generated when the terminal 13 accesses the website server 11.
- The security device 12 is disposed on the communication path between the terminal 13 and the website server 11, and traffic accessing the website server 11 is forwarded to the website server 11 via the security device 12.
- For example, the security device 12 is a firewall disposed in front of the website server 11, and the website server 11 accesses the network through the firewall.
- The security device 12 retains the web traffic that flows through it to access the website server 11.
- The security device 12 can also be deployed in a bypass manner (not shown in FIG. 1). For example, the website server 11 accesses the network through a gateway device 14, and the security device 12 is a DPI device connected to the gateway device 14.
- The gateway device 14 mirrors the traffic from the terminal 13 to the website server 11 and sends the resulting mirrored traffic to the DPI device.
- The embodiments of the present application do not limit the specific deployment manner of the security device 12, as long as the security device 12 can obtain the web traffic of the terminal 13 accessing the website server 11.
- The security device 12 can participate in the traffic forwarding process of other network devices.
- The IP addresses of one or more protected hosts may be pre-stored in the security device 12.
- According to the pre-stored IP address of a protected host, combined with the protocol type of web access, such as HTTP, the security device 12 filters out the traffic that occurs when a webpage provided by the protected host is accessed.
- The webpages provided by multiple protected hosts can each be detected by the method provided in the embodiments of the present application.
- The embodiments of the present application are mainly described using a website server as the protected host; multiple protected hosts can be processed in a similar way.
- FIG. 2 is a schematic structural diagram of a security device provided by an embodiment of the present application.
- The security device can be the security device 12 of FIG. 1.
- The security device includes a processor 210, a memory 220, a network interface 230, an input device 240, a display 250, and a bus 260.
- The processor 210, the memory 220, the network interface 230, the input device 240, and the display 250 are connected to one another via the bus 260.
- The processor 210 may be one or more central processing units (CPUs). Where the processor 210 is a single CPU, the CPU may be a single-core CPU or a multi-core CPU.
- The memory 220 includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), or compact disc read-only memory (CD-ROM).
- The network interface 230 can be a wired interface, such as a Fiber Distributed Data Interface (FDDI) interface or a Gigabit Ethernet (GE) interface; the network interface 230 can also be a wireless interface.
- The processor 210 is configured to read the program code 222 stored in the memory 220 and, after running it, perform the following operations.
- The processor 210 acquires the first web traffic of the protected host through the network interface 230, where the first web traffic of the protected host refers to the traffic that occurs when a webpage provided by the protected host is accessed during the first time period.
- To distinguish the web traffic obtained in different stages, the web traffic from which the webpage access record is generated is referred to as the first web traffic.
- Web traffic received after the webpage access record is generated is referred to as the second web traffic.
- The second web traffic can be used to update the webpage access record.
- The processor 210 generates a webpage access record 221 of the protected host according to the first web traffic, where the webpage access record stores at least one URL, the IP addresses that accessed each URL in the at least one URL, and the total number of times each URL was accessed, and where each URL identifies a webpage provided by the protected host.
- The processor 210 stores the generated webpage access record 221 in the memory 220.
- The processor 210 determines, according to the webpage access record, a suspicious URL from the at least one URL, where the total number of times the suspicious URL was accessed is less than a first threshold, and the ratio of the number of mutually different IP addresses that accessed the suspicious URL to the total number of times the suspicious URL was accessed is less than a second threshold.
- The processor 210 detects, according to the webpage backdoor feature library in the memory 220, whether the webpage identified by the suspicious URL has a webpage backdoor.
- The security device constructs a webpage access record that reflects the number of times each webpage on the protected host was accessed, the IP distribution of its visitors, and the like, and identifies URLs with a higher degree of suspiciousness among the URLs of all webpages provided by the protected host. Subsequent detection focuses on the webpages identified by the suspicious URLs rather than on all webpages. Reducing the number of webpages to be detected improves detection performance.
- FIG. 3 is a schematic flowchart of a method for detecting a webpage backdoor according to an embodiment of the present application. The method can be performed by the security device 12 of FIG. 1.
- Step 31: Acquire first web traffic of the protected host, where the first web traffic refers to the traffic that occurs when a webpage provided by the protected host is accessed during the first time period.
- The IP address of the protected host is pre-stored in the security device.
- The security device compares the source address or destination address of each packet passing through it with the IP address of the protected host. If the source address or destination address of the packet is the same as the IP address of the protected host and the protocol type is HTTP, the packet is saved, so as to obtain the first web traffic of the protected host.
- Alternatively, the security device compares the source address or destination address of each packet in the mirrored traffic sent by the gateway device with the IP address of the protected host. If the source or destination address of the packet is the same as the IP address of the protected host and the protocol type is HTTP, the packet is saved. If the source or destination address of the packet is different from the IP address of the protected host, or the protocol type is unrelated to web access, the packet is deleted to save storage space.
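The selection rule in Step 31 can be sketched as a simple filter. The `Packet` type and its fields are hypothetical stand-ins for whatever a real capture path provides, and the protocol is assumed to be already classified as a string; only the comparison logic follows the text.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Simplified packet view (assumed fields, for illustration only)."""
    src_ip: str
    dst_ip: str
    protocol: str


def select_web_traffic(packets, protected_ips):
    """Keep HTTP packets whose source or destination is a protected host."""
    return [
        p for p in packets
        if p.protocol == "HTTP"
        and (p.src_ip in protected_ips or p.dst_ip in protected_ips)
    ]
```

Packets that fail either condition are simply not kept, which corresponds to deleting them to save storage space.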
- Step 32 Generate a webpage access record of the protected host according to the first web traffic.
- the web page access record is used to store the following information: at least one URL, an IP address accessing each of the at least one URL, and a total number of times the URL is accessed.
- Each of the URLs identifies a webpage provided by the protected host.
- the webpage access record includes a plurality of entries, and each entry corresponds to one of the at least one URL.
- Each entry not only saves the corresponding URL, but also saves the total number of times the URL corresponding to the entry is accessed, and the IP address of the URL corresponding to the entry.
- the security device can organize multiple entries in the webpage access record by using a variety of different data structures, such as multidimensional arrays, hash tables, and the like.
- the embodiment of the present application provides a hash table to save the foregoing webpage access record.
- a hash bucket is specifically used to implement the hash table.
- the IP address of each protected host corresponds to a hash bucket table.
- the IP address of each protected host is represented by 41
- the hash bucket table is represented by 42
- the hash bucket table 42 corresponding to each address 41 includes 256 hash buckets.
- Each of the hash buckets in the hash bucket table 42 is a virtual subgroup of entries in the hash table.
- Each hash bucket corresponds to a linked list of unequal lengths consisting of entries.
- the linked list is indicated by 43, and the entry is indicated by 44.
- Zero, one or more entries 44 are stored in the linked list 43.
- Each entry includes an index key and a value.
- the index key of each entry is the result of hashing the URL.
- The value holds the URL itself, the total number of accesses CountVisit, which records how many times the URL has been accessed, and the IP addresses that accessed the URL.
- the hash algorithm includes Message-Digest Algorithm 5 (MD5).
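One way to picture this layout is the sketch below. It is an assumption-laden illustration: the `Entry` class, `url_key`, and `new_bucket_table` are hypothetical names, and ordinary Python lists stand in for the linked lists 43.

```python
import hashlib
from dataclasses import dataclass, field

NUM_BUCKETS = 256  # each protected host IP has a table of 256 hash buckets


@dataclass
class Entry:
    """One entry 44: index key plus the saved value fields."""
    key: str                      # MD5 hex digest of the URL
    url: str                      # the URL itself
    count_visit: int = 0          # CountVisit: total number of accesses
    ip_list: list = field(default_factory=list)  # IP addresses that accessed the URL


def url_key(url: str) -> str:
    """Index key: MD5 of the URL, as 32 hexadecimal characters."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


def new_bucket_table():
    """Hash bucket table 42: 256 initially empty buckets."""
    return [[] for _ in range(NUM_BUCKETS)]


# Webpage access record: protected host IP (41) -> hash bucket table (42).
record = {"10.1.1.34": new_bucket_table()}
```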
- Step 33 Determine, according to the webpage access record, a suspicious URL from the at least one URL, where the total number of times the suspicious URL is accessed is less than a first threshold, and the ratio of the number of mutually different IP addresses accessing the suspicious URL to the total number of times the suspicious URL is accessed is less than a second threshold.
- The first threshold and the second threshold are pre-stored in the security device. They may be set by the network administrator according to experience and the actual network environment and entered into the security device through the input device 240 in FIG. 2, or they may be obtained by machine learning from a pre-calibrated web traffic sample; this embodiment does not limit how they are obtained.
- The security device periodically evaluates the information stored in the entries of the hash table shown in FIG. 4 against the first threshold and the second threshold, thereby identifying the suspicious URL.
- The first threshold is a natural number, and its range can be set according to experience, memory storage space, and the discriminating period. The longer the discriminating period and the larger the storage space, the more the first threshold can be increased, yielding a more accurate recognition effect.
- The specific value can be set flexibly according to the actual situation. For example, the discriminating period is 10 days and the first threshold is 1000.
- The second threshold is a percentage between 0 and 1.
- The value of the second threshold can also be set according to experience and the actual network environment. The smaller the second threshold, the lower the false positive rate of the identified suspicious URLs, at the cost of some false negatives. The larger the second threshold, the higher the false positive rate of the identified suspicious URLs and the lower the false negative rate. For example, the second threshold can be set to 50%.
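The two-threshold test of step 33 can be sketched as below. The function name is hypothetical; the default values 1000 and 0.5 simply reuse the example values given above and are not recommendations.

```python
def is_suspicious(count_visit, ip_list, first_threshold=1000, second_threshold=0.5):
    """Return True if a URL looks suspicious under the two thresholds.

    count_visit: total number of times the URL was accessed
    ip_list:     source IP addresses recorded for the URL (may repeat)
    """
    if count_visit <= 0:
        return False  # nothing recorded, nothing to judge
    distinct_ips = len(set(ip_list))
    rarely_visited = count_visit < first_threshold
    few_visitors = distinct_ips / count_visit < second_threshold
    return rarely_visited and few_visitors
```

A URL fetched ten times by a single address gives a ratio of 0.1 and is flagged, whereas a URL fetched ten times by ten different addresses gives a ratio of 1.0 and is not.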
- Step 34 Determine whether the webpage identified by the suspicious URL contains a backdoor feature from the webpage backdoor feature database, and detect, according to the determination result, whether the webpage identified by the suspicious URL has a webpage backdoor.
- The browser first establishes a connection with the web server through the Transmission Control Protocol (TCP). Then an access request message, such as an HTTP GET request message or an HTTP POST request message, is sent to the website server through the established connection.
- the access request message carries the URL of the page to be accessed.
- After receiving the access request message, the website server searches the webpage directory for the corresponding webpage file according to the URL carried in the access request message.
- The web server sends an access response message, such as an HTTP response message, to the browser according to the search result.
- the access response message carries a status code.
- the HTTP 1.1 version defines five types of status codes.
- the status code is composed of three digits. The first number defines the category of the response, specifically
- the web server sends the webpage file to the browser through one or more response messages according to the size of the data volume of the found webpage file.
- The security device may further obtain the messages exchanged between the browser and the web server when the webpage identified by the suspicious URL is accessed. The security device can then detect, according to a packet-based detection mode and a data-flow-based detection mode and using the webpage backdoor feature database, whether the webpage carried in the exchanged packets contains a webpage backdoor.
- The security device may obtain, in the following manners, the messages exchanged between the browser and the website server when the webpage identified by the suspicious URL is accessed.
- The security device searches the saved first web traffic of the protected host for the messages generated when a terminal accessed the webpage identified by the suspicious URL. For example, the security device parses an access request packet in the first web traffic according to the relevant standard of the HTTP protocol, so that the information carried in the access request packet is:
- The URL that the security device obtains from the access request message, which follows the GET keyword, is www.google.com.hk/videohp.
- The security device compares the obtained URL with the suspicious URL. If the URL carried by the access request packet matches the suspicious URL, the security device uses the source address, destination address, source port, destination port, protocol type, sequence number, timestamp, and the like of the access request packet to obtain, from the first web traffic, all packets of the data flow to which the access request packet belongs. The obtained packets are the messages exchanged between the browser and the web server when the webpage identified by the suspicious URL was accessed.
- The security device accesses the page identified by the suspicious URL through a browser installed on the security device and saves the series of messages exchanged with the web server in the process, thereby obtaining the messages exchanged between the browser and the web server when the webpage identified by the suspicious URL is accessed.
- Each message exchanged between the browser and the web server is matched against the features in the webpage backdoor feature library. If the matched features satisfy a preset rule, for example the number of matched features exceeds a predetermined number, the webpage identified by the suspicious URL is confirmed to have a webpage backdoor.
- A multi-pattern matching state machine may be generated from the features in the webpage backdoor feature database. The content of a single packet is input into the state machine, and all features matched by the packet can be found in one scan, thereby improving detection performance.
- After the security device obtains the messages exchanged between the browser and the web server when the webpage identified by the suspicious URL is accessed, it reassembles the packets to obtain the payload content of the data stream, matches the payload content against the features in the webpage backdoor feature library, and detects, according to the matching result and a predetermined webpage backdoor identification rule, whether the webpage identified by the suspicious URL has a webpage backdoor.
- The predetermined webpage backdoor identification rule includes: if features A, B, and C appear successively among the matched features, confirming that the webpage identified by the suspicious URL has a webpage backdoor; or, if more than three features are matched, confirming that the webpage identified by the suspicious URL has a webpage backdoor.
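The identification rule above can be sketched as follows, under several assumptions. Plain substring search stands in for the multi-pattern state machine, "appear successively" is read as "appear in that order, not necessarily adjacent", and the feature names and byte patterns are hypothetical placeholders rather than entries of any real feature library.

```python
def match_features(payload: bytes, features: dict) -> list:
    """Return names of features found in payload, ordered by first offset.

    A naive per-pattern scan; a multi-pattern automaton would find the same
    hits in a single pass over the payload.
    """
    hits = []
    for name, pattern in features.items():
        pos = payload.find(pattern)
        if pos != -1:
            hits.append((pos, name))
    return [name for _, name in sorted(hits)]


def has_backdoor(hits, ordered_rule=("A", "B", "C"), max_hits=3) -> bool:
    """Apply the two example rules: ordered A, B, C, or more than max_hits hits."""
    remaining = iter(hits)
    # Each `in` consumes the iterator up to the match, so order is enforced.
    in_order = all(name in remaining for name in ordered_rule)
    return in_order or len(hits) > max_hits


# Hypothetical feature library used only for illustration.
FEATURES = {"A": b"eval(", "B": b"base64_decode", "C": b"$_POST"}
```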
- FIG. 5 is a flowchart of a method for constructing a webpage access record according to a first web traffic provided by an embodiment of the present application.
- Step 51 The security device performs protocol parsing on the first web traffic to obtain at least one access request packet in the first web traffic.
- the access request message refers to an HTTP request GET message sent by the browser to the website server.
- the destination IP address of the HTTP request GET packet is the IP address of the protected host.
- The security device performs steps 52 to 510 for each access request message in the at least one access request message until all access request messages are processed. Specifically, the security device may select an access request message from the at least one access request message according to a preset selection rule, for example in chronological order of the timestamps carried in the access request messages.
- Steps 52 to 510 take an access request packet as an example to describe the processing procedure in detail.
- Step 52 The security device obtains the destination IP address, the source address, and the carried URL of the access request packet through protocol parsing.
- Step 53 The security device searches for a record corresponding to the destination IP address in the webpage access record according to the destination IP address. That is, it is determined whether the destination IP address and the hash bucket table corresponding to the destination IP address have been recorded in the webpage access record. If the destination address is not recorded in the webpage access record, step 54 is performed; if the destination address is already recorded in the webpage access record, step 55 is performed.
- Step 54 The security device records the destination IP address and creates a hash bucket table corresponding to the destination IP address. Then step 56 is performed.
- The security device records the destination IP address in the webpage access record and creates, for the destination IP address, a hash bucket table containing 256 hash buckets. Initially, the linked list of each hash bucket in the hash bucket table is empty.
- Step 56 The security device applies a predetermined hash bucket hashing algorithm to the URL carried in the access request packet and determines the hash bucket to which the URL belongs. Then step 57 is performed.
- Step 57 The security device creates an entry in the determined hash bucket.
- the index key of the created entry is the result of hashing the URL carried in the access request packet, and the URL is recorded in the created entry.
- the total number of accesses saved in the created entry is set to 1, and the source address parsed in step 52 is recorded in the IP address list of the entry.
- Step 55 The security device applies a predetermined hash bucket hashing algorithm to the URL carried in the access request packet and determines the hash bucket to which the URL belongs. Then step 58 is performed.
- Step 58 The security device searches for the entry corresponding to the URL in the linked list corresponding to the determined hash bucket.
- The security device performs a hash operation on the URL and searches the linked list corresponding to the hash bucket for an entry indexed by the operation result. If no entry is indexed by the operation result, step 59 is performed. If there is an entry indexed by the operation result, step 510 is performed.
- Step 59 The security device creates an entry indexed by the hash operation result, records the URL in the created entry, records the source address carried in the access request packet in the IP address list of the entry, and sets the total number of accesses in the created entry to 1.
- Step 510 The security device records the source address carried in the access request packet in the IP address list of the entry indexed by the hash operation result, and adds 1 to the total number of accesses saved in that entry.
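Steps 52 to 510 can be condensed into the sketch below. It is only an illustration: entries are plain dictionaries, Python lists stand in for linked lists, and `bucket_of` is a placeholder bucket selector (digest value modulo 256) assumed here for brevity rather than the hash bucket hashing algorithm of the embodiment.

```python
import hashlib

NUM_BUCKETS = 256


def url_key(url: str) -> str:
    """Index key: MD5 hex digest of the URL."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


def bucket_of(key: str) -> int:
    """Placeholder bucket selection (assumption): digest value mod 256."""
    return int(key, 16) % NUM_BUCKETS


def process_request(record: dict, dst_ip: str, src_ip: str, url: str) -> dict:
    """Update the webpage access record with one parsed access request."""
    table = record.get(dst_ip)                    # step 53: look up host
    if table is None:                             # step 54: create bucket table
        table = record[dst_ip] = [[] for _ in range(NUM_BUCKETS)]
    key = url_key(url)                            # steps 55/56: pick the bucket
    bucket = table[bucket_of(key)]
    for entry in bucket:                          # step 58: search the list
        if entry["key"] == key:
            entry["ip_list"].append(src_ip)       # step 510: update the entry
            entry["count_visit"] += 1
            return entry
    entry = {"key": key, "url": url,              # steps 57/59: create the entry
             "count_visit": 1, "ip_list": [src_ip]}
    bucket.append(entry)
    return entry
```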
- the security device obtains an access request message carried in the first web traffic by protocol parsing.
- the destination IP address is 10.1.1.34
- the source address is 219.133.94.158
- the URL is www.google.com.hk/videohp.
- the destination address 10.1.1.34 is the same as the IP address of the protected host.
- The default hash algorithm in the security device is the MD5 algorithm with a 32-character output; that is, the input is a URL of arbitrary length and the output is a string of 32 hexadecimal symbols.
- the result of performing a hash operation on www.google.com.hk/videohp in this example is a356bf63af5c8b348032bba8b44eceda.
- the purpose of the hash bucket hash algorithm is to classify any hash result into one of the 256 hash buckets.
- The hash bucket hash algorithm divides the hash operation result into 16 groups of 2 hexadecimal symbols each and performs phase-and-match operations on the groups in sequence, finally obtaining two hexadecimal symbols. The value of these two symbols is then taken modulo 256, and the remainder is used as the sequence number of the hash bucket.
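The folding idea can be sketched as below. The combining operation is not clearly recoverable from the text above, so bitwise XOR is assumed here purely for illustration; with a different operation the resulting bucket number would differ, and this sketch is not claimed to reproduce the bucket number of the worked example. The function name is hypothetical.

```python
from functools import reduce
from operator import xor


def bucket_index(hex_digest: str, num_buckets: int = 256) -> int:
    """Fold a 32-character hex digest into a bucket number.

    The digest is split into 16 groups of 2 hex symbols; the groups are
    combined pairwise (XOR is an assumption) and the result is reduced
    modulo the number of buckets.
    """
    if len(hex_digest) != 32:
        raise ValueError("expected 32 hexadecimal characters")
    groups = [int(hex_digest[i:i + 2], 16) for i in range(0, 32, 2)]
    folded = reduce(xor, groups)   # one byte, i.e. two hex symbols
    return folded % num_buckets
```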
- The index key a356bf63af5c8b348032bba8b44eceda is looked up in hash bucket 163. In this example, it is assumed that the index key does not exist in hash bucket 163.
- The security device creates a new entry with the index key a356bf63af5c8b348032bba8b44eceda at the end of the linked list corresponding to hash bucket 163, or inserts it at a predetermined position of the linked list according to a predetermined rule. It records the source address 219.133.94.158 of the access request packet in the IP address list of the new entry and sets the total number of accesses in the new entry to 1.
- the entries created through the above processing are shown in Figure 6.
- In step 33 of FIG. 3, when determining whether the URL corresponding to each entry is a suspicious URL, the security device first obtains the IP address list IP List in the entry.
- From the IP List it determines the mutually different IP addresses and counts them. It then reads the total number of accesses CountVisit. If the value of CountVisit is less than the first threshold, and the ratio of the number of mutually different IP addresses to the value of CountVisit is less than the second threshold, the URL corresponding to the entry is determined to be a suspicious URL.
- The data structure of the entry 44 shown in FIG. 4 can also be improved by adding an IP address count value CountIP, which records the number of mutually different IP addresses that have accessed the URL.
- The method of constructing a webpage access record shown in FIG. 5 must then be adjusted accordingly. Specifically, in step 57 or step 59, if the entry corresponding to the URL carried in the access request packet is not found, that entry is created in the webpage access record, the total number of accesses of the created entry is set to 1, the IP address count value of the created entry is set to 1, and the source IP address of the access request packet is recorded in the IP address list of the created entry.
- In step 510, if the entry corresponding to the URL carried in the access request packet is found, the total number of accesses of the found entry is increased by 1.
- The security device then determines whether the source IP address of the access request packet is already saved in the IP address list of the found entry. If it is, the processing of the access request message ends. If it is not, the IP address count value of the found entry is incremented by 1 and the source IP address of the access request message is recorded in the IP address list of the found entry.
- In step 33 of FIG. 3, when determining whether the URL corresponding to each entry is a suspicious URL, it is then only necessary to read the total number of accesses CountVisit and the IP address count value CountIP. Specifically, if the value of CountVisit is less than the first threshold, and the ratio of the value of CountIP to the value of CountVisit is less than the second threshold, the URL corresponding to the entry is determined to be a suspicious URL.
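The CountIP variant can be sketched as follows. Entries are plain dictionaries and the function names are hypothetical; the point is that the distinct-address count is maintained incrementally, so the suspicious-URL check needs no pass over the IP address list.

```python
def new_entry(url: str, src_ip: str) -> dict:
    """Steps 57/59 with CountIP: first access to a URL."""
    return {"url": url, "count_visit": 1, "count_ip": 1, "ip_list": [src_ip]}


def update_entry(entry: dict, src_ip: str) -> None:
    """Step 510 with CountIP: a further access to a known URL."""
    entry["count_visit"] += 1
    if src_ip not in entry["ip_list"]:   # only new addresses change CountIP
        entry["count_ip"] += 1
        entry["ip_list"].append(src_ip)


def is_suspicious(entry: dict, first_threshold: int, second_threshold: float) -> bool:
    """Step 33 using only CountVisit and CountIP."""
    return (entry["count_visit"] < first_threshold
            and entry["count_ip"] / entry["count_visit"] < second_threshold)
```

With this variant the IP address list holds each address at most once, which also bounds its size by the number of distinct visitors.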
- The access process may not be successful.
- Detecting such failed-access pages serves no purpose, because the messages exchanged between the browser and the web server cannot be obtained for them in step 34 of FIG. 3.
- Processing resources may thus be wasted, and saving entries for the URLs of failed-access pages in the webpage access record wastes storage space. Therefore, when the webpage access record is constructed by the method shown in FIG. 5 to FIG. 7,
- obtaining the at least one access request message from the first web traffic in step 51 can be improved as follows.
- the security device first selects at least one access response message from the first web traffic, and the status code carried in each selected webpage access response message indicates that the access is successful.
- the access response message is a message returned by the web server to the browser after receiving the access request message. This application only considers the access response message whose source address is the IP address of the protected host.
- The content of an access response message for a successful access is as follows:
- the status code "200 OK" indicates that the access was successful.
- The security device determines the correspondence between each access request packet and each access response packet in the first web traffic according to the source address, source port, destination address, destination port, protocol type, sequence number, and acknowledgment number carried in each packet. It thereby obtains, from the first web traffic, the access request message corresponding to each access response message that indicates a successful access, and uses these as the obtained at least one access request message.
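The pairing of requests with successful responses might be sketched as below. This is a simplification under stated assumptions: the `Request` and `Response` classes are hypothetical, each request is assumed to fit in one TCP segment so that the response's acknowledgment number equals the request's sequence number plus its payload length, all packets are assumed to be HTTP over TCP, and any 2xx status code is treated as success.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    seq: int       # TCP sequence number of the request segment
    length: int    # payload length of the request segment
    url: str


@dataclass(frozen=True)
class Response:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    ack: int       # TCP acknowledgment number of the response segment
    status: int    # HTTP status code


def successful_requests(requests, responses, protected_ip):
    """Return the requests answered by a successful response from the host."""
    ok = {
        (r.dst_ip, r.dst_port, r.src_ip, r.src_port, r.ack)
        for r in responses
        if r.src_ip == protected_ip and 200 <= r.status < 300
    }
    return [
        q for q in requests
        if (q.src_ip, q.src_port, q.dst_ip, q.dst_port,
            (q.seq + q.length) % 2**32) in ok
    ]
```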
- the terminal may install a browser provided by different vendors or a different version of the browser.
- Owing to implementation differences, different browsers may carry different URLs in their access request messages when accessing the same webpage provided by the web server.
- For example, the URLs differ in capitalization, in encoding, or in the parameters they carry.
- If the security device treats these access request packets as carrying different URLs, it creates different entries in the webpage access record.
- On the one hand, this is inconsistent with the fact that the access request messages actually access the same webpage, which biases the subsequent suspicious URL identification; on the other hand, the webpage access record grows too large.
- To avoid this deviation and save the storage space occupied by the webpage access record in memory, optionally,
- in the process of constructing the webpage access record by the method shown in FIG. 5 to FIG. 7, before searching for the entry corresponding to the URL in the linked list of the determined hash bucket in step 58, the security device first
- performs at least one of the following normalization processes on the parsed URL.
- the parsed URL is converted into a predetermined encoding format.
- URLs may be encoded in GB2312, GBK, UTF8, and so on. In this example, all URLs are converted to GBK encoding.
- the URL 1 after removing the parameters is www.google.com.hk/videohp.
- the URL 2 after removing the parameters is www.google.com.hk/videohp.
- After normalization, URL 1 and URL 2 are identical and correspond to the same entry in the webpage access record, which effectively controls the size of the webpage access record and saves storage resources.
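The three normalization processes can be sketched together as follows. Several choices here are assumptions made for illustration: lower case is taken as the predetermined capitalization, parameters are taken to be everything from the first `?` onward, the source encoding is guessed by trying UTF-8 before GBK (a crude heuristic, since some GBK byte sequences also decode as UTF-8), and the query strings in the usage example are invented.

```python
def normalize_url(raw: bytes, target_encoding: str = "gbk") -> bytes:
    """Normalize a URL taken from an access request message."""
    # (1) Bring the URL into one predetermined encoding.
    for enc in ("utf-8", "gbk"):
        try:
            text = raw.decode(enc)
            break
        except UnicodeDecodeError:
            continue
    else:
        text = raw.decode("latin-1")   # last resort: never fails
    # (3) Remove the parameters carried after the path.
    text = text.split("?", 1)[0]
    # (2) Convert to one predetermined capitalization.
    text = text.lower()
    return text.encode(target_encoding, errors="replace")


# Two spellings of the same page collapse to a single key.
url_1 = normalize_url(b"WWW.Google.com.hk/videohp?hl=zh-CN")
url_2 = normalize_url(b"www.google.com.hk/VideoHP?tab=wv")
```

Lower-casing the path is a trade-off: it merges browser-specific spellings, but it would also merge genuinely distinct pages on a server with case-sensitive paths.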
- When the number of page files provided by the website server is large or growing, separately storing, with the data structure described above, the IP addresses accessing each of the at least one URL and the total number of times each URL is accessed consumes considerable storage resources.
- The security device therefore identifies normal URLs according to the first threshold or the webpage backdoor detection result, deletes the IP addresses accessing the normal URL and the total number of times the normal URL is accessed that are saved in the webpage access record, and subsequently no longer updates them, thereby saving storage resources and processing resources.
- Steps 31 to 34 in FIG. 8 are the same as in FIG. 3. After step 32, the method further includes:
- Step 35 The security device determines a normal URL, where the normal URL refers to a URL in the at least one URL whose total number of times visited is greater than a first threshold.
- After step 34, the method further includes:
- Step 36 The security device determines a normal URL, where the normal URL refers to a suspicious URL for which the webpage backdoor detection result indicates that the identified webpage has no webpage backdoor.
- Step 37 The security device deletes the IP addresses accessing the normal URL and the total number of times the normal URL is accessed that are saved in the webpage access record. It should be noted that either step 35 or step 36 may be performed, or both.
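Steps 35 to 37 can be sketched as below. For brevity, a flat dictionary keyed by URL stands in for the hash bucket table, and the function and parameter names are hypothetical. Only the URL itself is retained for a normal URL.

```python
def mark_normal(record: dict, normal_urls: set, first_threshold: int,
                detected_clean=()) -> None:
    """Move normal URLs out of the counted record.

    record:         url -> {"count_visit": int, "ip_list": list}
    normal_urls:    set that keeps only the URL strings of normal URLs
    detected_clean: suspicious URLs whose pages were found to have no backdoor
    """
    for url in list(record):                      # copy keys: we delete while iterating
        heavily_visited = record[url]["count_visit"] > first_threshold   # step 35
        cleared = url in detected_clean                                   # step 36
        if heavily_visited or cleared:
            del record[url]                       # step 37: drop counters and IP list
            normal_urls.add(url)
```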
- After step 37, the embodiment of the present application further includes:
- Step 38 The security device acquires the second web traffic of the protected host.
- the second web traffic refers to traffic that occurs when a webpage provided by the protected host is accessed in a second time period after the first time period.
- Step 39 The security device obtains an access request message from the second web traffic, and parses the access request message, so as to obtain a source address and a carried URL of the access request message.
- Step 310 The security device determines whether the URL carried in the access request packet obtained in step 39 is the same as the normal URL. If it is the same, the processing of the access request message ends; if there are still unprocessed access request messages in the second web traffic, the security device continues with another unprocessed access request message. If it is different, step 311 is performed.
- Step 311 The security device determines whether the URL carried in the access request packet is saved in the webpage access record. If the URL carried in the access request packet is saved, step 312 is performed. If the URL carried in the access request packet is not saved, step 313 is performed.
- Step 312 The security device adds 1 to the saved total number of times the URL carried in the access request packet is accessed, and adds the source IP address of the access request packet to the IP addresses accessing that URL. If there are still unprocessed access request messages in the second web traffic, it continues with another unprocessed access request message.
- Step 313 The security device saves the URL carried in the access request packet in the webpage access record, sets the total number of times that URL is accessed to 1, and sets the IP address accessing that URL to the source IP address of the access request packet. If there are still unprocessed access request messages in the second web traffic, it continues with another unprocessed access request message.
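Steps 39 to 313 can be sketched as a single loop. As a simplifying assumption, a flat dictionary keyed by URL replaces the hash bucket table, and each request is already parsed into a `(source IP, URL)` pair; the function name is hypothetical.

```python
def process_second_traffic(requests, record: dict, normal_urls: set) -> None:
    """Update the record from the second web traffic, skipping normal URLs.

    requests: iterable of (src_ip, url) pairs parsed in step 39
    record:   url -> {"count_visit": int, "ip_list": list}
    """
    for src_ip, url in requests:
        if url in normal_urls:            # step 310: nothing to update
            continue
        entry = record.get(url)           # step 311: is the URL recorded?
        if entry is not None:             # step 312: update the saved counters
            entry["count_visit"] += 1
            entry["ip_list"].append(src_ip)
        else:                             # step 313: start recording the URL
            record[url] = {"count_visit": 1, "ip_list": [src_ip]}
```

Because normal URLs are skipped before any lookup, traffic to popular pages costs one set-membership test and no record update.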
- The method shown in FIG. 8 is illustrated below using three different access request messages in the second web traffic, HTTP request 1, HTTP request 2, and HTTP request 3, as an example.
- For brevity, "IP + identifier" is used in place of a specific 32-bit binary address, and "URL + identifier" is used in place of a specific URL string.
- the web page access record constructed by using the data structure shown in FIG. 7 is as shown in FIG. 9.
- URL 3 is a normal URL, so the total number of accesses and the IP address list corresponding to URL 3 are not saved.
- The security device cannot yet recognize whether URL 1 is a suspicious URL or a normal URL, and therefore saves the total number of accesses and the IP address list corresponding to URL 1.
- The security device parses HTTP request 1, HTTP request 2, and HTTP request 3.
- the destination addresses of the three access requests are all IP 0, which is the IP address of the protected host.
- The URL carried by HTTP request 1 is URL 1, and the source IP address is IP 1.
- The URL carried by HTTP request 2 is URL 2, and the source IP address is IP 2.
- the URL carried by HTTP request 3 is URL 3, and the source IP address is IP 3.
- For HTTP request 1, the security device looks up the hash bucket table corresponding to IP 0 in the hash table shown in FIG. 4 and compares the URL stored in each entry with URL 1.
- URL 1 is different from the normal URL, URL 3, and URL 1 is already recorded in the webpage access record, so the recorded total number of accesses of URL 1 is increased by 1,
- the source address IP 1 of HTTP request 1 is added to the IP addresses accessing URL 1, and the IP address count value is increased by 1.
- The URL 2 carried by HTTP request 2 is different from the normal URL, URL 3, and URL 2 is not recorded in the webpage access record, so an entry corresponding to URL 2 is newly created in the access record.
- URL 2 is recorded in the entry, the total number of accesses of URL 2 is set to 1, the IP address count value is set to 1, and the source address IP 2 of HTTP request 2 is recorded in the IP address list of the newly created entry.
- the URL 3 carried by the HTTP request 3 is the same as the normal URL, and the processing of the HTTP request 3 ends.
- the web page access record processed by the above three access requests is as shown in FIG.
- For a normal URL, the security device only needs to save the URL itself in the webpage access record.
- For a URL to be confirmed, the IP addresses accessing the URL and the total number of times the URL is accessed are saved.
- Whether the URL to be confirmed is a normal URL or a suspicious URL is subsequently determined according to the recorded IP addresses and the total number of accesses of the URL to be confirmed.
- the embodiment of the present application further provides an apparatus for detecting a back door of a webpage.
- the apparatus includes an obtaining unit 111, a record generating unit 112, and a determining unit 113, as follows.
- the obtaining unit 111 is configured to acquire first web traffic of the protected host, where the first web traffic refers to traffic that occurs when the webpage provided by the protected host is accessed in the first time period.
- The record generating unit 112 is configured to generate a webpage access record of the protected host according to the first web traffic obtained by the obtaining unit 111, where the webpage access record is used to save at least one uniform resource locator (URL), the IP addresses accessing each of the at least one URL, and the total number of times each URL is accessed, and where each of the URLs identifies a webpage provided by the protected host.
- The determining unit 113 is configured to determine, according to the webpage access record generated by the record generating unit 112, a suspicious URL from the at least one URL, where the total number of times the suspicious URL is accessed is less than a first threshold, and
- the ratio of the number of mutually different IP addresses accessing the suspicious URL to the total number of times the suspicious URL is accessed is less than a second threshold; and to determine whether the webpage identified by the suspicious URL contains a backdoor feature from the webpage backdoor feature library, and detect, according to the determination result, whether the webpage identified by the suspicious URL has a webpage backdoor.
- The webpage access record in the embodiment of the present application includes at least one entry, each of the at least one entry corresponds to one of the at least one URL, and each entry saves the total number of accesses and an IP address list.
- the structure of this entry is shown in Figure 4.
- The record generating unit 112 is configured to obtain at least one access request message from the first web traffic, where the destination IP address of each access request message is the IP address of the protected host; and to select an access request message from the at least one access request message and process the selected access request message as follows, until every access request message in the at least one access request message has been processed:
- parse the selected access request packet to obtain its source IP address and the URL it carries, and search the webpage access record for the entry corresponding to that URL; if the entry is found, increase the total number of accesses of the found entry by 1 and record the source IP address in the IP address list of the found entry; if the entry is not found, create in the webpage access record an entry corresponding to the URL carried by the selected access request packet, set the total number of accesses of the created entry to 1, and record the source IP address in the IP address list of the created entry.
- The determining unit 113 is specifically configured to: select an entry from the webpage access record; determine the number of mutually different IP addresses in the IP address list of the selected entry; and, if the total number of accesses of the selected entry is less than the first threshold and the ratio of the determined number of mutually different IP addresses to the total number of accesses of the selected entry is less than the second threshold, determine that the URL corresponding to the selected entry is a suspicious URL.
- The webpage access record includes at least one entry, each of the at least one entry corresponds to one of the at least one URL, and each entry saves a total number of accesses, an IP address count value, and an IP address list.
- the structure of the entry is shown in Figure 7.
- The record generating unit 112 is configured to obtain at least one access request message from the first web traffic, where the destination IP address of each access request message is the IP address of the protected host;
- and to select an access request message from the at least one access request message and process the selected access request message as follows, until every access request message in the at least one access request message has been processed:
- parse the selected access request packet to obtain its source IP address and the URL it carries, and search the webpage access record for the entry corresponding to that URL; if the entry is found, increase the total number of accesses of the found entry by 1; if the source IP address is already saved in the IP address list of the found entry, end the processing of the selected access request packet; if the source IP address is not saved in the IP address list of the found entry, increment the IP address count value of the found entry by 1 and record the source IP address in the IP address list of the found entry;
- if the entry corresponding to the URL carried in the selected access request packet is not found, create that entry in the webpage access record, set the total number of accesses of the created entry to 1, set the IP address count value of the created entry to 1, and record the source IP address in the IP address list of the created entry.
- The determining unit 113 is configured to: select an entry from the webpage access record; and, if the total number of accesses of the selected entry is less than the first threshold and the ratio of the IP address count value of the selected entry to the total number of accesses of the selected entry is less than the second threshold, determine that the URL corresponding to the selected entry is a suspicious URL.
- The record generating unit 112 selects at least one access response message from the first web traffic, where the status code carried in each of the at least one access response message indicates that the access is successful and
- the source address of each access response message is the IP address of the protected host; and obtains, from the first web traffic, the access request message corresponding to each access response message, as the obtained at least one access request message.
- the record generating unit 112 searching the webpage access record for the entry corresponding to the URL carried in the selected access request message includes: performing at least one kind of normalization on the URL carried in the selected access request message to obtain a normalized URL, where the normalization includes one or more of the following (1) to (3):
- (1) converting the URL carried in the selected access request message into a predetermined encoding format;
- (2) converting the characters in the URL carried in the selected access request message into a predetermined letter case;
- (3) removing the parameters from the URL carried in the selected access request message;
- and searching the webpage access record for the entry corresponding to the normalized URL.
- the record generating unit 112 creating, in the webpage access record, the entry corresponding to the URL carried in the access request message is specifically: creating, in the webpage access record, the entry corresponding to the normalized URL.
- the determining unit 113 is further configured to determine, according to the webpage access record, a normal URL from the at least one URL, where the normal URL is a URL among the at least one URL whose total number of accesses is greater than the first threshold, or a suspicious URL for which the webpage backdoor detection result indicates that the identified webpage contains no webpage backdoor; and to delete, from the webpage access record, the saved IP addresses that accessed the normal URL and the total number of times the normal URL is accessed.
- the obtaining unit 111 is further configured to acquire second web traffic of the protected host, where the second web traffic is the traffic that occurs when webpages provided by the protected host are accessed in a second time period after the first time period.
- the record generating unit 112 is further configured to obtain a first access request message, a second access request message, and a third access request message from the second web traffic;
- parse the first access request message to obtain its source IP address and the URL it carries; if the URL carried in the first access request message is different from the normal URL and is already saved in the webpage access record, increment by 1 the total number of times the saved URL is accessed, and add the source IP address of the first access request message to the IP addresses that accessed that URL.
- parse the second access request message to obtain its source IP address and the URL it carries; if the URL carried in the second access request message is different from the normal URL and is not saved in the webpage access record, save that URL in the webpage access record, set the total number of times that URL is accessed to 1, and set the IP address that accessed that URL to the source IP address of the second access request message.
- parse the third access request message to obtain the URL it carries; if the URL carried in the third access request message is the same as the normal URL, end the processing of the third access request message.
- the apparatus for detecting a webpage backdoor provided in this apparatus embodiment may be integrated into the security device and applied to the scenario shown in FIG. 1 of the method embodiment to implement the functions of the security device.
- for other additional functions that the apparatus for detecting a webpage backdoor can implement, and for its interaction with other network element devices, refer to the description of the security device in the method embodiment; details are not repeated here.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Information Transfer Between Computers (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
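A method and apparatus for detecting a webpage backdoor, intended to mitigate the low detection efficiency of the prior art. The method includes: acquiring first web traffic of a protected host; generating a webpage access record of the protected host according to the first web traffic, where the webpage access record stores at least one uniform resource locator (URL), the IP addresses that accessed each of the at least one URL, and the total access count of each URL, and each URL identifies one webpage provided by the protected host; determining a suspicious URL from the at least one URL according to the webpage access record, where the total access count of the suspicious URL is less than a first threshold, and the ratio of the number of mutually different IP addresses that accessed the suspicious URL to the total access count of the suspicious URL is less than a second threshold; and determining whether the webpage identified by the suspicious URL contains a backdoor feature, and detecting, according to the result of that determination, whether the webpage identified by the suspicious URL contains a webpage backdoor.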
Description
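This application claims priority to Chinese Patent Application No. 201611167905.3, filed with the Chinese Patent Office on December 16, 2016 and entitled "Method and Apparatus for Detecting a Webpage Backdoor", which is incorporated herein by reference in its entirety.
The present invention relates to the field of network security technologies, and in particular to a method for detecting a webpage backdoor and an apparatus for detecting a webpage backdoor.
A webpage backdoor (webshell) is a backdoor tool that exists in the form of a webpage file. Through a webshell, an operator can obtain operating permissions on a website, for example uploading and downloading files, viewing databases, and executing script commands. A webshell file may be a webpage file written with Active Server Pages (ASP), a webpage file written in the Hypertext Preprocessor (PHP) language, or a Common Gateway Interface (CGI) program file.
A host in a network that provides a web service and opens the ports related to the web service is also called a website server or web server. Website servers are frequent targets of webshell attacks. After successfully intruding into a website server by exploiting vulnerabilities such as open ports, an attacker stores a webshell file in the webpage directory of the website server, mixed in with normal webpage files. The attacker can then access the webshell file stored on the website server through a browser to obtain operating permissions on the website server, and thereby achieve illegal purposes such as controlling the website server or stealing information. Data between the attacker and the attacked website server is usually transmitted through port 80, the default port of the web service, and a firewall usually does not block Hypertext Transfer Protocol (HTTP) traffic to port 80 so as not to affect the normal webpage access of network users. Simple packet filtering therefore cannot stop such attacks.
To detect webpage backdoors, the prior art obtains webshell features by manually analyzing the code of webshell files, or by analyzing the traffic generated when an attacker accesses a webshell file, and builds a webshell feature library from them. After obtaining web traffic, a security device matches the web traffic against the features in the webshell feature library in order to detect webshells. However, because the volume of web traffic in existing networks is huge, this consumes a large amount of the security device's processing resources, and detection efficiency is low.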
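Summary of the Invention
Embodiments of this application provide a method for detecting a webpage backdoor, intended to mitigate the low detection efficiency of the prior art.
The technical solutions provided by the embodiments of this application are as follows:
According to a first aspect, a method for detecting a webpage backdoor is provided, including: acquiring first web traffic of a protected host, where the first web traffic is the traffic that occurs when webpages provided by the protected host are accessed in a first time period; generating a webpage access record of the protected host according to the first web traffic, where the webpage access record stores at least one uniform resource locator (URL), the IP addresses that accessed each of the at least one URL, and the total access count of each URL, and each URL identifies one webpage provided by the protected host; determining a suspicious URL from the at least one URL according to the webpage access record, where the total access count of the suspicious URL is less than a first threshold, and the ratio of the number of mutually different IP addresses that accessed the suspicious URL to the total access count of the suspicious URL is less than a second threshold; and determining whether the webpage identified by the suspicious URL contains a backdoor feature in a webpage backdoor feature library, and detecting, according to the result of that determination, whether the webpage identified by the suspicious URL contains a webpage backdoor.
Based on web traffic of the protected host that has already occurred, the embodiments of this application build a webpage access record that reflects, for each webpage on the protected host, how many times it has been accessed and how the visitor IP addresses are distributed. Based on this record, URLs with a high degree of suspicion are identified among the URLs of the webpages provided by the protected host, and subsequent detection concentrates on the webpages identified by the suspicious URLs, so that webpage backdoor detection does not have to be performed on every webpage. The method reduces the number of webpages that need webpage backdoor detection and thereby improves web detection performance.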
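Optionally, this application further provides a first specific structure of the webpage access record, together with detailed steps for building it. With a webpage access record of this structure, suspicious URLs can be determined quickly. That is,
in a first possible implementation of the first aspect, the webpage access record includes at least one entry, each of the at least one entry corresponds to one of the at least one URL, and each entry stores a total access count and an IP address list;
the generating a webpage access record of the protected host according to the first web traffic includes:
obtaining at least one access request message from the first web traffic, where the destination IP address of the access request message is the IP address of the protected host;
selecting an access request message from the at least one access request message and processing the selected access request message as follows, until every access request message in the at least one access request message has been processed:
parsing the selected access request message to obtain its source IP address and the URL it carries;
searching the webpage access record for the entry corresponding to the URL carried in the selected access request message;
if the entry corresponding to the URL carried in the selected access request message is found, incrementing the total access count of the found entry by 1 and recording the source IP address in the IP address list of the found entry;
if the entry corresponding to the URL carried in the selected access request message is not found, creating in the webpage access record an entry corresponding to that URL, setting the total access count of the created entry to 1, and recording the source IP address in the IP address list of the created entry.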
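With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the determining a suspicious URL from the at least one URL according to the webpage access record includes:
selecting an entry from the webpage access record;
determining the number of mutually different IP addresses in the IP address list of the selected entry;
if the total access count of the selected entry is less than the first threshold, and the ratio of the determined number of mutually different IP addresses to the total access count of the selected entry is less than the second threshold, determining that the URL corresponding to the selected entry is a suspicious URL.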
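Optionally, this application further provides a second specific structure of the webpage access record, together with detailed steps for building it. The second structure adds an IP address count value to the entry of the first structure. With a webpage access record of this structure, suspicious URLs can be determined quickly. That is,
in a third possible implementation of the first aspect, the webpage access record includes at least one entry, each of the at least one entry corresponds to one of the at least one URL, and each entry stores a total access count, an IP address count value, and an IP address list;
the generating a webpage access record of the protected host according to the first web traffic includes:
obtaining at least one access request message from the first web traffic, where the destination IP address of the access request message is the IP address of the protected host;
selecting an access request message from the at least one access request message and processing the selected access request message as follows, until every access request message in the at least one access request message has been processed:
obtaining the source IP address of the selected access request message and the URL it carries;
searching the webpage access record for the entry corresponding to the URL carried in the selected access request message;
if the entry corresponding to the URL carried in the selected access request message is found, incrementing the total access count of the found entry by 1, and determining whether the source IP address is already saved in the IP address list of the found entry; if it is already saved, ending the processing of the selected access request message; if it is not saved, incrementing the IP address count value of the found entry by 1 and recording the source IP address in the IP address list of the found entry;
if the entry corresponding to the URL carried in the selected access request message is not found, creating in the webpage access record an entry corresponding to that URL, setting the total access count of the created entry to 1, setting the IP address count value of the created entry to 1, and recording the source IP address in the IP address list of the created entry.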
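With reference to the third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the determining a suspicious URL from the at least one URL according to the webpage access record includes:
selecting an entry from the webpage access record;
if the total access count of the selected entry is less than the first threshold, and the ratio of the IP address count value of the selected entry to the total access count of the selected entry is less than the second threshold, determining that the URL corresponding to the selected entry is a suspicious URL.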
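When a terminal accesses a webpage through a browser, the access may not succeed. Recording entries for pages whose access failed occupies storage space, and later detecting those pages also wastes processing resources. To save storage and processing resources, one possible implementation records only the entries of pages that were accessed successfully, as follows.
With reference to the first or third possible implementation of the first aspect, in a fifth possible implementation of the first aspect, the obtaining at least one access request message from the first web traffic includes:
selecting at least one access response message from the first web traffic, where the status code carried in each of the at least one access response message indicates that the access succeeded, and the source address of each access response message is the IP address of the protected host;
obtaining, from the first web traffic, the access request message corresponding to each access response message, as the obtained at least one access request message.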
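When terminals access a webpage provided by the protected host through their installed browsers, differences between browser vendors and browser versions may cause different browsers accessing the same webpage on the website server to generate access request messages that carry different URLs. If the security device generated entries for these different URLs, this would, on the one hand, contradict the fact that the access request messages actually access the same webpage and would bias the later identification of suspicious URLs; on the other hand, it would make the webpage access record too large. To improve the accuracy of suspicious URL identification and save the storage space occupied by the webpage access record, the security device may, when generating entries in the webpage access record, first normalize the URL in the access request message and generate the entry from the normalized URL. Specifically,
with reference to the first or third possible implementation of the first aspect, in a sixth possible implementation of the first aspect, the searching the webpage access record for the entry corresponding to the URL carried in the selected access request message includes:
performing at least one kind of normalization on the URL carried in the selected access request message to obtain a normalized URL, where the normalization includes one or more of the following (1) to (3): (1) converting the URL carried in the selected access request message into a predetermined encoding format, (2) converting the characters in the URL carried in the selected access request message into a predetermined letter case, and (3) removing the parameters from the URL carried in the selected access request message;
searching the webpage access record for the entry corresponding to the normalized URL;
correspondingly, the creating in the webpage access record an entry corresponding to the URL carried in the access request message is specifically:
creating in the webpage access record an entry corresponding to the normalized URL.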
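To further reduce the storage resources occupied by the webpage access record, the information recorded in it can be trimmed further by deleting information that contributes little to identifying suspicious URLs. For example, once a normal URL has been identified, the total access count in its entry and the IP addresses that accessed it can be deleted and no longer maintained, which saves storage resources and the processing resources spent on later entry updates. That is, in a seventh possible implementation of the first aspect, the method further includes:
determining a normal URL from the at least one URL according to the webpage access record, where the normal URL is a URL among the at least one URL whose total access count is greater than the first threshold, or a suspicious URL for which the webpage backdoor detection result indicates that the identified webpage contains no webpage backdoor;
deleting, from the webpage access record, the saved IP addresses that accessed the normal URL and the total access count of the normal URL.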
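With reference to the seventh possible implementation of the first aspect, in an eighth possible implementation of the first aspect, the method further includes:
acquiring second web traffic of the protected host, where the second web traffic is the traffic that occurs when webpages provided by the protected host are accessed in a second time period after the first time period;
obtaining a first access request message, a second access request message, and a third access request message from the second web traffic;
parsing the first access request message to obtain its source IP address and the URL it carries; if the URL carried in the first access request message is different from the normal URL and is already saved in the webpage access record, incrementing by 1 the total access count of the saved URL, and adding the source IP address of the first access request message to the IP addresses that accessed that URL;
parsing the second access request message to obtain its source IP address and the URL it carries; if the URL carried in the second access request message is different from the normal URL and is not saved in the webpage access record, saving that URL in the webpage access record, setting its total access count to 1, and setting the IP address that accessed it to the source IP address of the second access request message;
parsing the third access request message to obtain the URL it carries; if the URL carried in the third access request message is the same as the normal URL, ending the processing of the third access request message.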
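According to a second aspect, an apparatus for detecting a webpage backdoor is provided. The apparatus has the function of implementing the method of the first aspect or any possible implementation of that aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the function.
According to a third aspect, an embodiment of this application provides a computer storage medium for storing computer software instructions used by the above apparatus, including a program designed to execute the first aspect or any possible implementation of that aspect.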
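To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application scenario of the method for detecting a webpage backdoor according to an embodiment of this application;
FIG. 2 is a schematic structural diagram of a security device according to an embodiment of this application;
FIG. 3 is a flowchart of the method for detecting a webpage backdoor according to an embodiment of this application;
FIG. 4 is a schematic structural diagram of a hash table according to an embodiment of this application;
FIG. 5 is a flowchart of a method for building a webpage access record from the first web traffic according to an embodiment of this application;
FIG. 6 is an example of an entry according to an embodiment of this application;
FIG. 7 is a schematic structural diagram of another hash table according to an embodiment of this application;
FIG. 8 is another flowchart of the method for detecting a webpage backdoor according to an embodiment of this application;
FIG. 9 is a schematic diagram of the webpage access record before the security device processes three access request messages according to an embodiment of this application;
FIG. 10 is a schematic diagram of the webpage access record after the security device processes three access request messages according to an embodiment of this application;
FIG. 11 is a schematic structural diagram of an apparatus for detecting a webpage backdoor according to an embodiment of this application.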
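The implementation principles and specific implementations of the technical solutions of the present invention, and the beneficial effects they can achieve, are described in detail below with reference to the accompanying drawings.
The series of packets exchanged between a browser and a website server when a terminal uses the browser to access webpages is called web traffic. With the explosive growth of information on networks, website servers often store millions of webpage files, and end users access webpages frequently, so web traffic grows rapidly. Existing security devices, such as firewalls and deep packet inspection (DPI) devices, are limited by their performance and can hardly inspect, one by one, all the webpage data carried by the web traffic they receive. This is one of the difficulties of existing web security technology.
One of the main reasons for the low performance of existing web detection is the huge number of webpages to be inspected. To address this, the embodiments of this application provide a method for detecting a webpage backdoor. Based on web traffic of the protected host that has already occurred, the method builds a webpage access record that reflects, for each webpage on the protected host, how many times it has been accessed and how the visitor IP addresses are distributed. Based on this record, URLs with a high degree of suspicion are identified among the uniform resource locators (URLs) of all webpages provided by the protected host, and subsequent detection concentrates on the webpages identified by the suspicious URLs, so that webpage backdoor detection does not have to be performed on every webpage. The method reduces the number of webpages to be inspected and thereby improves web detection performance.
The main implementation principles and specific implementations of the technical solutions of the embodiments of this application, and the beneficial effects they can achieve, are described in detail below with reference to the accompanying drawings.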
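FIG. 1 is a schematic diagram of an application scenario of an embodiment of this application. The network system includes a website server 11, a security device 12, and multiple terminals 13. The website server 11 is one example of a protected host. In the embodiments of the present invention, a protected host is a host that can provide a web service. After Apache or Microsoft Internet Information Services (IIS) application software is installed on a host, the host can act as a website server and provide a web service to other users in the network.
In the embodiments of this application, a terminal 13 is a terminal device capable of accessing webpages, for example a personal computer, smartphone, or portable computer with a browser installed. A browser is an application for retrieving and presenting Internet information resources. Commonly used browsers include Internet Explorer, Mozilla Firefox, and Google Chrome. A terminal 13 may be located in a local area network and access the website server 11 on the Internet through a network address translation (NAT) device. A terminal 13 may also access the website server 11 on the Internet directly with a public IP address.
The security device 12 acquires the web traffic generated when the terminals 13 access the website server 11. As shown in FIG. 1, the security device 12 is placed on the communication path between the terminals 13 and the website server 11, and all traffic accessing the website server 11 is forwarded to the website server through the security device 12. For example, the security device 12 is a firewall placed in front of the website server 11, and the website server 11 connects to the network through the firewall. In this deployment, the security device 12 saves the web traffic that passes through it to access the website server 11. The security device 12 may also be deployed in bypass mode, which is not shown in FIG. 1. For example, the website server 11 connects to the network through a gateway device 14, and the security device 12 is a DPI device connected to the gateway device 14. The gateway device 14 mirrors the traffic with which the terminals 13 access the website server 11 and sends the mirrored traffic to the DPI device. The embodiments of this application do not limit the specific deployment of the security device 12, provided that the security device 12 can obtain the web traffic with which the terminals 13 access the website server 11.
Because real network environments are often complex, the security device 12 may take part in forwarding traffic for other network devices. In this case, the IP addresses of one or more protected hosts may be stored in the security device 12 in advance. Using the pre-stored IP addresses of the protected hosts together with the protocol types related to web access, for example HTTP, the security device 12 filters out, from all the traffic it obtains, the traffic that occurs when webpages provided by the protected hosts are accessed.
The method provided by the embodiments of this application can be used to inspect the webpages provided by multiple protected hosts. For brevity, the embodiments are described mainly with a single website server as the protected host; similar processing can be performed for multiple protected hosts.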
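FIG. 2 is a schematic structural diagram of a security device according to an embodiment of this application. The security device may be the security device 12 in FIG. 1. The security device includes a processor 210, a memory 220, a network interface 230, an input device 240, a display 250, and a bus 260. The processor 210, the memory 220, the network interface 230, the input device 240, and the display 250 are connected to one another through the bus 260.
The processor 210 may be one or more central processing units (CPUs). When the processor 210 is a single CPU, the CPU may be a single-core CPU or a multi-core CPU.
The memory 220 includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), or portable read-only memory (CD-ROM).
The network interface 230 may be a wired interface, for example a Fiber Distributed Data Interface (FDDI) or a Gigabit Ethernet (GE) interface; the network interface 230 may also be a wireless interface.
The processor 210 is configured to read the program code 222 stored in the memory 220 and, when running it, perform the following operations.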
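Specifically, the processor 210 acquires the first web traffic of the protected host through the network interface 230, where the first web traffic of the protected host is the traffic that occurs when webpages provided by the protected host are accessed in a first time period. To distinguish web traffic acquired at different stages, the embodiments of this application call the web traffic on which generation of the webpage access record is based the first web traffic, and call the web traffic received after the webpage access record has been generated the second web traffic. The second web traffic can be used to update the webpage access record.
The processor 210 generates the webpage access record 221 of the protected host from the first web traffic, where the webpage access record stores at least one URL, the IP addresses that accessed each of the at least one URL, and the total access count of each URL, and each URL identifies one webpage provided by the protected host. The processor 210 stores the generated webpage access record 221 in the memory 220.
The processor 210 determines a suspicious URL from the at least one URL according to the webpage access record, where the total access count of the suspicious URL is less than a first threshold, and the ratio of the number of mutually different IP addresses that accessed the suspicious URL to the total access count of the suspicious URL is less than a second threshold. The processor 210 detects, according to the webpage backdoor feature library in the memory 220, whether the webpage identified by the suspicious URL contains a webpage backdoor.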
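Only the attacker knows where a webshell file is stored in the webpage directory of the website server; normal users do not. As a result, usually only the attacker accesses the webshell file, and normal users usually do not. By contrast, the normal webpage files that the website server provides to the public are accessed frequently by a large number of normal users. The access distribution of webshell files therefore differs greatly from that of normal webpage files. Normal webpage files are accessed frequently and by widely distributed visitor IP addresses, whereas webshell files are accessed rarely and by few distinct visitor IP addresses. Of course, an attacker can evade monitoring to some extent by setting up proxy servers, forging IP addresses, and similar means. For this reason, this application identifies suspicious URLs from the difference in access behavior and then further inspects the webpages identified by the suspicious URLs.
In the embodiments of this application, the security device builds a webpage access record that reflects, for each webpage on the protected host, how many times it has been accessed and how the visitor IP addresses are distributed, identifies the URLs with a high degree of suspicion among the URLs of all webpages provided by the protected host, and then concentrates detection on the webpages identified by the suspicious URLs instead of inspecting every webpage. Reducing the number of webpages to be inspected improves web detection performance.
The method for detecting a webpage backdoor provided by this application is described in detail below with reference to the flowcharts.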
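FIG. 3 is a flowchart showing the principle of the method for detecting a webpage backdoor according to an embodiment of this application. The method may be performed by the security device 12 in FIG. 1.
Step 31: acquire first web traffic of the protected host, where the first web traffic is the traffic that occurs when webpages provided by the protected host are accessed in a first time period.
The IP address of the protected host is stored in the security device in advance. In an inline deployment, after the security device is connected to the network, it compares the source address or destination address of each packet passing through it with the IP address of the protected host. If the source address or destination address of a packet is the same as the IP address of the protected host and the protocol type is HTTP, the packet is saved, which yields the first web traffic of the protected host. In a bypass deployment, the security device compares the source address or destination address of each packet in the mirrored traffic sent by the gateway device with the IP address of the protected host. If the source address or destination address of a packet is the same as the IP address of the protected host and the protocol type is HTTP, the packet is saved; if the source address and destination address of a packet both differ from the IP address of the protected host, or the protocol type is unrelated to web access, the packet is deleted to save storage space.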
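The filtering in step 31 can be sketched as follows. This is a minimal illustration only: the `Packet` type, its field names, and `select_web_traffic` are hypothetical names introduced here, and the protocol is assumed to have been identified already by an earlier parsing stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical, already-parsed packet summary (not a real capture API)."""
    src_ip: str
    dst_ip: str
    protocol: str  # for example "HTTP" or "DNS"


def select_web_traffic(packets, protected_ips, web_protocols=frozenset({"HTTP"})):
    """Keep packets that involve a protected host and use a web-related protocol.

    Packets that match neither condition are dropped, which corresponds to
    deleting them to save storage space.
    """
    return [
        p for p in packets
        if (p.src_ip in protected_ips or p.dst_ip in protected_ips)
        and p.protocol in web_protocols
    ]
```

Passing a set of addresses as `protected_ips` also covers the case of several protected hosts.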
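Step 32: generate the webpage access record of the protected host according to the first web traffic. The webpage access record stores the following information: at least one URL, the IP addresses that accessed each of the at least one URL, and the total access count of each URL. Each URL identifies one webpage provided by the protected host.
Specifically, the webpage access record contains multiple entries, and each entry corresponds to one of the at least one URL. Each entry stores not only the corresponding URL, but also the total number of times that URL has been accessed and the IP addresses that accessed it.
The security device may organize the entries of the webpage access record with a variety of data structures, for example a multidimensional array or a hash table.
To make the stored information easy to look up and update, the embodiments of this application provide a hash table for storing the webpage access record. As shown in FIG. 4, the hash table is implemented with hash buckets. The IP address of each protected host corresponds to one hash bucket table. In this embodiment, the IP address of each protected host is denoted by 41 and the hash bucket table by 42; the hash bucket table 42 corresponding to each address 41 includes 256 hash buckets.
Each hash bucket in the hash bucket table 42 is a virtual subgroup of the entries in the hash table. Each hash bucket corresponds to a linked list of entries, and the linked lists differ in length. In FIG. 4 a linked list is denoted by 43 and an entry by 44. A linked list 43 stores zero, one, or more entries 44. Each entry includes an index key and a value. The index key of an entry is the result of hashing the URL; the value is the URL itself, together with information such as the total access count CountVisit, which records the total number of accesses to the URL, and the IP address list IP List, which records the IP addresses that accessed the URL. The hash algorithm may be, for example, Message-Digest Algorithm 5 (MD5).
Later embodiments describe in detail, with reference to FIG. 5 to FIG. 7, how the hash table shown in FIG. 4 is built.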
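Step 33: determine a suspicious URL from the at least one URL according to the webpage access record, where the total access count of the suspicious URL is less than a first threshold, and the ratio of the number of mutually different IP addresses that accessed the suspicious URL to the total access count of the suspicious URL is less than a second threshold. The first threshold and the second threshold are stored in the security device in advance. They may be set by a network administrator based on experience and the actual network environment and entered into the security device through the input device 240 in FIG. 2, or they may be obtained by machine learning from pre-labeled web traffic samples; this embodiment imposes no limitation in this respect.
Optionally, the security device periodically evaluates the information stored in the entries of the hash table shown in FIG. 4 against the first threshold and the second threshold to identify suspicious URLs. The first threshold is a natural number whose range can be set according to experience, the storage space of the memory, and the evaluation period. The longer the evaluation period and the larger the storage space, the larger the first threshold can be, which gives more accurate identification. The specific value can be set flexibly according to the actual situation. For example, the evaluation period is 10 days and the first threshold is 1000.
The second threshold is a percentage between 0 and 1. Its value can also be set according to experience and the actual network environment. The smaller the second threshold, the lower the false positive rate of the identified suspicious URLs, at the cost of some false negatives. The larger the second threshold, the higher the false positive rate of the identified suspicious URLs and the lower the false negative rate. For example, the second threshold may be 50%.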
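The two-threshold test of step 33 can be sketched as follows. The function name and argument layout are assumptions made for illustration; the default values 1000 and 0.5 are the example values mentioned above, not recommended settings.

```python
def is_suspicious(count_visit, ip_list, first_threshold=1000, second_threshold=0.5):
    """Return True when a URL is accessed rarely AND by few distinct IP addresses.

    count_visit -- total access count of the URL
    ip_list     -- source IP addresses recorded for the URL (may repeat)
    """
    if count_visit <= 0:
        return False  # nothing recorded yet, so there is no ratio to evaluate
    distinct_ips = len(set(ip_list))
    return (count_visit < first_threshold
            and distinct_ips / count_visit < second_threshold)
```

A URL visited 10 times from a single address gives a ratio of 0.1 and is flagged, while a URL visited 10 times from 10 different addresses gives a ratio of 1.0 and is not.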
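Step 34: determine whether the webpage identified by the suspicious URL contains a backdoor feature in the webpage backdoor feature library, and detect, according to the result of that determination, whether the webpage identified by the suspicious URL contains a webpage backdoor.
In a typical webpage access, the browser first establishes a connection with the website server over the Transmission Control Protocol (TCP). It then sends an access request message to the website server over the established connection, for example an HTTP GET request or an HTTP POST request. The access request message carries the URL of the page to be accessed.
After receiving the access request message, the website server looks up the corresponding webpage file in the webpage directory according to the URL carried in the access request message. Depending on the lookup result, the website server sends an access response message, for example an HTTP response message, to the browser. The access response message carries a status code. For example, HTTP 1.1 defines five classes of status codes; a status code consists of three digits, and the first digit defines the class of the response, specifically:
1XX Informational: the request has been received successfully and processing continues;
2XX Success: the request has been successfully received, understood, and accepted;
3XX Redirection: further action must be taken to complete the request;
4XX Client error: the request contains a syntax error or cannot be fulfilled;
5XX Server error: the server failed to fulfill a valid request.
If the status code indicates that the access succeeded, the website server sends the webpage file to the browser in one or more response messages, depending on the size of the webpage file that was found.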
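After obtaining the suspicious URL through steps 31 to 33, the security device can further obtain the packets exchanged between the browser and the website server when the webpage identified by the suspicious URL is accessed. The security device can then use packet-based detection or flow-based detection to detect, according to the webpage backdoor feature library, whether the webpage carried by these packets contains a webpage backdoor.
Specifically, the security device can obtain the packets exchanged between the browser and the website server when the webpage identified by the suspicious URL is accessed in the following ways.
Manner 1
The security device finds, in the saved first web traffic of the protected host, the packets exchanged when a terminal accessed the webpage identified by the suspicious URL. For example, the security device parses an access request message in the first web traffic according to the relevant HTTP standards and obtains the following information carried in the access request message: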
Internet Protocol Version 4,Src:219.133.94.158,Dst:10.1.1.34
Transmission Control Protocol,Src Port:1272(1272),Dst Port:80(80),Seq:1,Ack:1,Len:89
Hypertext Transfer Protocol
GET http://www.google.com.hk/videohp HTTP/1.1
Accept-Language:en-us
UA-CPU:X86
Accept-Encoding:gzip,deflate
User-Agent:Mozilla/4.0
Host:www.google.com.hk
Connection:Keep-Alive
Cache-Control:no-cache
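The security device determines that the URL carried in the access request message is www.google.com.hk/videohp, which follows the GET keyword. The security device compares this URL with the suspicious URL. If the URL carried in the access request message is the same as the suspicious URL, the security device uses information such as the source address, destination address, source port, destination port, protocol type, sequence number, and timestamp of the access request message to obtain, from the first web traffic, all packets of the data flow to which the access request message belongs. The packets obtained are the packets exchanged between the browser and the website server when the webpage identified by the suspicious URL was accessed.
Manner 2
The security device accesses the page identified by the suspicious URL through a browser installed on the security device and saves the series of packets exchanged with the website server in the process, which yields the packets exchanged between the browser and the website server when the webpage identified by the suspicious URL is accessed.
With packet-based detection, the security device matches each packet exchanged between the browser and the website server during access to the webpage identified by the suspicious URL against the features in the webpage backdoor feature library. If the matched features satisfy a preset rule, for example the number of matched features exceeds a predetermined number, the security device confirms that the webpage identified by the suspicious URL contains a webpage backdoor. In practice, a multi-pattern matching state machine can be generated in advance from the features in the webpage backdoor feature library. The content of a single packet is fed into the state machine, and a single scan finds all the features that the packet matches, which improves detection performance.
With flow-based detection, after obtaining the packets exchanged between the browser and the website server during access to the webpage identified by the suspicious URL, the security device reassembles the packets into a flow to obtain the payload of the data flow and matches the payload against the features in the webpage backdoor feature library. Based on the matches and on predetermined webpage backdoor identification rules, it detects whether the webpage identified by the suspicious URL contains a webpage backdoor. The predetermined rules include, for example: if features A, B, and C appear one after another among the matched features, confirm that the webpage identified by the suspicious URL contains a webpage backdoor; or, if more than 3 features are matched, confirm that the webpage identified by the suspicious URL contains a webpage backdoor.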
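The two example identification rules can be sketched as follows. This is an illustration under stated assumptions: the feature library is modeled as a hypothetical dict mapping a feature name to a byte pattern, a plain substring search stands in for the multi-pattern state machine, and "appear one after another" is simplified to the order of each feature's first occurrence in the reassembled payload.

```python
def matched_features(payload, feature_db):
    """Return the names of matched features, ordered by first occurrence."""
    hits = []
    for name, pattern in feature_db.items():
        pos = payload.find(pattern)
        if pos >= 0:
            hits.append((pos, name))
    return [name for _, name in sorted(hits)]


def has_backdoor(payload, feature_db, ordered_rule=("A", "B", "C"), min_hits=3):
    """Apply the two example rules to a reassembled payload."""
    hits = matched_features(payload, feature_db)
    # Rule 1: the features named in ordered_rule occur in that order.
    # `name in remaining` advances the iterator, so this is a subsequence test.
    remaining = iter(hits)
    in_order = all(name in remaining for name in ordered_rule)
    # Rule 2: more than min_hits distinct features were matched.
    return in_order or len(hits) > min_hits
```

A production device would more likely build an Aho-Corasick style automaton once from the feature library, as the text suggests, rather than scanning once per feature.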
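FIG. 5 is a flowchart of a method for building a webpage access record from the first web traffic according to an embodiment of this application.
Step 51: the security device performs protocol parsing on the first web traffic and obtains at least one access request message from the first web traffic. In this embodiment, an access request message is an HTTP GET request sent by a browser to the website server. The destination IP address of the HTTP GET request is the IP address of the protected host. The security device performs steps 52 to 510 on each of the at least one access request message until all access request messages have been processed. Specifically, the security device can pick access request messages one by one according to a preset selection rule, for example in chronological order according to the timestamps carried in the access request messages.
Steps 52 to 510 are described in detail below, taking one access request message as an example.
Step 52: the security device obtains the destination IP address, the source address, and the carried URL of the access request message through protocol parsing.
Step 53: the security device searches the webpage access record for the record corresponding to the destination IP address, that is, it determines whether the webpage access record already contains the destination IP address and the hash bucket table corresponding to it. If the destination address is not recorded in the webpage access record, step 54 is performed; if it is already recorded, step 55 is performed.
Step 54: the security device records the destination IP address and creates the hash bucket table corresponding to the destination IP address. Step 56 is then performed.
Specifically, the security device records the destination IP address in the webpage access record and creates a hash bucket table with 256 hash buckets for that destination IP address. Initially, the linked list of each hash bucket in the hash bucket table is empty.
Step 56: the security device applies a predetermined bucket hashing algorithm to the URL carried in the access request message and determines the hash bucket to which the URL belongs. Step 57 is then performed.
Step 57: the security device creates an entry in the determined hash bucket. The index key of the created entry is the result of hashing the URL carried in the access request message, and the URL is recorded in the created entry. The total access count stored in the created entry is set to 1, and the source address obtained in step 52 is recorded in the IP address list of the entry.
Step 55: the security device applies the predetermined bucket hashing algorithm to the URL carried in the access request message and determines the hash bucket to which the URL belongs. Step 58 is then performed.
Step 58: the security device searches the linked list of the determined hash bucket for the entry corresponding to the URL.
The security device hashes the URL and searches the linked list of the determined hash bucket for an entry indexed by the hash result. If no entry indexed by the hash result exists, step 59 is performed. If such an entry exists, step 510 is performed.
Step 59: the security device creates an entry indexed by the hash result, records the URL in the created entry, records the source address carried in the access request message in the IP address list of the entry, and sets the total access count in the created entry to 1.
Step 510: the security device records the source address carried in the access request message in the IP address list of the entry indexed by the hash result, and increments the total access count stored in that entry by 1.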
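Steps 52 to 510 can be sketched as follows for the first entry structure (total access count plus IP address list). The names `Entry`, `bucket_of`, and `record_request` are introduced only for this illustration, a Python list stands in for each linked list, and the bucket function simply takes the first byte of the MD5 digest, which is an assumption rather than the bucket hashing algorithm of the embodiment.

```python
import hashlib
from dataclasses import dataclass, field

NUM_BUCKETS = 256


@dataclass
class Entry:
    key: str                 # index key: MD5 hex digest of the URL
    url: str
    count_visit: int = 0     # total access count (CountVisit)
    ip_list: list = field(default_factory=list)  # IP List, duplicates kept


def bucket_of(key_hex):
    """Assumed bucket function: first digest byte, a value in 0..255."""
    return int(key_hex[:2], 16)


def record_request(record, dst_ip, src_ip, url):
    """Update `record` (dict: protected-host IP -> list of 256 buckets)."""
    # Steps 53/54: find or create the bucket table of the protected host.
    buckets = record.setdefault(dst_ip, [[] for _ in range(NUM_BUCKETS)])
    # Steps 55/56: hash the URL and pick its bucket.
    key = hashlib.md5(url.encode("utf-8")).hexdigest()
    chain = buckets[bucket_of(key)]
    # Step 58: look for an entry indexed by the hash result.
    for entry in chain:
        if entry.key == key:
            entry.count_visit += 1          # step 510
            entry.ip_list.append(src_ip)
            return entry
    entry = Entry(key, url, 1, [src_ip])    # steps 57/59
    chain.append(entry)
    return entry
```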
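For example, through protocol parsing the security device obtains, from an access request message in the first web traffic, the destination IP address 10.1.1.34, the source address 219.133.94.158, and the URL www.google.com.hk/videohp. The destination address 10.1.1.34 is the same as the IP address of the protected host.
The hash algorithm preset in the security device is MD5 with its result written as 32 hexadecimal characters; that is, the input is a URL of arbitrary length and the output is 32 hexadecimal characters. In this example, the result of hashing www.google.com.hk/videohp is a356bf63af5c8b348032bba8b44eceda.
The purpose of the bucket hashing algorithm is to assign any hash result to one of the 256 hash buckets. In this example, the bucket hashing algorithm divides the hash result into 16 groups of 2 hexadecimal characters each, combines the groups in turn with a bitwise AND operation to obtain two hexadecimal characters, then takes those two hexadecimal characters modulo 256 and uses the remainder as the index of the hash bucket.
For example, a3|56|bf|63|af|5c|8b|34|80|32|bb|a8|b4|4e|ce|da=ab, ab%256=163, so www.google.com.hk/videohp is determined to belong to hash bucket 163.
Hash bucket 163 is searched for an entry whose index key is a356bf63af5c8b348032bba8b44eceda. In this example, it is assumed that hash bucket 163 contains no entry with the index key a356bf63af5c8b348032bba8b44eceda, so the security device creates a new entry with that index key at the end of the linked list of hash bucket 163, or inserts it at a predetermined position in the linked list according to a predetermined rule. The URL www.google.com.hk/videohp is recorded in the entry, the source address 219.133.94.158 carried in the access request message is recorded in the IP address list of the new entry, and the total access count of the created entry is set to 1. The entry created by this processing is shown in FIG. 6.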
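One way to fold a 32-character digest into a bucket index is sketched below. Note the assumption: this sketch combines the 16 two-character groups with XOR, a common choice for folding because it keeps the result spread over 0..255, so it is not the combining operation named in the example above and it does not reproduce the bucket number quoted there. The function name `bucket_index` is hypothetical.

```python
def bucket_index(digest_hex, num_buckets=256):
    """Fold a hex digest into a bucket index by XOR-ing its bytes.

    digest_hex  -- hexadecimal digest string with an even number of characters
    num_buckets -- number of hash buckets in the bucket table
    """
    folded = 0
    for i in range(0, len(digest_hex), 2):
        folded ^= int(digest_hex[i:i + 2], 16)   # one group = two hex characters
    return folded % num_buckets
```

Any function that maps digests deterministically and roughly uniformly onto the 256 buckets would serve the same purpose.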
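Correspondingly, after the webpage access record has been built with the method shown in FIG. 5, step 33 of FIG. 3 determines whether the URL corresponding to each entry is a suspicious URL as follows. It first obtains the IP address list IP List in the entry, determines the mutually different IP addresses in it, and counts them. It then reads the total access count CountVisit. If the value of CountVisit is less than the first threshold, and the ratio of the counted number of mutually different IP addresses to the value of CountVisit is less than the second threshold, the URL corresponding to the entry is determined to be a suspicious URL.
To identify suspicious URLs more efficiently, the data structure of the entry 44 shown in FIG. 4 can be improved by adding an IP address count value CountIP, which records the number of mutually different IP addresses that accessed the URL. In addition, only mutually different IP addresses are recorded in the IP address list IP List, as shown in FIG. 7.
Correspondingly, the method for building the webpage access record shown in FIG. 5 needs to be adapted. Specifically, in step 57 or step 59, if no entry corresponding to the URL carried in the access request message is found, an entry corresponding to that URL is created in the webpage access record, the total access count of the created entry is set to 1, the IP address count value of the created entry is set to 1, and the source IP address of the access request message is recorded in the IP address list of the created entry.
In step 510, if the entry corresponding to the URL carried in the access request message is found, the total access count of the found entry is incremented by 1. It is then determined whether the source IP address of the access request message is already saved in the IP address list of the found entry. If it is already saved, the processing of the access request message ends. If it is not saved, the IP address count value of the found entry is incremented by 1, and the source IP address of the access request message is recorded in the IP address list of the found entry.
With this improvement, when step 33 of FIG. 3 determines whether the URL corresponding to each entry is a suspicious URL, it only needs to read the total access count CountVisit and the IP address count value CountIP. Specifically, if the value of CountVisit is less than the first threshold, and the ratio of the value of CountIP to the value of CountVisit is less than the second threshold, the URL corresponding to the entry is determined to be a suspicious URL.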
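The improved entry and its update rule can be sketched as follows. The class and function names are illustrative assumptions; the fields mirror CountVisit, CountIP, and IP List.

```python
from dataclasses import dataclass, field


@dataclass
class CountedEntry:
    url: str
    count_visit: int = 0   # CountVisit: total access count
    count_ip: int = 0      # CountIP: number of distinct source addresses
    ip_list: list = field(default_factory=list)  # distinct addresses only


def update_entry(entry, src_ip):
    """Adapted step 510: always count the visit, count the address only once."""
    entry.count_visit += 1
    if src_ip not in entry.ip_list:
        entry.count_ip += 1
        entry.ip_list.append(src_ip)


def entry_is_suspicious(entry, first_threshold, second_threshold):
    """Step 33 with CountIP: no need to deduplicate the IP list again."""
    if entry.count_visit <= 0:
        return False
    return (entry.count_visit < first_threshold
            and entry.count_ip / entry.count_visit < second_threshold)
```

The membership test on a list is linear in the number of stored addresses; using a set for `ip_list` would make it constant time at the cost of losing insertion order.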
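Optionally, when a terminal accesses a webpage through a browser, the access may not succeed. For an attacker, if access to the webshell file fails, the attack cannot succeed. Inspecting pages whose access failed has no practical value for the security device, because the packets exchanged between the browser and the website server cannot be obtained in step 34 of FIG. 3. To avoid wasting processing resources on inspecting such pages later, and wasting storage space on entries for the URLs of such pages in the webpage access record, the following improvement can be made to step 51, where the at least one access request message is obtained from the first web traffic, when the webpage access record is built with the methods shown in FIG. 5 to FIG. 7.
The security device first selects at least one access response message from the first web traffic, where the status code carried in each selected access response message indicates that the access succeeded. An access response message is the message that the website server returns to the browser after receiving an access request message. This application considers only access response messages whose source address is the IP address of the protected host.
For example, the parsed content of an access response message for a successful access is as follows: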
HTTP/1.1 200 OK
Date:Wed,10 Jun 2009 11:22:58 GMT
Server:Microsoft-IIS/6.0
X-Powered-By:ASP.NET
Content-Length:4218
Content-Type:text/html
Cache-control:private
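The status code "200 OK" indicates that the access succeeded.
The security device then uses information carried in the packets, such as the source address, source port, destination address, destination port, protocol type, sequence number, and acknowledgment number, to determine which access request message in the first web traffic corresponds to which access response message. In this way it obtains, from the first web traffic, the access request message corresponding to each access response message that indicates a successful access, and uses these as the obtained at least one access request message.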
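The selection of successfully answered requests can be sketched as follows. This is a simplified illustration: the `Request` and `Response` tuples are hypothetical, requests and responses are paired only by the reversed address and port 4-tuple, and sequence and acknowledgment numbers are ignored, so the sketch assumes one request per connection.

```python
from collections import namedtuple

Request = namedtuple("Request", "src_ip src_port dst_ip dst_port url")
Response = namedtuple("Response", "src_ip src_port dst_ip dst_port status")


def successful_requests(requests, responses, protected_ip):
    """Return the requests answered by the protected host with a 2XX status."""
    ok_flows = {
        # Store the flow from the client's point of view (response reversed).
        (r.dst_ip, r.dst_port, r.src_ip, r.src_port)
        for r in responses
        if r.src_ip == protected_ip and 200 <= r.status < 300
    }
    return [
        q for q in requests
        if (q.src_ip, q.src_port, q.dst_ip, q.dst_port) in ok_flows
    ]
```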
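In addition, terminals that access the website server through a browser may have browsers from different vendors, or different versions of a browser, installed. Because of differences in how browsers are programmed, different browsers accessing the same webpage on the website server may generate access request messages that carry different URLs. Specifically, although these access request messages access the same webpage, the URLs they carry use different letter case or different encodings, or carry different parameters. The security device would treat these access request messages as carrying different URLs and create different entries in the webpage access record. On the one hand, this contradicts the fact that the access request messages actually access the same webpage and biases the later identification of suspicious URLs; on the other hand, it makes the webpage access record too large. To improve the accuracy of suspicious URL identification and save the storage space occupied by the webpage access record, optionally, when the webpage access record is built with the methods shown in FIG. 5 to FIG. 7, the security device applies at least one of the following kinds of normalization to the parsed URL before it searches, in step 58, the linked list of the determined hash bucket for the entry corresponding to the URL.
1. Convert the characters in the parsed URL to a predetermined letter case, for example convert all characters to lowercase.
2. Convert the parsed URL to a predetermined encoding format. Encodings that a URL may use include GB2312, GBK, and UTF8. In this example, all URLs are converted to GBK.
3. Remove the parameters from the parsed URL.
For example, the parsed URL 1 is www.google.com.hk/videohp?hl=zh-cn&tab=wv, and after the parameters are removed URL 1 is www.google.com.hk/videohp. The parsed URL 2 is www.google.com.hk/videohp?hl=zh-cn&tab=wv&aq=f, and after the parameters are removed URL 2 is www.google.com.hk/videohp.
After normalization, URL 1 and URL 2 are identical and correspond to the same entry in the webpage access record, which keeps the size of the webpage access record under control and saves storage resources.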
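The three normalization steps can be sketched as follows. This is an illustration under assumptions: the function name is hypothetical, percent-escapes are decoded to Unicode text (interpreting the bytes as UTF-8 by default) as a stand-in for "conversion to a predetermined encoding", and lowercase is chosen as the predetermined letter case.

```python
from urllib.parse import unquote


def normalize_url(url, source_encoding="utf-8"):
    """Return a canonical form of `url` for use as the record key."""
    path = url.split("?", 1)[0]                       # (3) remove the parameters
    text = unquote(path, encoding=source_encoding)    # (1) one canonical text form
    return text.lower()                               # (2) predetermined letter case
```

Applied to the two example URLs above, both yield `www.google.com.hk/videohp`, so they map to the same entry. For a site that serves GBK-encoded paths, `source_encoding="gbk"` would be passed instead.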
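When the website server provides a large or steadily growing number of page files, storing the IP addresses that accessed each of the at least one URL and the total access count of each URL with the data structure shown in FIG. 4 occupies considerable storage resources. Optionally, the security device identifies normal URLs according to the first threshold or the webpage backdoor detection result, deletes from the webpage access record the saved IP addresses that accessed the normal URLs and the total access counts of the normal URLs, and no longer updates them afterwards, which saves storage and processing resources.
Based on these considerations, the method for detecting a webpage backdoor shown in FIG. 3 is improved; the improved flowchart is shown in FIG. 8. Steps 31 to 34 in FIG. 8 are the same as in FIG. 3. After step 32, the method further includes:
Step 35: the security device determines normal URLs, where a normal URL is a URL among the at least one URL whose total access count is greater than the first threshold.
After step 34, the method further includes:
Step 36: the security device determines normal URLs, where a normal URL is a suspicious URL for which the webpage backdoor detection result indicates that the identified webpage contains no webpage backdoor.
After steps 35 and 36, the security device performs step 37 and deletes from the webpage access record the saved IP addresses that accessed the normal URLs and the total access counts of the normal URLs. Note that either one of step 35 and step 36 may be performed, or both.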
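Steps 35 to 37 can be sketched as follows. The layout is an assumption for illustration: the record is a dict from URL to a small dict of statistics, the URLs themselves are kept in a separate set once they are classified as normal, and `cleared_urls` stands for the suspicious URLs whose pages were inspected in step 34 and found to contain no backdoor.

```python
def prune_normal_urls(record, normal_urls, first_threshold, cleared_urls=()):
    """Move normal URLs out of `record`, keeping only the URL itself.

    record       -- dict: URL -> {"count_visit": int, "ip_list": list}
    normal_urls  -- set collecting URLs already known to be normal
    cleared_urls -- suspicious URLs whose pages were found to have no backdoor
    """
    # Step 35: frequently accessed URLs are normal.
    newly_normal = {u for u, e in record.items()
                    if e["count_visit"] > first_threshold}
    # Step 36: inspected URLs without a backdoor are normal as well.
    newly_normal |= {u for u in cleared_urls if u in record}
    # Step 37: drop the IP addresses and counts, remember only the URL.
    for url in newly_normal:
        del record[url]
        normal_urls.add(url)
    return newly_normal
```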
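Because information is currently growing very fast, the number of normal webpages provided by the website server keeps growing as well, and the webpage access record needs to be updated in good time. To accommodate this, the embodiments of this application further include, after step 37:
Step 38: the security device acquires second web traffic of the protected host. The second web traffic is the traffic that occurs when webpages provided by the protected host are accessed in a second time period after the first time period.
Step 39: the security device obtains an access request message from the second web traffic and parses it to obtain its source address and the URL it carries.
Step 310: the security device determines whether the URL carried in the access request message obtained in step 39 is the same as a normal URL. If it is the same, the processing of the access request message ends, and if the second web traffic still contains unprocessed access request messages, the security device goes on to process another unprocessed access request message. If it is different, step 311 is performed.
Step 311: the security device determines whether the URL carried in the access request message is saved in the webpage access record. If it is saved, step 312 is performed. If it is not saved, step 313 is performed.
Step 312: the security device increments by 1 the total access count of the saved URL carried in the access request message, and adds the source IP address of the access request message to the IP addresses that accessed that URL. If the second web traffic still contains unprocessed access request messages, the security device goes on to process another unprocessed access request message.
Step 313: the security device saves the URL carried in the access request message in the webpage access record, sets the total access count of that URL to 1, and sets the IP address that accessed that URL to the source IP address of the access request message. If the second web traffic still contains unprocessed access request messages, the security device goes on to process another unprocessed access request message.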
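The method shown in FIG. 8 is illustrated below with three different access request messages in the second web traffic, HTTP request 1, HTTP request 2, and HTTP request 3. For brevity, "IP + identifier" is used in place of a specific 32-bit binary address, and "URL + identifier" in place of a specific URL string. In this example, before the security device processes the three access request messages, the webpage access record built with the data structure shown in FIG. 7 is as shown in FIG. 9. URL 3 is a normal URL, so the total access count and IP address list of URL 3 are not saved. The security device cannot yet tell whether URL 1 is a suspicious URL or a normal URL, so the total access count and IP address list of URL 1 are saved.
The security device parses HTTP request 1, HTTP request 2, and HTTP request 3 and finds that the destination address of all three access requests is IP 0, the IP address of the protected host. HTTP request 1 carries URL 1 and has the source IP address IP 1. HTTP request 2 carries URL 2 and has the source IP address IP 2. HTTP request 3 carries URL 3 and has the source IP address IP 3.
For HTTP request 1, the hash bucket table corresponding to IP 0 is looked up in the hash table shown in FIG. 4, and the URL saved in each entry is compared in turn with URL 1. In this example, URL 1 differs from URL 3, which is a normal URL, and URL 1 is already recorded in the webpage access record. The total access count of the recorded URL 1 is therefore incremented by 1, the source address IP 1 of HTTP request 1 is added to the IP addresses that accessed URL 1, and the IP address count value is incremented by 1.
In this example, URL 2 carried in HTTP request 2 differs from URL 3, which is a normal URL, and URL 2 is not recorded in the webpage access record. A new entry corresponding to URL 2 is therefore created in the webpage access record, URL 2 is recorded in the new entry, the total access count of URL 2 is set to 1, the IP address count value is set to 1, and the source address IP 2 of HTTP request 2 is recorded in the IP address list of the new entry.
In this example, URL 3 carried in HTTP request 3 is the same as the normal URL, so the processing of HTTP request 3 ends. The webpage access record after the three access requests have been processed is shown in FIG. 10.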
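The handling of the three requests (steps 310 to 313) can be sketched as follows. The record layout is an assumption for illustration: a dict from URL to a small dict holding the total access count, the IP address count value, and the list of distinct addresses, as in the structure of FIG. 7, with normal URLs kept in a separate set.

```python
def process_second_traffic(record, normal_urls, requests):
    """Update `record` with (src_ip, url) pairs from the second web traffic."""
    for src_ip, url in requests:
        if url in normal_urls:            # step 310: normal URL, nothing to do
            continue
        entry = record.get(url)           # step 311: is the URL already saved?
        if entry is None:                 # step 313: create a new entry
            record[url] = {"count_visit": 1, "count_ip": 1, "ip_list": [src_ip]}
            continue
        entry["count_visit"] += 1         # step 312: update the saved entry
        if src_ip not in entry["ip_list"]:
            entry["count_ip"] += 1
            entry["ip_list"].append(src_ip)
    return record
```

With URL 3 in `normal_urls`, URL 1 already in `record`, and the three requests from the example, URL 1 gains one visit and one address, URL 2 gets a fresh entry, and URL 3 leaves the record untouched.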
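With this processing, the security device only needs to save the URL itself for a normal URL in the webpage access record. For the URL of a newly added webpage, or for a pending URL that cannot yet be confirmed as normal or suspicious, the security device saves the IP addresses that accessed the pending URL and the total access count of the pending URL, so that it can later decide from this recorded information whether the pending URL is a normal URL or a suspicious URL. On the one hand, this ensures that the webpage access record does not grow rapidly as the number of normal webpages grows quickly, which saves storage space; on the other hand, newly appearing webshell files can still be identified, which preserves the identification effect.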
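Correspondingly, an embodiment of this application further provides an apparatus for detecting a webpage backdoor. As shown in FIG. 11, the apparatus includes an obtaining unit 111, a record generating unit 112, and a determining unit 113, as follows.
The obtaining unit 111 is configured to acquire first web traffic of a protected host, where the first web traffic is the traffic that occurs when webpages provided by the protected host are accessed in a first time period.
The record generating unit 112 is configured to generate a webpage access record of the protected host according to the first web traffic acquired by the obtaining unit 111, where the webpage access record stores at least one uniform resource locator (URL), the IP addresses that accessed each of the at least one URL, and the total access count of each URL, and each URL identifies one webpage provided by the protected host.
The determining unit 113 is configured to determine a suspicious URL from the at least one URL according to the webpage access record generated by the record generating unit 112, where the total access count of the suspicious URL is less than a first threshold, and the ratio of the number of mutually different IP addresses that accessed the suspicious URL to the total access count of the suspicious URL is less than a second threshold; and to determine whether the webpage identified by the suspicious URL contains a backdoor feature in a webpage backdoor feature library, and detect, according to the result of that determination, whether the webpage identified by the suspicious URL contains a webpage backdoor.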
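Optionally, in the embodiments of this application, the webpage access record includes at least one entry, each of the at least one entry corresponds to one of the at least one URL, and each entry stores a total access count and an IP address list. The structure of the entry is shown in FIG. 4.
The record generating unit 112 is specifically configured to obtain at least one access request message from the first web traffic, where the destination IP address of the access request message is the IP address of the protected host; and to select an access request message from the at least one access request message and process the selected access request message as follows, until every access request message in the at least one access request message has been processed:
parse the selected access request message to obtain its source IP address and the URL it carries; search the webpage access record for the entry corresponding to that URL; if the entry is found, increment the total access count of the found entry by 1 and record the source IP address in the IP address list of the found entry; if the entry is not found, create in the webpage access record an entry corresponding to that URL, set the total access count of the created entry to 1, and record the source IP address in the IP address list of the created entry.
Correspondingly, the determining unit 113 is specifically configured to select an entry from the webpage access record; determine the number of mutually different IP addresses in the IP address list of the selected entry; and if the total access count of the selected entry is less than the first threshold, and the ratio of the determined number of mutually different IP addresses to the total access count of the selected entry is less than the second threshold, determine that the URL corresponding to the selected entry is a suspicious URL.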
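Optionally, the webpage access record includes at least one entry, each of the at least one entry corresponds to one of the at least one URL, and each entry stores a total access count, an IP address count value, and an IP address list. The structure of the entry is shown in FIG. 7.
The record generating unit 112 is specifically configured to obtain at least one access request message from the first web traffic, where the destination IP address of the access request message is the IP address of the protected host.
An access request message is selected from the at least one access request message, and the selected access request message is processed as follows, until every access request message in the at least one access request message has been processed:
the source IP address of the selected access request message and the URL it carries are obtained; the webpage access record is searched for the entry corresponding to that URL; if the entry is found, the total access count of the found entry is incremented by 1, and it is determined whether the source IP address is already saved in the IP address list of the found entry; if it is already saved, the processing of the selected access request message ends; if it is not saved, the IP address count value of the found entry is incremented by 1 and the source IP address is recorded in the IP address list of the found entry; if the entry is not found, an entry corresponding to that URL is created in the webpage access record, the total access count of the created entry is set to 1, the IP address count value of the created entry is set to 1, and the source IP address is recorded in the IP address list of the created entry.
Correspondingly, the determining unit 113 is specifically configured to select an entry from the webpage access record; and if the total access count of the selected entry is less than the first threshold, and the ratio of the IP address count value of the selected entry to the total access count of the selected entry is less than the second threshold, determine that the URL corresponding to the selected entry is a suspicious URL.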
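Optionally, the record generating unit 112 selects at least one access response message from the first web traffic, where the status code carried in each of the at least one access response message indicates that the access succeeded, and the source address of each access response message is the IP address of the protected host; and obtains, from the first web traffic, the access request message corresponding to each access response message, as the obtained at least one access request message.
Optionally, the record generating unit 112 searching the webpage access record for the entry corresponding to the URL carried in the selected access request message includes: performing at least one kind of normalization on the URL carried in the selected access request message to obtain a normalized URL, where the normalization includes one or more of the following (1) to (3): (1) converting the URL carried in the selected access request message into a predetermined encoding format, (2) converting the characters in the URL carried in the selected access request message into a predetermined letter case, and (3) removing the parameters from the URL carried in the selected access request message; and searching the webpage access record for the entry corresponding to the normalized URL.
The record generating unit 112 creating, in the webpage access record, the entry corresponding to the URL carried in the access request message is specifically: creating, in the webpage access record, the entry corresponding to the normalized URL.
Optionally, the determining unit 113 is further configured to determine a normal URL from the at least one URL according to the webpage access record, where the normal URL is a URL among the at least one URL whose total access count is greater than the first threshold, or a suspicious URL for which the webpage backdoor detection result indicates that the identified webpage contains no webpage backdoor; and to delete, from the webpage access record, the saved IP addresses that accessed the normal URL and the total access count of the normal URL.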
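Optionally, the obtaining unit 111 is further configured to acquire second web traffic of the protected host, where the second web traffic is the traffic that occurs when webpages provided by the protected host are accessed in a second time period after the first time period.
Correspondingly, the record generating unit 112 is further configured to obtain a first access request message, a second access request message, and a third access request message from the second web traffic;
parse the first access request message to obtain its source IP address and the URL it carries; if the URL carried in the first access request message is different from the normal URL and is already saved in the webpage access record, increment by 1 the total access count of the saved URL, and add the source IP address of the first access request message to the IP addresses that accessed that URL.
parse the second access request message to obtain its source IP address and the URL it carries; if the URL carried in the second access request message is different from the normal URL and is not saved in the webpage access record, save that URL in the webpage access record, set its total access count to 1, and set the IP address that accessed it to the source IP address of the second access request message.
parse the third access request message to obtain the URL it carries; if the URL carried in the third access request message is the same as the normal URL, end the processing of the third access request message.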
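The apparatus for detecting a webpage backdoor provided in this apparatus embodiment may be integrated into a security device and applied to the scenario shown in FIG. 1 of the method embodiment to implement the functions of the security device there. For other additional functions that the apparatus can implement, and for its interaction with other network element devices, refer to the description of the security device in the method embodiment; details are not repeated here.
For identical or similar parts of the embodiments in this specification, the embodiments may be referred to one another; each embodiment focuses on its differences from the others. In particular, the apparatus embodiment is described briefly because it is basically similar to the method embodiment; for the related parts, refer to the description of the method embodiment.
Clearly, a person skilled in the art can make various changes and variations to the present invention without departing from its scope. If these modifications and variations fall within the scope of the claims of the present invention and their equivalents, the present invention is intended to include them as well.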
Claims (19)
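- A method for detecting a webpage backdoor, comprising: acquiring first web traffic of a protected host, where the first web traffic is the traffic that occurs when webpages provided by the protected host are accessed in a first time period; generating a webpage access record of the protected host according to the first web traffic, where the webpage access record stores at least one uniform resource locator (URL), the IP addresses that accessed each of the at least one URL, and the total access count of each URL, and each URL identifies one webpage provided by the protected host; determining a suspicious URL from the at least one URL according to the webpage access record, where the total access count of the suspicious URL is less than a first threshold, and the ratio of the number of mutually different IP addresses that accessed the suspicious URL to the total access count of the suspicious URL is less than a second threshold; and determining whether the webpage identified by the suspicious URL contains a backdoor feature in a webpage backdoor feature library, and detecting, according to the result of that determination, whether the webpage identified by the suspicious URL contains a webpage backdoor.
- The method according to claim 1, wherein the webpage access record includes at least one entry, each of the at least one entry corresponds to one of the at least one URL, and each entry stores a total access count and an IP address list; and the generating a webpage access record of the protected host according to the first web traffic comprises: obtaining at least one access request message from the first web traffic, where the destination IP address of the access request message is the IP address of the protected host; and selecting an access request message from the at least one access request message and processing the selected access request message as follows, until every access request message in the at least one access request message has been processed: parsing the selected access request message to obtain its source IP address and the URL it carries; searching the webpage access record for the entry corresponding to that URL; if the entry is found, incrementing the total access count of the found entry by 1 and recording the source IP address in the IP address list of the found entry; if the entry is not found, creating in the webpage access record an entry corresponding to that URL, setting the total access count of the created entry to 1, and recording the source IP address in the IP address list of the created entry.
- The method according to claim 2, wherein the determining a suspicious URL from the at least one URL according to the webpage access record comprises: selecting an entry from the webpage access record; determining the number of mutually different IP addresses in the IP address list of the selected entry; and if the total access count of the selected entry is less than the first threshold, and the ratio of the determined number of mutually different IP addresses to the total access count of the selected entry is less than the second threshold, determining that the URL corresponding to the selected entry is a suspicious URL.
- The method according to claim 1, wherein the webpage access record includes at least one entry, each of the at least one entry corresponds to one of the at least one URL, and each entry stores a total access count, an IP address count value, and an IP address list; and the generating a webpage access record of the protected host according to the first web traffic comprises: obtaining at least one access request message from the first web traffic, where the destination IP address of the access request message is the IP address of the protected host; and selecting an access request message from the at least one access request message and processing the selected access request message as follows, until every access request message in the at least one access request message has been processed: obtaining the source IP address of the selected access request message and the URL it carries; searching the webpage access record for the entry corresponding to that URL; if the entry is found, incrementing the total access count of the found entry by 1, and determining whether the source IP address is already saved in the IP address list of the found entry; if it is already saved, ending the processing of the selected access request message; if it is not saved, incrementing the IP address count value of the found entry by 1 and recording the source IP address in the IP address list of the found entry; if the entry is not found, creating in the webpage access record an entry corresponding to that URL, setting the total access count of the created entry to 1, setting the IP address count value of the created entry to 1, and recording the source IP address in the IP address list of the created entry.
- The method according to claim 4, wherein the determining a suspicious URL from the at least one URL according to the webpage access record comprises: selecting an entry from the webpage access record; and if the total access count of the selected entry is less than the first threshold, and the ratio of the IP address count value of the selected entry to the total access count of the selected entry is less than the second threshold, determining that the URL corresponding to the selected entry is a suspicious URL.
- The method according to claim 2 or 4, wherein the obtaining at least one access request message from the first web traffic comprises: selecting at least one access response message from the first web traffic, where the status code carried in each of the at least one access response message indicates that the access succeeded, and the source address of each access response message is the IP address of the protected host; and obtaining, from the first web traffic, the access request message corresponding to each access response message, as the obtained at least one access request message.
- The method according to claim 2 or 4, wherein the searching the webpage access record for the entry corresponding to the URL carried in the selected access request message comprises: performing at least one kind of normalization on the URL carried in the selected access request message to obtain a normalized URL, where the normalization includes one or more of the following (1) to (3): (1) converting the URL carried in the selected access request message into a predetermined encoding format, (2) converting the characters in the URL carried in the selected access request message into a predetermined letter case, and (3) removing the parameters from the URL carried in the selected access request message; and searching the webpage access record for the entry corresponding to the normalized URL; and correspondingly, the creating in the webpage access record an entry corresponding to the URL carried in the access request message is specifically: creating in the webpage access record an entry corresponding to the normalized URL.
- The method according to claim 1, further comprising: determining a normal URL from the at least one URL according to the webpage access record, where the normal URL is a URL among the at least one URL whose total access count is greater than the first threshold, or a suspicious URL for which the webpage backdoor detection result indicates that the identified webpage contains no webpage backdoor; and deleting, from the webpage access record, the saved IP addresses that accessed the normal URL and the total access count of the normal URL.
- The method according to claim 8, further comprising: acquiring second web traffic of the protected host, where the second web traffic is the traffic that occurs when webpages provided by the protected host are accessed in a second time period after the first time period; obtaining a first access request message, a second access request message, and a third access request message from the second web traffic; parsing the first access request message to obtain its source IP address and the URL it carries; if the URL carried in the first access request message is different from the normal URL and is already saved in the webpage access record, incrementing by 1 the total access count of the saved URL, and adding the source IP address of the first access request message to the IP addresses that accessed that URL; parsing the second access request message to obtain its source IP address and the URL it carries; if the URL carried in the second access request message is different from the normal URL and is not saved in the webpage access record, saving that URL in the webpage access record, setting its total access count to 1, and setting the IP address that accessed it to the source IP address of the second access request message; and parsing the third access request message to obtain the URL it carries; if the URL carried in the third access request message is the same as the normal URL, ending the processing of the third access request message.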
- 一种检测网页后门的装置,其特征在于,包括:获取单元,用于获取被保护主机的第一web流量,所述第一web流量是指在第一时间段中所述被保护主机提供的网页被访问时发生的流量;记录生成单元,用于根据所述第一web流量生成所述被保护主机的网页访问记录,所述网页访问记录用于保存至少一个统一资源定位符URL、访问所述至少一个URL中的每个URL的IP地址、以及所述每个URL的被访问总次数,其中所述每个URL标识所述被保护主机提供的一个网页;确定单元,用于根据所述网页访问记录,从所述至少一个URL中确定可疑URL,所述可疑URL的被访问总次数小于第一阈值、且访问所述可疑URL的互不相同的IP地址的数量与所述可疑URL的被访问总次数的比值小于第二阈值;以及确定所述可疑URL标识的网页是否包含网页后门特征库中的后门特征,根据后门特征确定结果检测 所述可疑URL标识的网页是否存在网页后门。
- 根据权利要求10所述的装置,其特征在于,所述网页访问记录包括至少一个表项,所述至少一个表项中的每个表项分别与所述至少一个URL中的一个URL相对应,所述每个表项中保存有被访问总次数和IP地址列表,所述记录生成单元,具体用于从所述第一web流量中获得至少一个访问请求报文,所述访问请求报文的目的IP地址为所述被保护主机的IP地址;从所述至少一个访问请求报文中选择一个访问请求报文,对选择出的访问请求报文进行以下处理,直到处理完所述至少一个访问请求报文中的每个访问请求报文为止:解析选择出的访问请求报文,从而获得所述选择出的访问请求报文的源IP地址和携带的URL;在所述网页访问记录中查找所述选择出的访问请求报文携带的URL对应的表项;如果查找到所述选择出的访问请求报文携带的URL对应的表项,则将查找到的表项的被访问总次数加1,在所述查找到的表项的IP地址列表中记录所述源IP地址;如果未查找到所述选择出的访问请求报文携带的URL对应的表项,则在所述网页访问记录中创建所述选择出的访问请求报文携带的URL对应的表项,将创建的表项的被访问总次数设置为1,在所述创建的表项的所述IP地址列表中记录所述源IP地址。
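- The apparatus according to claim 11, wherein the determining unit is specifically configured to: select an entry from the web page access record; determine the number of mutually distinct IP addresses in the IP address list of the selected entry; and if the total access count of the selected entry is less than the first threshold and the ratio of the determined number of mutually distinct IP addresses to the total access count of the selected entry is less than the second threshold, determine that the URL corresponding to the selected entry is a suspicious URL.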
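- The apparatus according to claim 10, wherein the web page access record comprises at least one entry, each of the at least one entry corresponds to one of the at least one URL, and the entry stores a total access count, an IP address count value, and an IP address list; and the record generation unit is specifically configured to: obtain at least one access request packet from the first web traffic, wherein the destination IP address of the access request packet is the IP address of the protected host; and select one access request packet from the at least one access request packet and perform the following processing on the selected access request packet, until every one of the at least one access request packet has been processed: obtain the source IP address of the selected access request packet and the URL it carries; search the web page access record for the entry corresponding to the URL carried in the selected access request packet; if the entry corresponding to the URL carried in the selected access request packet is found, increment the total access count of the found entry by 1, and determine whether the IP address list of the found entry already stores the source IP address; if the IP address list of the found entry already stores the source IP address, end the processing of the selected access request packet; if the IP address list of the found entry does not store the source IP address, increment the IP address count value of the found entry by 1 and record the source IP address in the IP address list of the found entry; and if no entry corresponding to the URL carried in the selected access request packet is found, create, in the web page access record, the entry corresponding to the URL carried in the access request packet, set the total access count of the created entry to 1, set the IP address count value of the created entry to 1, and record the source IP address in the IP address list of the created entry.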
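- The apparatus according to claim 13, wherein the determining unit is specifically configured to: select an entry from the web page access record; and if the total access count of the selected entry is less than the first threshold and the ratio of the IP address count value of the selected entry to the total access count of the selected entry is less than the second threshold, determine that the URL corresponding to the selected entry is a suspicious URL.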
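- The apparatus according to claim 12 or 14, wherein the record generation unit selects at least one access response packet from the first web traffic, wherein the status code carried in each of the at least one access response packet indicates that the access succeeded, and the source address of each access response packet is the IP address of the protected host; and obtains, from the first web traffic, the access request packet corresponding to each of the web page access response packets, as the obtained at least one access request packet.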
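- The apparatus according to claim 12 or 14, wherein the record generation unit searching the web page access record for the entry corresponding to the URL carried in the selected access request packet comprises: performing at least one type of normalization processing on the URL carried in the selected access request packet to obtain a normalized URL, wherein the normalization processing comprises one or more of the following (1) to (3): (1) converting the URL carried in the selected access request packet into a predetermined encoding format, (2) converting the characters in the URL carried in the selected access request packet into a predetermined letter case, and (3) removing the parameters from the URL carried in the selected access request packet; and searching the web page access record for the entry corresponding to the normalized URL; and the record generation unit creating, in the web page access record, the entry corresponding to the URL carried in the access request packet is specifically: creating, in the web page access record, the entry corresponding to the normalized URL.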
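- The apparatus according to claim 10, wherein the determining unit is further configured to: determine a normal URL from the at least one URL according to the web page access record, wherein the normal URL is a URL, among the at least one URL, whose total access count is greater than the first threshold, or a suspicious URL for which the webshell detection result indicates that the identified web page contains no webshell; and delete, from the web page access record, the stored IP addresses that accessed the normal URL and the total access count of the normal URL.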
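- The apparatus according to claim 17, wherein the obtaining unit is further configured to obtain second web traffic of the protected host, wherein the second web traffic is traffic generated when web pages provided by the protected host are accessed during a second time period after the first time period; and the record generation unit is further configured to: obtain a first access request packet, a second access request packet, and a third access request packet from the second web traffic; parse the first access request packet to obtain the source IP address of the first access request packet and the URL it carries; if the URL carried in the first access request packet differs from the normal URL and the web page access record already stores the URL carried in the first access request packet, increment by 1 the stored total access count of the URL carried in the first access request packet, and add the source IP address of the first access request packet to the IP addresses that accessed the URL carried in the first access request packet; parse the second access request packet to obtain the source IP address of the second access request packet and the URL it carries; if the URL carried in the second access request packet differs from the normal URL and the web page access record does not store the URL carried in the second access request packet, store the URL carried in the second access request packet in the web page access record, set the total access count of the URL carried in the second access request packet to 1, and set the IP address that accessed the URL carried in the second access request packet to the source IP address of the second access request packet; parse the third access request packet to obtain the URL carried in the third access request packet; and if the URL carried in the third access request packet is the same as the normal URL, end the processing of the third access request.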
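- A security device, comprising a memory, a processor, a network interface, and a bus, wherein the memory, the processor, and the network interface are connected to one another through the bus; the network interface is configured to obtain first web traffic of a protected host, wherein the first web traffic is traffic generated when web pages provided by the protected host are accessed during a first time period; and the processor, after reading program code stored in the memory, performs the following operations: generating a web page access record of the protected host according to the first web traffic, wherein the web page access record is used to store at least one uniform resource locator (URL), the IP addresses that accessed each of the at least one URL, and the total access count of each URL, and each URL identifies one web page provided by the protected host; determining a suspicious URL from the at least one URL according to the web page access record, wherein the total access count of the suspicious URL is less than a first threshold, and the ratio of the number of mutually distinct IP addresses that accessed the suspicious URL to the total access count of the suspicious URL is less than a second threshold; and determining whether the web page identified by the suspicious URL contains a backdoor signature from a webshell signature library, and detecting, according to the signature determination result, whether a webshell exists in the web page identified by the suspicious URL.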
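The record-building and suspicious-URL steps recited in the claims above can be illustrated with a short Python sketch. It is not the patented implementation. It assumes the access requests have already been extracted from traffic as `(source_ip, url)` pairs for successful responses, it interprets the "predetermined encoding format" as percent-decoding and the "predetermined letter case" as lowercase, and the names `Entry`, `normalize`, `build_record`, `suspicious_urls` and the threshold values are all hypothetical. The final signature-library check on each suspicious page is omitted.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from urllib.parse import unquote, urlsplit


@dataclass
class Entry:
    """One access-record entry: total access count plus distinct source IPs."""
    total: int = 0
    ips: set = field(default_factory=set)


def normalize(url: str) -> str:
    """Assumed normalization: drop parameters, percent-decode, lowercase."""
    path = urlsplit(url).path      # (3) remove the query string and fragment
    return unquote(path).lower()   # (1) decode percent-escapes, (2) lowercase


def build_record(requests):
    """Build the access record from (source_ip, url) pairs."""
    record = defaultdict(Entry)
    for src_ip, url in requests:
        entry = record[normalize(url)]  # finds the entry, creating it if absent
        entry.total += 1                # total access count
        entry.ips.add(src_ip)           # a set keeps only distinct IP addresses
    return record


def suspicious_urls(record, first_threshold, second_threshold):
    """URLs with few accesses that also come from few distinct IP addresses."""
    return sorted(
        url
        for url, entry in record.items()
        if entry.total < first_threshold
        and len(entry.ips) / entry.total < second_threshold
    )


# Hypothetical traffic: a popular page, a rarely visited page with varied
# visitors, and a page hit repeatedly by a single address.
sample = (
    [(f"198.51.100.{i}", "/index.php") for i in range(50)]
    + [("203.0.113.7", f"/Upload/x.php?cmd={i}") for i in range(10)]
    + [(f"192.0.2.{i}", "/about.html") for i in range(4)]
)
record = build_record(sample)
print(suspicious_urls(record, first_threshold=20, second_threshold=0.5))
# prints ['/upload/x.php']
```

With these illustrative thresholds, `/index.php` is excluded because it has 50 accesses, `/about.html` is excluded because each of its 4 accesses comes from a different address (ratio 1.0), and only the page visited 10 times by one address (ratio 0.1) is flagged for the signature check.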
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP17881349.9A EP3547635B1 (en) | 2016-12-16 | 2017-08-08 | Method and device for detecting webshell |
US16/440,795 US11863587B2 (en) | 2016-12-16 | 2019-06-13 | Webshell detection method and apparatus |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611167905.3 | 2016-12-16 | ||
CN201611167905.3A CN108206802B (zh) | 2016-12-16 | 2016-12-16 | Method and apparatus for detecting a webshell |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/440,795 Continuation US11863587B2 (en) | 2016-12-16 | 2019-06-13 | Webshell detection method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018107784A1 (zh) | 2018-06-21 |
Family
ID=62557860
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/096502 WO2018107784A1 (zh) | 2016-12-16 | 2017-08-08 | Method and apparatus for detecting a webshell |
Country Status (4)
Country | Link |
---|---|
US (1) | US11863587B2 (zh) |
EP (1) | EP3547635B1 (zh) |
CN (1) | CN108206802B (zh) |
WO (1) | WO2018107784A1 (zh) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109040071A (zh) * | 2018-08-06 | 2018-12-18 | 杭州安恒信息技术股份有限公司 | Method for confirming a web backdoor attack event |
CN109684844A (zh) * | 2018-12-27 | 2019-04-26 | 北京神州绿盟信息安全科技股份有限公司 | Webshell detection method and apparatus |
CN109831429A (zh) * | 2019-01-30 | 2019-05-31 | 新华三信息安全技术有限公司 | Webshell detection method and apparatus |
CN109962922A (zh) * | 2019-04-04 | 2019-07-02 | 北京网聘咨询有限公司 | Method and system for handling anti-ATS behavior concerning résumés |
CN116506195A (zh) * | 2023-05-09 | 2023-07-28 | 山东云天安全技术有限公司 | Webshell file detection method, electronic device, and medium |
Families Citing this family (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108959928A (zh) * | 2018-06-29 | 2018-12-07 | 北京奇虎科技有限公司 | Webshell detection method, apparatus, device, and storage medium |
CN109582473A (zh) * | 2018-10-26 | 2019-04-05 | 阿里巴巴集团控股有限公司 | Blockchain-based cross-chain data access method and apparatus |
CN109492692A (zh) * | 2018-11-07 | 2019-03-19 | 北京知道创宇信息技术有限公司 | Webshell detection method, apparatus, electronic device, and storage medium |
CN112218131A (zh) * | 2019-07-09 | 2021-01-12 | 中国移动通信集团吉林有限公司 | Set-top box operating method and apparatus, electronic device, and computer-readable storage medium |
CN110851840B (zh) * | 2019-11-13 | 2022-03-11 | 杭州安恒信息技术股份有限公司 | Web backdoor detection method and apparatus based on website vulnerabilities |
CN112839010B (zh) * | 2019-11-22 | 2023-08-04 | 北京数安鑫云信息技术有限公司 | Method, system, device, and medium for labeling samples |
CN112839014B (zh) * | 2019-11-22 | 2023-09-22 | 北京数安鑫云信息技术有限公司 | Method, system, device, and medium for building a model to identify abnormal visitors |
CN111163095B (zh) * | 2019-12-31 | 2022-08-30 | 奇安信科技集团股份有限公司 | Network attack analysis method, network attack analysis apparatus, computing device, and medium |
US11882151B2 (en) * | 2020-06-01 | 2024-01-23 | Jpmorgan Chase Bank, N.A. | Systems and methods for preventing the fraudulent sending of data from a computer application to a malicious third party |
CN113779571B (zh) * | 2020-06-10 | 2024-04-26 | 天翼云科技有限公司 | WebShell detection apparatus, WebShell detection method, and computer-readable storage medium |
CN111800390A (zh) * | 2020-06-12 | 2020-10-20 | 深信服科技股份有限公司 | Abnormal access detection method, apparatus, gateway device, and storage medium |
CN112118225B (zh) * | 2020-08-13 | 2021-09-03 | 紫光云(南京)数字技术有限公司 | RNN-based Webshell detection method and apparatus |
CN112118089B (zh) * | 2020-09-18 | 2021-04-30 | 广州锦行网络科技有限公司 | Webshell monitoring method and system |
CN114465741B (zh) * | 2020-11-09 | 2023-09-26 | 腾讯科技(深圳)有限公司 | Anomaly detection method, apparatus, computer device, and storage medium |
CN114697049B (zh) * | 2020-12-14 | 2024-04-12 | 中国科学院计算机网络信息中心 | WebShell detection method and apparatus |
CN113014601B (zh) * | 2021-03-26 | 2023-07-14 | 深信服科技股份有限公司 | Communication detection method, apparatus, device, and medium |
CN113239352B (zh) * | 2021-04-06 | 2022-05-17 | 中国科学院信息工程研究所 | Webshell detection method and system |
CN114679306B (zh) * | 2022-03-17 | 2024-03-12 | 新华三信息安全技术有限公司 | Attack detection method and apparatus |
US12019746B1 (en) * | 2022-06-28 | 2024-06-25 | Ut-Battelle, Llc | Adaptive malware binary rewriting |
CN116248413B (zh) * | 2023-05-09 | 2023-07-28 | 山东云天安全技术有限公司 | Traffic detection method, device, and medium for webshell files |
CN118157989B (zh) * | 2024-05-08 | 2024-10-01 | 安徽华云安科技有限公司 | Webshell memory-resident trojan detection method, apparatus, device, and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103701793A (zh) * | 2013-12-20 | 2014-04-02 | 北京奇虎科技有限公司 | Method and apparatus for identifying compromised (zombie) servers |
CN104468477A (zh) * | 2013-09-16 | 2015-03-25 | 杭州迪普科技有限公司 | WebShell detection method and system |
US20150256551A1 (en) * | 2012-10-05 | 2015-09-10 | Myoung Hun Kang | Log analysis system and log analysis method for security system |
CN105187396A (zh) * | 2015-08-11 | 2015-12-23 | 小米科技有限责任公司 | Method and apparatus for identifying web crawlers |
CN105760379A (zh) * | 2014-12-16 | 2016-07-13 | 中国移动通信集团公司 | Method and apparatus for detecting webshell pages based on intra-domain page association relationships |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7496962B2 (en) * | 2004-07-29 | 2009-02-24 | Sourcefire, Inc. | Intrusion detection strategies for hypertext transport protocol |
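CN102609341A (zh) * | 2011-07-08 | 2012-07-25 | 李康 | Automated test system for hardware devices and test method thereof |
CN104618343B (zh) * | 2015-01-06 | 2018-11-09 | 中国科学院信息工程研究所 | Method and system for website threat detection based on real-time logs |
CN105553974A (zh) * | 2015-12-14 | 2016-05-04 | 中国电子信息产业集团有限公司第六研究所 | Method for defending against slow HTTP attacks |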
US10581903B2 (en) * | 2016-06-16 | 2020-03-03 | Level 3 Communications, Llc | Systems and methods for preventing denial of service attacks utilizing a proxy server |
2016
- 2016-12-16 CN CN201611167905.3A patent/CN108206802B/zh active Active
2017
- 2017-08-08 EP EP17881349.9A patent/EP3547635B1/en active Active
- 2017-08-08 WO PCT/CN2017/096502 patent/WO2018107784A1/zh unknown
2019
- 2019-06-13 US US16/440,795 patent/US11863587B2/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150256551A1 (en) * | 2012-10-05 | 2015-09-10 | Myoung Hun Kang | Log analysis system and log analysis method for security system |
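CN104468477A (zh) * | 2013-09-16 | 2015-03-25 | 杭州迪普科技有限公司 | WebShell detection method and system |
CN103701793A (zh) * | 2013-12-20 | 2014-04-02 | 北京奇虎科技有限公司 | Method and apparatus for identifying compromised (zombie) servers |
CN105760379A (zh) * | 2014-12-16 | 2016-07-13 | 中国移动通信集团公司 | Method and apparatus for detecting webshell pages based on intra-domain page association relationships |
CN105187396A (zh) * | 2015-08-11 | 2015-12-23 | 小米科技有限责任公司 | Method and apparatus for identifying web crawlers |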
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109040071A (zh) * | 2018-08-06 | 2018-12-18 | 杭州安恒信息技术股份有限公司 | Method for confirming a web backdoor attack event |
CN109040071B (zh) * | 2018-08-06 | 2021-02-09 | 杭州安恒信息技术股份有限公司 | Method for confirming a web backdoor attack event |
CN109684844A (zh) * | 2018-12-27 | 2019-04-26 | 北京神州绿盟信息安全科技股份有限公司 | Webshell detection method and apparatus |
CN109684844B (zh) * | 2018-12-27 | 2020-11-20 | 北京神州绿盟信息安全科技股份有限公司 | Webshell detection method and apparatus, computing device, and computer-readable storage medium |
CN109831429A (zh) * | 2019-01-30 | 2019-05-31 | 新华三信息安全技术有限公司 | Webshell detection method and apparatus |
CN109962922A (zh) * | 2019-04-04 | 2019-07-02 | 北京网聘咨询有限公司 | Method and system for handling anti-ATS behavior concerning résumés |
CN109962922B (zh) * | 2019-04-04 | 2021-08-06 | 北京网聘咨询有限公司 | Method and system for handling anti-ATS behavior concerning résumés |
CN116506195A (zh) * | 2023-05-09 | 2023-07-28 | 山东云天安全技术有限公司 | Webshell file detection method, electronic device, and medium |
CN116506195B (zh) * | 2023-05-09 | 2023-10-27 | 山东云天安全技术有限公司 | Webshell file detection method, electronic device, and medium |
Also Published As
Publication number | Publication date |
---|---|
US11863587B2 (en) | 2024-01-02 |
CN108206802B (zh) | 2020-11-17 |
EP3547635A4 (en) | 2019-12-11 |
US20190334948A1 (en) | 2019-10-31 |
CN108206802A (zh) | 2018-06-26 |
EP3547635A1 (en) | 2019-10-02 |
EP3547635B1 (en) | 2021-03-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018107784A1 (zh) | | Method and apparatus for detecting a webshell |
US9003518B2 (en) | | Systems and methods for detecting covert DNS tunnels |
US9736260B2 (en) | | Redirecting from a cloud service to a third party website to save costs without sacrificing security |
US10225255B1 (en) | | Count-based challenge-response credential pairs for client/server request validation |
WO2018121331A1 (zh) | | Method, apparatus, and server for determining an attack request |
US7849502B1 (en) | | Apparatus for monitoring network traffic |
US7590716B2 (en) | | System, method and apparatus for use in monitoring or controlling internet access |
US9258289B2 (en) | | Authentication of IP source addresses |
BR102020003104A2 (pt) | | Method for HTTP-based access point identification and classification using machine learning |
EP3297248A1 (en) | | System and method for generating rules for attack detection feedback system |
JP2004304752A (ja) | | Attack defense system and attack defense method |
RU2653241C1 (ru) | | Zero-day threat detection using host application/program to user agent mapping |
CN107347076B (zh) | | Method and apparatus for detecting SSRF vulnerabilities |
WO2014032619A1 (zh) | | URL access method and system |
JP5832951B2 (ja) | | Attack determination apparatus, attack determination method, and attack determination program |
CN111314301A (zh) | | Website access control method and apparatus based on DNS resolution |
JP2007325293A (ja) | | Attack detection system and attack detection method |
Yen | | Detecting stealthy malware using behavioral features in network traffic |
CN113329035B (zh) | | Method, apparatus, electronic device, and storage medium for detecting attack domain names |
CN108604273B (zh) | | Preventing malware downloads |
WO2015024435A1 (zh) | | Method and apparatus for processing system files |
CN111385248B (zh) | | Attack defense method and attack defense device |
JP5456636B2 (ja) | | File collection monitoring method, file collection monitoring apparatus, and file collection monitoring program |
US20230056625A1 (en) | | Computing device and method of detecting compromised network devices |
CN116800777A (zh) | | Packet processing method, apparatus, electronic device, and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17881349; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2017881349; Country of ref document: EP; Effective date: 20190627 |