CN113014549B - HTTP-based malicious traffic classification method and related equipment - Google Patents

Publication number
CN113014549B
Authority
CN
China
Prior art keywords
traffic
attack
malicious traffic
label
malicious
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110139581.7A
Other languages
Chinese (zh)
Other versions
CN113014549A (en)
Inventor
赵春辉
涂腾飞
张华�
秦素娟
李文敏
高飞
温巧燕
王华伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN202110139581.7A priority Critical patent/CN113014549B/en
Publication of CN113014549A publication Critical patent/CN113014549A/en
Application granted granted Critical
Publication of CN113014549B publication Critical patent/CN113014549B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416Event detection, e.g. attack signature detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2457Query processing with adaptation to user needs
    • G06F16/24573Query processing with adaptation to user needs using data annotations, e.g. user-defined metadata
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/285Clustering or classification
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/02Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227Filtering policies
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425Traffic logging, e.g. anomaly detection
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433Vulnerability analysis
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441Countermeasures against malicious traffic

Abstract

One or more embodiments of the present disclosure provide an HTTP-based method and related device for classifying malicious traffic. The malicious traffic is first filtered against a constructed tag filter library; the remaining traffic is then matched against the tags in a tag knowledge base to determine its attack behavior pattern; finally, the classified traffic is judged manually, so that falsely successful malicious traffic is filtered out, truly successful malicious traffic is stored, and unknown attack traffic is classified as a new attack event. The malicious traffic classification method provided by the invention can effectively identify malicious traffic in the network, accurately detect the category of the malicious traffic, greatly reduce the workload of manual analysis, and effectively identify novel malicious traffic and store it in a malicious traffic rule knowledge base, which supports subsequent analysis of network attacks and helps ensure network security.

Description

HTTP-based malicious traffic classification method and related equipment
Technical Field
One or more embodiments of the present disclosure relate to the field of network security technologies, and in particular, to a malicious traffic classification method based on HTTP (hypertext transfer protocol) and a related device.
Background
With the continuous development of Internet technology, people's life and work depend more and more on Internet applications. However, because of a lack of security awareness and the continuous development of sophisticated and diversified attack techniques, many network applications suffer from network attacks and security threats, and many network security vulnerabilities are exposed. Abnormal traffic detection, as the first step of attack defense, provides an effective basis for intercepting attacks, so accurate detection of abnormal traffic is key to ensuring the availability and security of network applications. At present, the main method for detecting malicious traffic is rule matching, which requires security practitioners to analyze samples one by one; this consumes considerable manpower and makes variant malicious traffic difficult to detect.
Disclosure of Invention
In view of this, an object of one or more embodiments of the present disclosure is to provide an HTTP-based method and related device for classifying malicious traffic.
In view of the above, one or more embodiments of the present specification provide a method for classifying malicious traffic based on a hypertext transfer protocol HTTP, including:
in response to receiving HTTP-based malicious traffic, extracting traffic features of the malicious traffic;
filtering and extracting the malicious traffic by matching the traffic characteristics with filtering tags in a malicious traffic tag filtering library;
attaching corresponding knowledge tags in a malicious traffic tag knowledge base to the malicious traffic based on the traffic features of the malicious traffic subjected to the filtering and extraction processing, the malicious traffic tag knowledge base and a predetermined feature-tag association rule;
classifying the malicious traffic by an attack behavior pattern based on the attached knowledge tag,
wherein the malicious traffic label filtering library and the malicious traffic label knowledge library are constructed in advance based on an HTTP response mechanism and a security protection mode.
Further, the traffic features include at least one of: request mode, uniform resource locator (URL), Internet Protocol (IP) address, request parameters, status code, user agent (UA), and response text.
Further, the filter label includes at least one of: web application firewall (WAF) label, UA common label, UA special label, and malicious IP label.
Further, the filtering and extracting process includes one of:
filtering out the malicious traffic in response to determining that any of the traffic features matches a WAF label in the filter labels;
filtering out the malicious traffic in response to determining that the UA in the traffic features matches a UA common label in the filter labels;
extracting the malicious traffic in response to determining that the UA in the traffic features matches a UA special label in the filter labels;
extracting the malicious traffic in response to determining that the IP address in the traffic features matches a malicious IP label in the filter labels.
Further, the knowledge tag includes at least one of: conventional attack tags, Web fingerprint tags, vulnerability attack tags, sensitive file tags, and Webshell tags.
Further, the feature-tag association rule includes:
determining whether the malicious traffic is associated with a conventional attack tag in the knowledge tags according to a request mode, a URL (uniform resource locator), a request parameter and a status code in the traffic characteristics;
determining whether the malicious traffic is associated with a Web fingerprint tag in the knowledge tag according to an IP address in the traffic characteristics;
determining whether the malicious traffic is associated with a vulnerability attack tag in the knowledge tags according to the URL in the traffic characteristics;
determining whether the malicious traffic is associated with a sensitive file tag in the knowledge tag according to the URL, the request parameter and the response text in the traffic characteristic;
and determining whether the malicious traffic is associated with the Webshell tag in the knowledge tag according to the request mode, the URL and the request parameter in the traffic characteristics.
Further, the attack behavior pattern includes at least one of: conventional attack success, conventional attack attempt, targeted vulnerability attack, vulnerability scanning attack, vulnerability-exploiting unknown attack, unknown attack, sensitive information leakage, and Webshell successful access.
Further, classifying the malicious traffic according to an attack behavior pattern includes at least one of:
classifying the malicious traffic into conventional attack successful traffic, conventional attack attempt traffic or conventional attack unmatched traffic by matching the traffic characteristics of the malicious traffic with the conventional attack tags in the knowledge tags;
classifying the conventional attack unmatched traffic into targeted vulnerability attack traffic, vulnerability scanning attack traffic, vulnerability-exploiting unknown attack traffic, unknown attack traffic or vulnerability attack unmatched traffic by matching the traffic characteristics of the conventional attack unmatched traffic with the Web fingerprint tags and the vulnerability attack tags in the knowledge tags;
classifying the vulnerability attack unmatched traffic into sensitive information leakage traffic or sensitive file tag unmatched traffic by matching the traffic characteristics of the vulnerability attack unmatched traffic with the sensitive file tags in the knowledge tags;
and classifying the sensitive file tag unmatched traffic into Webshell successful access traffic or Webshell tag unmatched traffic by matching the traffic characteristics of the sensitive file tag unmatched traffic with the Webshell tags in the knowledge tags.
Based on the same inventive concept, one or more embodiments of the present specification provide a malicious traffic classification apparatus based on an HTTP protocol, including:
a feature extraction module configured to extract traffic features of malicious traffic based on HTTP in response to receiving the malicious traffic;
a filtering and extracting processing module configured to filter and extract the malicious traffic by matching the traffic features with filtering tags in a malicious traffic tag filtering library;
a tag attaching module configured to attach a corresponding knowledge tag in a malicious traffic tag knowledge base to the malicious traffic based on the traffic features of the malicious traffic subjected to the filtering and extraction processing, the malicious traffic tag knowledge base, and a predetermined feature-tag association rule;
a classification module configured to classify the malicious traffic by an attack behavior pattern based on the attached knowledge tag,
wherein the malicious traffic label filtering library and the malicious traffic label knowledge library are constructed in advance based on an HTTP response mechanism and a security protection mode.
Based on the same inventive concept, one or more embodiments of the present specification provide an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable by the processor, wherein the processor implements the method according to any one of the above when executing the computer program.
As can be seen from the above, in the HTTP-based malicious traffic classification method and related device provided in one or more embodiments of the present disclosure, the malicious traffic is first filtered against a constructed tag filter library; the remaining traffic is then matched against the tags in the tag knowledge base to determine its attack behavior pattern; finally, the classified traffic is judged manually, so that falsely successful malicious traffic is filtered out, truly successful malicious traffic is stored, and unknown attack traffic is classified as a new attack event. The malicious traffic classification method provided by the invention can effectively identify malicious traffic in the network, accurately detect the category of the malicious traffic, greatly reduce the workload of manual analysis, and effectively identify novel malicious traffic and store it in a malicious traffic rule knowledge base, which supports subsequent analysis of network attacks and helps ensure network security.
Drawings
In order to more clearly illustrate one or more embodiments or prior art solutions of the present specification, the drawings that are needed in the description of the embodiments or prior art will be briefly described below, and it is obvious that the drawings in the following description are only one or more embodiments of the present specification, and that other drawings may be obtained by those skilled in the art without inventive effort from these drawings.
FIG. 1 is a flowchart illustrating a method for classifying malicious traffic based on HTTP according to one or more embodiments of the present disclosure;
FIG. 2 is a schematic diagram of the association between traffic characteristics of malicious traffic and knowledge base tags according to one or more embodiments of the present disclosure;
FIG. 3 is a schematic diagram of conventional attack pattern matching of malicious traffic according to one or more embodiments of the present disclosure;
FIG. 4 is a schematic diagram of attack pattern matching for malicious traffic unmatched by conventional attack tags according to one or more embodiments of the present disclosure;
FIG. 5 is a schematic diagram of attack pattern matching for malicious traffic unmatched by vulnerability attack tags according to one or more embodiments of the present disclosure;
FIG. 6 is a schematic diagram of attack pattern matching for malicious traffic unmatched by sensitive file tags according to one or more embodiments of the present disclosure;
FIG. 7 is a block diagram illustrating the structure of an HTTP-based malicious traffic classification apparatus according to one or more embodiments of the present disclosure;
FIG. 8 is a hardware configuration diagram of an electronic device according to one or more embodiments of the present disclosure.
Detailed Description
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
It is to be noted that unless otherwise defined, technical or scientific terms used in one or more embodiments of the present specification should have the ordinary meaning as understood by those of ordinary skill in the art to which this disclosure belongs. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
As described in the background section, HTTP (Hypertext Transfer Protocol) is a protocol for transferring hypertext from a web server to a local browser and is widely used. With the rapid development of the Internet, people are increasingly dependent on it. However, the open nature of the Internet allows any device or software that meets its technical standards to connect without restriction, resulting in an endless stream of Internet security incidents and an increasingly severe network security situation. When facing various network security threats, the efficiency of discovering and judging malicious traffic is crucial.
In carrying out the present disclosure, the applicants found that an actual network environment contains a large number of network attack behaviors of unknown types. In the prior art, only a limited amount of sample data of known attack types can be used for labeling, and practitioners must judge the samples one by one during labeling, which consumes a large amount of manpower and leads to low detection efficiency and a high probability of false detection.
Hereinafter, the technical means of the present disclosure will be described in further detail with reference to specific examples.
This embodiment provides a malicious traffic classification method based on the Hypertext Transfer Protocol (HTTP), which, with reference to FIG. 1, includes the following steps:
Step S101: in response to receiving HTTP-based malicious traffic, extract the traffic features of the malicious traffic.
As an optional embodiment, the traffic features include the request mode, uniform resource locator (URL), Internet Protocol (IP) address, request parameters, status code, User-Agent (UA), and response text, and the features of each piece of malicious traffic are stored in JSON (JavaScript Object Notation) text format.
As an optional embodiment, the traffic features are extracted as follows:
for the request mode, the content of the Request Method field is extracted, in the format "Request Method: request mode"; the invention covers only the GET and POST request modes;
for the URL, the content of the Remote Address field is extracted, in the format "Remote Address: URL";
for the IP address, the content of the Host field is extracted, in the format "Host: URL";
for the request parameters, if the HTTP request mode is GET, the request parameters are the part of the URL after "?"; if the HTTP request mode is POST, the request parameters are the POST parameters carried in the HTTP request; if the request parameters contain the "&" character, there are multiple parameter names and values, and splitting on "&" yields the individual parameters, each in the format "request parameter name=request parameter value";
for the status code, the Status Code field is extracted, in the format "Status Code: status code", with values mainly including "200", "404", etc.;
for the User-Agent, the User-Agent field is extracted, in the format "User-Agent: User-Agent";
for the response text, all information in the response message other than the status code and the response headers is extracted.
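The extraction rules above can be illustrated with a minimal Python sketch. It assumes each captured request/response pair is already available as a dictionary; the key names (`method`, `url`, `ip`, `body`, `status_code`, `user_agent`, `response_body`) and the function names are hypothetical and are not taken from the patent.

```python
import json
from urllib.parse import urlsplit


def split_params(raw: str) -> dict:
    """Split 'a=1&b=2' on '&' into a {name: value} mapping."""
    params = {}
    for pair in raw.split("&"):
        if not pair:
            continue
        name, _, value = pair.partition("=")
        params[name] = value
    return params


def extract_features(record: dict) -> dict:
    """Build the per-traffic feature record (hypothetical input keys)."""
    method = record["method"].upper()
    if method not in ("GET", "POST"):
        # The description covers only GET and POST request modes.
        raise ValueError(f"unsupported request mode: {method}")
    url = record["url"]
    # GET: parameters follow '?' in the URL; POST: parameters are in the body.
    raw = urlsplit(url).query if method == "GET" else record.get("body", "")
    return {
        "request_mode": method,
        "url": url,
        "ip": record["ip"],
        "params": split_params(raw),
        "status_code": record["status_code"],
        "user_agent": record.get("user_agent", ""),
        "response_body": record.get("response_body", ""),
    }


features = extract_features({
    "method": "get",
    "url": "http://example.test/item.php?id=1&name=a",
    "ip": "192.0.2.10",
    "status_code": "200",
})
# One JSON text record per piece of traffic.
features_json = json.dumps(features)
```

Splitting with `partition("=")` keeps parameters that have no value (for example a bare `debug` flag) instead of failing on them, which seems a reasonable choice for traffic that may be deliberately malformed.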
Step S102: filter and extract the malicious traffic by matching the traffic features with the filtering labels in the malicious traffic label filtering library.
In this step, the filtering labels in the malicious traffic label filtering library are matched against the extracted traffic features, and the malicious traffic is divided, according to the matched labels, into invalid attack traffic, specially marked traffic, and unmarked traffic. Invalid attack traffic is filtered out directly, specially marked traffic is extracted separately for manual study and judgment, and unmarked traffic proceeds directly to the next step.
As an alternative embodiment, the filtering labels include the following: WAF (Web Application Firewall) labels, UA common labels, UA special labels, and malicious IP labels. The WAF labels mainly come from open-source tools such as sqlmap and wafw00f; the UA common labels mainly come from scanning tools such as sqlmap; and the malicious IP labels mainly come from open-source malicious IP address libraries. The UA special labels come from the UA characteristics of open-source scanner tools (sqlmap, hydra, WPScan, etc.) and self-made scanners (python-request, ruby, java, etc.). Traffic marked with a UA special label indicates that the attacker already has some prior knowledge; such attacks are strongly targeted and have a high success rate, so they require manual judgment. Malicious IPs come from open-source threat intelligence and represent currently active hosts that continue to conduct attacks. Tracking traffic that carries these labels makes it possible to obtain the latest vulnerability information and to discover unknown attack modes more easily.
The invalid attack traffic is mainly malicious traffic filtered out by WAF labels and UA common labels; the specially marked traffic is mainly malicious traffic extracted by UA special labels and malicious IP labels.
Specifically, as shown in Table 1, if any feature of the malicious traffic matches a WAF label, the malicious traffic is filtered out directly; if the UA feature of the malicious traffic matches a UA common label, the malicious traffic is filtered out; if the UA feature of the malicious traffic matches a UA special label, the malicious traffic is extracted separately and analyzed; and if the IP address feature of the malicious traffic matches a malicious IP label, the malicious traffic is extracted separately and analyzed.
TABLE 1 malicious traffic filtering tag matching and corresponding processing method
[Table 1: image BDA0002928060300000071 in the original publication]
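The filtering and extraction step can be sketched as a small decision function. This is only an illustration under stated assumptions: the library entries below are placeholder values, the substring matching is a simplification, and the evaluation order (WAF, UA common, UA special, malicious IP) follows the order in which the description lists the labels rather than any order the patent prescribes.

```python
# Hypothetical filter library; every entry is an illustrative placeholder.
FILTER_LIBRARY = {
    "waf": ["request blocked by waf"],
    "ua_common": ["sqlmap"],
    "ua_special": ["wpscan", "hydra"],
    "malicious_ip": {"203.0.113.7"},
}


def _contains(text: str, needles) -> bool:
    """Case-insensitive substring match against lowercase needles."""
    text = text.lower()
    return any(needle in text for needle in needles)


def filter_stage(features: dict, library: dict = FILTER_LIBRARY) -> str:
    """Return 'filter' (invalid attack traffic), 'extract' (specially
    marked traffic for manual judgment) or 'pass' (unmarked traffic)."""
    # WAF labels may match any traffic feature.
    if any(_contains(str(v), library["waf"]) for v in features.values()):
        return "filter"
    ua = features.get("user_agent", "")
    if _contains(ua, library["ua_common"]):
        return "filter"
    if _contains(ua, library["ua_special"]):
        return "extract"
    if features.get("ip") in library["malicious_ip"]:
        return "extract"
    return "pass"


decision = filter_stage({"user_agent": "WPScan v3", "ip": "192.0.2.10"})
```

Traffic for which `filter_stage` returns `"pass"` is what would go on to knowledge tag attachment in step S103.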
Step S103, attaching corresponding knowledge tags in the malicious traffic tag knowledge base to the malicious traffic based on the traffic features of the malicious traffic subjected to the filtering and extraction processing, the malicious traffic tag knowledge base and a predetermined feature-tag association rule.
As an alternative embodiment, the knowledge tag comprises at least one of: conventional attack tags, Web fingerprint tags, vulnerability attack tags, sensitive file tags, and Webshell tags. The conventional attack tags are mainly written in YARA syntax for malicious attack requests and responses, and cover at least SQL injection, XSS attacks, and local file inclusion. The Web fingerprint tags help identify fingerprint information of the victim target, including at least the programming language, operating system, Web server, Web framework, and database. The vulnerability attack tags are vulnerability attack rules aimed at a Web framework or platform, and include at least vulnerability rules for content management systems (CMS), Web middleware, servers, and the like. The sensitive file tags mainly target key files under Windows and Linux, and include at least configuration files, password files, and file directories. The Webshell tags help identify backdoors implanted by attackers, and mainly cover PHP platform Webshells, Java platform Webshells, and .NET platform Webshells.
As an alternative embodiment, referring to fig. 2, the predetermined feature-tag association rule includes:
determining whether the malicious traffic is associated with the conventional attack tag according to a request mode, a URL, a request parameter and a status code in the traffic characteristics;
determining whether the malicious traffic is associated with the Web fingerprint tag according to an IP address in the traffic characteristics;
determining whether the malicious traffic is associated with the vulnerability attack tag according to a URL in the traffic characteristics;
determining whether the malicious traffic is associated with a sensitive file label according to the URL, the request parameter and the response text in the traffic characteristic;
and determining whether the malicious traffic is associated with a Webshell label according to the request mode, the URL and the request parameters in the traffic characteristics.
An example of a specific predefined association rule format is as follows:
    import "httprule"
    rule frame_xxx_yy  // malicious traffic tag name
    {
      meta:
        tag = "framework_xxx_yyy"
        description = "description information"
        filetype = "unknown"
      condition:  // feature rules associated with the malicious traffic tag
        request method and httpresult and
        request URL (/feature of the traffic URL/) and
        …
    }
As shown in the above example, the malicious traffic tag knowledge base defines a corresponding field for each feature, each field value is a specific description of that feature, and the feature details of a piece of traffic are expressed as a combination of several feature fields and their descriptions.
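As a rough illustration of how the feature-tag association rules could be applied, the sketch below records which traffic features each kind of knowledge tag may consult and attaches a tag when every condition of a rule matches. The regular-expression conditions, the dictionary layout, and the sample rule are assumptions made for this sketch; the patent expresses its rules in a YARA-style syntax that is not reproduced here.

```python
import re

# Which traffic features each kind of knowledge tag is determined from.
TAG_FEATURES = {
    "conventional_attack": ("request_mode", "url", "params", "status_code"),
    "web_fingerprint": ("ip",),
    "vulnerability_attack": ("url",),
    "sensitive_file": ("url", "params", "response_body"),
    "webshell": ("request_mode", "url", "params"),
}

# Hypothetical rule: the tag name is the placeholder from the example
# above, and the URL pattern is invented for illustration.
RULES = [
    {
        "tag": "framework_xxx_yyy",
        "kind": "vulnerability_attack",
        "condition": {"url": r"/admin/"},
    },
]


def attach_tags(features: dict, rules=RULES) -> list:
    """Return (kind, tag) pairs for every rule whose conditions all match."""
    attached = []
    for rule in rules:
        allowed = TAG_FEATURES[rule["kind"]]
        for name in rule["condition"]:
            if name not in allowed:
                raise ValueError(f"{rule['kind']} rules may not use {name}")
        if all(
            re.search(pattern, str(features.get(name, "")))
            for name, pattern in rule["condition"].items()
        ):
            attached.append((rule["kind"], rule["tag"]))
    return attached


tags = attach_tags({"request_mode": "GET", "url": "/admin/login.php?id=1"})
```

Rejecting a rule that uses a feature outside its allowed set keeps the rule base consistent with the association scheme of FIG. 2.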
Step S104: classify the malicious traffic by attack behavior pattern based on the attached knowledge tags, wherein the malicious traffic tag filtering library and the malicious traffic tag knowledge library are constructed in advance based on an HTTP response mechanism and a security protection mode.
As an alternative embodiment, the attack behavior pattern includes at least one of the following: conventional attack success, conventional attack attempt, targeted vulnerability attack, vulnerability scanning attack, vulnerability-exploiting unknown attack, unknown attack, sensitive information leakage, and Webshell successful access. The rules for determining attack behavior patterns from knowledge tags are detailed in Table 2, where:
the conventional attack success pattern represents a conventional attack with a request (q) and a successful response (p1);
the conventional attack attempt pattern represents a conventional attack with a request (q) and without a successful response (p2);
the targeted vulnerability attack pattern represents that a Web fingerprint tag is matched (f1) and a vulnerability attack tag is matched (Vf1);
the vulnerability scanning attack pattern represents that a Web fingerprint tag is matched (f1) and no vulnerability attack tag is matched (Vf2);
the vulnerability-exploiting unknown attack pattern represents that no Web fingerprint tag is matched (f2) and a vulnerability attack tag is matched (Vf1);
the unknown attack pattern represents that no Web fingerprint tag is matched (f2) and no vulnerability attack tag is matched (Vf2);
the sensitive information leakage pattern represents that a sensitive file tag is matched;
the Webshell successful access pattern represents that a Webshell tag is matched.
Table 2 attack behavior Pattern determination rules
[Table 2: images BDA0002928060300000091 and BDA0002928060300000101 in the original publication]
Specifically, an example of the knowledge tag matching and attack pattern determination format is as follows:
[Example: images BDA0002928060300000102 and BDA0002928060300000111 in the original publication]
In the above example, the vulnerability attack tag VulTags indicates that an attacker launched a "ThinkPHP SQL injection vulnerability attack" against the xx website, and the fingerprint tag shows that the framework used by the attacked website is ThinkPHP, which is consistent with the vulnerability attack. The attack pattern of this traffic is therefore a targeted vulnerability attack; because targeted attacks have a high probability of success, the traffic is passed on for manual study and judgment.
As an alternative embodiment, referring to fig. 3, traffic characteristics of malicious traffic are matched with a conventional attack tag, and the malicious traffic is divided into conventional attack successful traffic, conventional attack attempt traffic, and conventional attack unmatched traffic.
Referring to fig. 4, the traffic characteristics of the conventional attack unmatched traffic are matched with the Web fingerprint tag and the vulnerability attack tag, and the conventional attack unmatched traffic is divided into targeted vulnerability attack traffic, vulnerability scanning attack traffic, vulnerability-exploiting unknown attack traffic, and vulnerability attack unmatched traffic.
Referring to fig. 5, the traffic characteristics of the unmatched traffic of the vulnerability attack are matched with the sensitive file tags, and the unmatched traffic of the vulnerability attack is divided into sensitive information leakage traffic and sensitive file tag unmatched traffic.
Referring to fig. 6, the traffic characteristics of the unmatched traffic of the sensitive file tag are matched with the WebShell tag, and after the matching is successful, a corresponding attack behavior pattern is automatically generated, so that the unmatched traffic of the sensitive file tag is divided into the WebShell successful access traffic and the WebShell tag unmatched traffic.
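The four matching stages of FIGS. 3 to 6 form a cascade in which each stage only sees traffic left unmatched by the previous one. The sketch below shows that ordering under two assumptions that the text does not state explicitly: traffic matching neither a Web fingerprint tag nor a vulnerability attack tag falls through to the sensitive file stage, and traffic that reaches the end of the cascade without any match is reported as an unknown attack. The tag-kind strings and the function name are hypothetical.

```python
def classify(tag_kinds: set, response_success: bool = False) -> str:
    """Map the kinds of knowledge tags attached to one piece of traffic
    to an attack behavior pattern, stage by stage."""
    # Stage 1 (FIG. 3): conventional attack tags.
    if "conventional_attack" in tag_kinds:
        if response_success:
            return "conventional attack success"
        return "conventional attack attempt"
    # Stage 2 (FIG. 4): Web fingerprint and vulnerability attack tags.
    fingerprint = "web_fingerprint" in tag_kinds
    vulnerability = "vulnerability_attack" in tag_kinds
    if fingerprint and vulnerability:
        return "targeted vulnerability attack"
    if fingerprint:
        return "vulnerability scanning attack"
    if vulnerability:
        return "vulnerability-exploiting unknown attack"
    # Stage 3 (FIG. 5): sensitive file tags.
    if "sensitive_file" in tag_kinds:
        return "sensitive information leakage"
    # Stage 4 (FIG. 6): Webshell tags.
    if "webshell" in tag_kinds:
        return "Webshell successful access"
    # Assumption: nothing matched anywhere in the cascade.
    return "unknown attack"


pattern = classify({"web_fingerprint", "vulnerability_attack"})
```

Because the stages are checked in order, traffic carrying both a conventional attack tag and a Webshell tag is reported under the conventional attack pattern, mirroring the stage order of the figures.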
The UA special label traffic and malicious IP label traffic produced in step S102 and the traffic classified in step S104 are then judged manually: the malicious traffic under each attack behavior pattern is analyzed, falsely successful malicious traffic is filtered out, and the truly successful malicious traffic that remains is classified as true success events; the key characteristics of unknown attack traffic are analyzed manually, and that traffic is classified as novel attack events. The manual analysis proceeds as follows: the attack behavior patterns are ranked by degree of danger, malicious traffic in high-danger patterns is analyzed in depth, and malicious traffic in low-danger patterns is analyzed briefly. In this embodiment, the high-danger behavior patterns are conventional attack success, targeted vulnerability attack, vulnerability-exploiting unknown attack, unknown attack, sensitive information leakage, and Webshell successful access. The low-danger behavior patterns are conventional attack attempt, vulnerability scanning attack, and invalid attack.
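The danger-based ordering for manual analysis can be sketched as a simple queueing helper. The two pattern sets follow the lists given in this embodiment; the function name, the tuple layout, and the "in-depth"/"brief" labels are hypothetical.

```python
HIGH_DANGER = {
    "conventional attack success",
    "targeted vulnerability attack",
    "vulnerability-exploiting unknown attack",
    "unknown attack",
    "sensitive information leakage",
    "Webshell successful access",
}
LOW_DANGER = {
    "conventional attack attempt",
    "vulnerability scanning attack",
    "invalid attack",
}


def triage(classified):
    """Order (traffic_id, pattern) pairs so high-danger patterns come
    first, and attach the depth of manual analysis each one receives."""
    def depth(pattern):
        if pattern in HIGH_DANGER:
            return "in-depth"
        if pattern in LOW_DANGER:
            return "brief"
        raise ValueError(f"unknown attack behavior pattern: {pattern}")

    # sorted() is stable, so arrival order is kept within each group.
    ordered = sorted(classified, key=lambda item: item[1] not in HIGH_DANGER)
    return [(tid, pattern, depth(pattern)) for tid, pattern in ordered]


queue = triage([
    ("t1", "conventional attack attempt"),
    ("t2", "unknown attack"),
    ("t3", "vulnerability scanning attack"),
])
```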
For a true success event, the features of the event are stored in a malicious traffic threat intelligence library. As shown in Table 3, true success events include at least: conventional attack success events such as SQL injection, XSS attack, and local file inclusion; exploit success events; directory traversal, information leakage, and similar events; and Webshell successful attack events. For a novel attack event, its key features are summarized and stored in a malicious traffic rule knowledge base, which supports subsequent malicious traffic detection.
TABLE 3 Events resulting from the various attack behavior patterns (table provided as an image in the original publication; not reproduced here)
Through steps S101 to S104, malicious network attack traffic is filtered and classified, successful malicious traffic is stored, and unknown attack traffic is classified as novel attack traffic.
It is to be appreciated that the method can be performed by any apparatus, device, platform, or cluster of devices having computing and processing capabilities.
It should be noted that the method of one or more embodiments of the present disclosure may be performed by a single device, such as a computer or server. The method of the embodiment can also be applied to a distributed scene and completed by the mutual cooperation of a plurality of devices. In such a distributed scenario, one of the devices may perform only one or more steps of the method of one or more embodiments of the present disclosure, and the devices may interact with each other to complete the method.
It should be noted that the above description describes certain embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Based on the same inventive concept, referring to fig. 7, an embodiment of the present disclosure provides a malicious traffic classification apparatus based on an HTTP protocol, including:
a feature extraction module 701 configured to extract traffic features of malicious traffic based on HTTP in response to receiving the malicious traffic;
a filtering and extracting processing module 702 configured to filter and extract the malicious traffic by matching the traffic characteristics with filtering tags in a malicious traffic tag filtering library;
a tag attaching module 703 configured to attach a corresponding knowledge tag in a malicious traffic tag knowledge base to the malicious traffic based on the traffic features of the malicious traffic subjected to the filtering and extraction processing, the malicious traffic tag knowledge base, and a predetermined feature-tag association rule;
a classification module 704 configured to classify the malicious traffic by an attack behavior pattern based on the attached knowledge tag,
wherein the malicious traffic label filtering library and the malicious traffic label knowledge library are constructed in advance based on an HTTP response mechanism and a security protection mode.
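As a rough illustration of how the four modules could be chained, the sketch below wires them together as plain callables. The `Flow` and `Pipeline` classes, the `"drop"`/`"extract"`/`"pass"` decision strings, and the toy rules in the usage example are all assumptions made for illustration. In particular, routing extracted traffic straight to manual review without tagging is one reading of the embodiment, not something the text states outright.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Set


@dataclass
class Flow:
    raw: str                                   # raw HTTP exchange (hypothetical representation)
    features: dict = field(default_factory=dict)
    tags: Set[str] = field(default_factory=set)
    pattern: Optional[str] = None


@dataclass
class Pipeline:
    extract_features: Callable[[str], dict]        # role of module 701
    filter_or_extract: Callable[[dict], str]       # role of module 702
    attach_tags: Callable[[dict], Set[str]]        # role of module 703
    classify: Callable[[dict, Set[str]], str]      # role of module 704

    def run(self, raw: str) -> Optional[Flow]:
        flow = Flow(raw=raw, features=self.extract_features(raw))
        decision = self.filter_or_extract(flow.features)
        if decision == "drop":
            return None                            # filtered out, nothing to classify
        if decision == "extract":
            flow.pattern = "extracted for manual review"
            return flow
        flow.tags = self.attach_tags(flow.features)
        flow.pattern = self.classify(flow.features, flow.tags)
        return flow


def toy_extract(raw: str) -> dict:
    # Toy parser: expects "METHOD URL STATUS" separated by spaces.
    method, url, status = raw.split()
    return {"method": method, "url": url, "status": int(status)}


# Toy rules below are placeholders, not the rules of the tag libraries.
pipeline = Pipeline(
    extract_features=toy_extract,
    filter_or_extract=lambda f: "drop" if f["status"] == 403 else "pass",
    attach_tags=lambda f: {"conventional attack"} if "id=" in f["url"] else set(),
    classify=lambda f, tags: (
        "conventional attack success" if tags and f["status"] == 200 else "unmatched"
    ),
)

result = pipeline.run("GET /item.php?id=1%27 200")
print(result.pattern, result.tags)
```

Keeping each module behind a callable mirrors the functional division in the description while leaving the actual matching logic replaceable.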
As an optional embodiment, for the feature extraction module 701, the traffic features include at least one of the following: request method, uniform resource locator (URL), Internet Protocol (IP) address, request parameters, status code, user agent (UA), and response text.
As an optional embodiment, for the filtering and extraction processing module 702, the filtering label includes at least one of the following: Web application firewall (WAF) label, UA common label, UA special label, and malicious IP label.
As an optional embodiment, the filtering and extraction processing module 702 is specifically configured to perform the filtering and extraction processing in one of the following ways:
filtering out the malicious traffic in response to determining that any of the traffic features matches a WAF label among the filtering labels;
filtering out the malicious traffic in response to determining that the UA in the traffic features matches a UA common label among the filtering labels;
extracting the malicious traffic in response to determining that the UA in the traffic features matches a UA special label among the filtering labels;
extracting the malicious traffic in response to determining that the IP address in the traffic features matches a malicious IP label among the filtering labels.
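The four filtering and extraction rules above can be sketched as a single decision function. The library contents (`WAF_MARKERS`, `UA_COMMON`, `UA_SPECIAL`, `MALICIOUS_IPS`), the substring and prefix matching, the rule precedence (taken from the order in which the rules are listed), and the extra `PASS` outcome for traffic that matches no filtering label are all assumptions for illustration.

```python
from dataclasses import dataclass, asdict
from enum import Enum


class Decision(Enum):
    FILTER_OUT = "filter out"   # discard the flow
    EXTRACT = "extract"         # set the flow aside
    PASS = "pass"               # no filtering label matched (assumed outcome)


@dataclass
class TrafficFeatures:
    method: str
    url: str
    ip: str
    params: str
    status: int
    ua: str
    response_text: str


# Hypothetical label library contents, for illustration only.
WAF_MARKERS = ("blocked by waf",)
UA_COMMON = ("mozilla/5.0",)
UA_SPECIAL = ("sqlmap", "nikto")
MALICIOUS_IPS = {"203.0.113.7"}


def filter_and_extract(f: TrafficFeatures) -> Decision:
    values = [str(v).lower() for v in asdict(f).values()]
    ua = f.ua.lower()
    # Rule 1: any traffic feature matching a WAF label -> filter out.
    if any(marker in value for value in values for marker in WAF_MARKERS):
        return Decision.FILTER_OUT
    # Rule 2: UA matching a UA common label -> filter out.
    if any(ua.startswith(prefix) for prefix in UA_COMMON):
        return Decision.FILTER_OUT
    # Rule 3: UA matching a UA special label -> extract.
    if any(marker in ua for marker in UA_SPECIAL):
        return Decision.EXTRACT
    # Rule 4: IP address matching a malicious IP label -> extract.
    if f.ip in MALICIOUS_IPS:
        return Decision.EXTRACT
    return Decision.PASS


base = dict(method="GET", url="/", ip="192.0.2.1", params="", status=200,
            ua="curl/7.68.0", response_text="")
waf_hit = TrafficFeatures(**{**base, "status": 403, "response_text": "Request blocked by WAF"})
browser = TrafficFeatures(**{**base, "ua": "Mozilla/5.0 (Windows NT 10.0)"})
scanner = TrafficFeatures(**{**base, "ua": "sqlmap/1.5"})
bad_ip = TrafficFeatures(**{**base, "ip": "203.0.113.7"})
other = TrafficFeatures(**base)

for flow in (waf_hit, browser, scanner, bad_ip, other):
    print(flow.ua, flow.ip, "->", filter_and_extract(flow).value)
```

A different precedence, for example checking the malicious IP label before the UA common label, would be equally consistent with the text; the order here is only one possible choice.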
As an optional embodiment, for the tag attaching module 703, the knowledge tag includes at least one of the following: conventional attack tags, Web fingerprint tags, vulnerability attack tags, sensitive file tags, and Webshell tags.
As an optional embodiment, the tag attaching module 703 is specifically configured to perform the following: determining whether the malicious traffic is associated with a conventional attack tag in the knowledge tags according to the request method, the URL, the request parameters, and the status code in the traffic features;
determining whether the malicious traffic is associated with a Web fingerprint tag in the knowledge tag according to an IP address in the traffic characteristics;
determining whether the malicious traffic is associated with a vulnerability attack tag in the knowledge tags according to the URL in the traffic characteristics;
determining whether the malicious traffic is associated with a sensitive file tag in the knowledge tag according to the URL, the request parameter and the response text in the traffic characteristic;
and determining whether the malicious traffic is associated with the Webshell tag in the knowledge tag according to the request mode, the URL and the request parameter in the traffic characteristics.
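The feature-tag association rules above amount to a table stating which traffic features each knowledge tag consults. A minimal sketch follows; the field names, the `attach_tags` function, and the toy matchers are hypothetical, and only the tag-to-feature mapping itself is taken from the text.

```python
from typing import Callable, Dict, Set, Tuple

# Which traffic features each knowledge tag consults, as listed above.
TAG_FEATURES: Dict[str, Tuple[str, ...]] = {
    "conventional attack": ("method", "url", "params", "status"),
    "web fingerprint": ("ip",),
    "vulnerability attack": ("url",),
    "sensitive file": ("url", "params", "response_text"),
    "webshell": ("method", "url", "params"),
}


def attach_tags(features: dict, matchers: Dict[str, Callable[[dict], bool]]) -> Set[str]:
    """Return the knowledge tags whose matcher accepts its own feature subset."""
    tags = set()
    for tag, fields in TAG_FEATURES.items():
        # Each matcher only sees the features its tag is allowed to consult.
        view = {name: features.get(name) for name in fields}
        matcher = matchers.get(tag)
        if matcher is not None and matcher(view):
            tags.add(tag)
    return tags


# Toy matchers (assumptions); a real knowledge base would supply these rules.
toy_matchers = {
    "conventional attack": lambda v: "union select" in (v["params"] or "").lower(),
    "web fingerprint": lambda v: v["ip"] in {"198.51.100.20"},
    "sensitive file": lambda v: (v["url"] or "").endswith("/.git/config"),
}

features = {
    "method": "GET",
    "url": "/item.php",
    "ip": "198.51.100.20",
    "params": "id=1 UNION SELECT 1,2",
    "status": 200,
    "ua": "curl/7.68.0",
    "response_text": "",
}
tags = attach_tags(features, toy_matchers)
print(sorted(tags))
```

Restricting each matcher to its own feature subset keeps the association rules explicit and makes it easy to see which features a new tag would depend on.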
As an optional embodiment, for the classification module 704, the attack behavior pattern includes at least one of the following: conventional attack success, conventional attack attempt, targeted vulnerability attack, vulnerability scanning attack, vulnerability exploitation unknown attack, unknown attack, sensitive information leakage, and Webshell successful access.
As an optional embodiment, the classification module 704 is specifically configured to classify the malicious traffic according to attack behavior patterns, which includes at least one of the following:
classifying the malicious traffic into conventional attack success traffic, conventional attack attempt traffic, or conventional attack unmatched traffic by matching the traffic features of the malicious traffic with the conventional attack tags in the knowledge tags;
classifying the conventional attack unmatched traffic into targeted vulnerability attack traffic, vulnerability scanning attack traffic, vulnerability exploitation unknown attack traffic, unknown attack traffic, or vulnerability attack unmatched traffic by matching the traffic features of the conventional attack unmatched traffic with the Web fingerprint label and the vulnerability attack label in the knowledge tags;
classifying the vulnerability attack unmatched traffic into sensitive information leakage traffic or sensitive file label unmatched traffic by matching the traffic features of the vulnerability attack unmatched traffic with the sensitive file label in the knowledge tags;
and classifying the sensitive file label unmatched traffic into Webshell successful access traffic or Webshell label unmatched traffic by matching the traffic features of the sensitive file label unmatched traffic with the Webshell label in the knowledge tags.
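The four classification steps above form a cascade in which the traffic left unmatched by one stage becomes the input of the next. The sketch below shows that fall-through structure only. The stage rules, `EXPLOIT_URLS`, and `FINGERPRINTED_HOSTS` are toy assumptions (including the way targeted and scanning traffic are told apart), and the exploitation-unknown and unknown-attack outcomes of the second stage are omitted for brevity.

```python
from typing import Callable, List, Optional, Tuple

Stage = Tuple[str, Callable[[dict], Optional[str]]]

# Hypothetical knowledge-base contents, for illustration only.
EXPLOIT_URLS = {"/cgi-bin/vulnerable.cgi"}
FINGERPRINTED_HOSTS = {"198.51.100.20"}


def conventional_stage(f: dict) -> Optional[str]:
    # Toy rule: an injection marker counts as success only with HTTP 200.
    if "union select" in f["params"].lower():
        if f["status"] == 200:
            return "conventional attack success traffic"
        return "conventional attack attempt traffic"
    return None


def vulnerability_stage(f: dict) -> Optional[str]:
    # Toy rule: a known exploit URL on a fingerprinted host is treated as targeted.
    if f["url"] in EXPLOIT_URLS:
        if f["ip"] in FINGERPRINTED_HOSTS:
            return "targeted vulnerability attack traffic"
        return "vulnerability scanning attack traffic"
    return None


def sensitive_file_stage(f: dict) -> Optional[str]:
    if f["url"].endswith("/.git/config") and f["status"] == 200:
        return "sensitive information leakage traffic"
    return None


def webshell_stage(f: dict) -> Optional[str]:
    if f["method"] == "POST" and f["url"].endswith("shell.php") and f["status"] == 200:
        return "Webshell successful access traffic"
    return None


STAGES: List[Stage] = [
    ("conventional attack", conventional_stage),
    ("vulnerability attack", vulnerability_stage),
    ("sensitive file label", sensitive_file_stage),
    ("Webshell label", webshell_stage),
]


def classify(f: dict, stages: List[Stage] = STAGES) -> str:
    """Return the pattern from the first matching stage; None means fall through."""
    if not stages:
        raise ValueError("at least one stage is required")
    for _, stage in stages:
        pattern = stage(f)
        if pattern is not None:
            return pattern
    # Nothing matched: the flow is unmatched traffic of the last stage.
    return f"{stages[-1][0]} unmatched traffic"


def flow(**overrides) -> dict:
    base = {"method": "GET", "url": "/", "ip": "192.0.2.9", "params": "", "status": 200}
    return {**base, **overrides}


print(classify(flow(params="id=1 UNION SELECT 1")))
print(classify(flow(url="/cgi-bin/vulnerable.cgi", status=404)))
print(classify(flow(url="/repo/.git/config")))
print(classify(flow(method="POST", url="/uploads/shell.php")))
print(classify(flow(url="/index.html")))
```

Because each stage only sees traffic that every earlier stage left unmatched, the order of `STAGES` determines which pattern wins when several rules could apply.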
For convenience of description, the above apparatus is described as being divided into modules by function, and the modules are described separately. Of course, when implementing one or more embodiments of the present specification, the functions of the modules may be implemented in the same one or more pieces of software and/or hardware.
The apparatus of the foregoing embodiment is used to implement the corresponding method in the foregoing embodiment, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Based on the same inventive concept, one or more embodiments of the present specification further provide an associated apparatus, including a memory, a processor, and a computer program stored on the memory and executable by the processor, wherein the processor implements the method according to any of the above embodiments when executing the computer program.
Fig. 8 is a schematic diagram illustrating a more specific hardware structure of an electronic device according to this embodiment, where the electronic device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively coupled to each other within the device via bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called to be executed by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The input/output module may be configured as a component in the device (not shown in the drawings) or may be external to the device to provide the corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication between the present device and other devices. The communication module can communicate in a wired manner (such as USB or a network cable) or in a wireless manner (such as a mobile network, Wi-Fi, or Bluetooth).
Bus 1050 includes a path that transfers information between various components of the device, such as processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
The related devices of the foregoing embodiments are used for implementing the corresponding methods in the foregoing embodiments, and have the beneficial effects of the corresponding method embodiments, which are not described herein again.
Those of ordinary skill in the art will understand that: the discussion of any embodiment above is meant to be exemplary only, and is not intended to intimate that the scope of the disclosure, including the claims, is limited to these examples; within the spirit of the present disclosure, features from the above embodiments or from different embodiments may also be combined, steps may be implemented in any order, and there are many other variations of different aspects of one or more embodiments of the present description as described above, which are not provided in detail for the sake of brevity.
In addition, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown in the provided figures, for simplicity of illustration and discussion, and so as not to obscure one or more embodiments of the disclosure. Furthermore, devices may be shown in block diagram form in order to avoid obscuring the understanding of one or more embodiments of the present description, and this also takes into account the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the one or more embodiments of the present description are to be implemented (i.e., specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the disclosure, it should be apparent to one skilled in the art that one or more embodiments of the disclosure can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative instead of restrictive.
While the present disclosure has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of these embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the discussed embodiments.
It is intended that the one or more embodiments of the present specification embrace all such alternatives, modifications and variations as fall within the broad scope of the appended claims. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of one or more embodiments of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (7)

1. A malicious traffic classification method based on a hypertext transfer protocol (HTTP) comprises the following steps:
in response to receiving HTTP-based malicious traffic, extracting traffic features of the malicious traffic;
filtering and extracting the malicious traffic by matching the traffic characteristics with the filtering tags in a malicious traffic tag filtering library,
the flow characteristics include at least one of: request mode, uniform resource locator URL, Internet protocol IP address, request parameter, status code, user agent UA, response text,
the filter label includes at least one of: web application protection system WAF label, UA common label, UA special label, malicious IP label,
filtering out the malicious traffic in response to determining that any of the traffic characteristics match a WAF label of the filter labels;
filtering the malicious traffic in response to determining that the UA in the traffic signature matches a UA common label in the filter labels;
in response to determining that the UA in the traffic feature matches a UA special label in the filter labels, extracting the malicious traffic;
extracting the malicious traffic in response to determining that an IP address in the traffic feature matches a malicious IP tag in the filter tags;
attaching corresponding knowledge tags in the malicious traffic tag knowledge base to the malicious traffic after the filtering and extraction processing based on the traffic features of the malicious traffic after the filtering and extraction processing, the malicious traffic tag knowledge base and a predetermined feature-tag association rule;
classifying the filtered and extracted malicious traffic according to attack behavior patterns based on the attached knowledge tags,
wherein the malicious traffic label filtering library and the malicious traffic label knowledge library are constructed in advance based on an HTTP response mechanism and a security protection mode.
2. The method of claim 1, wherein the knowledge tag comprises at least one of: conventional attack tags, Web fingerprint tags, vulnerability attack tags, sensitive file tags, and Webshell tags.
3. The method of claim 2, wherein the feature-tag association rule comprises:
determining whether the malicious traffic is associated with a conventional attack tag in the knowledge tags according to a request mode, a URL (uniform resource locator), a request parameter and a status code in the traffic characteristics;
determining whether the malicious traffic is associated with a Web fingerprint tag in the knowledge tag according to an IP address in the traffic characteristics;
determining whether the malicious traffic is associated with a vulnerability attack tag in the knowledge tags according to the URL in the traffic characteristics;
determining whether the malicious traffic is associated with a sensitive file tag in the knowledge tag according to the URL, the request parameter and the response text in the traffic characteristic;
and determining whether the malicious traffic is associated with the Webshell tag in the knowledge tag according to the request mode, the URL and the request parameter in the traffic characteristics.
4. The method of claim 2, wherein the attack behavior pattern comprises at least one of: conventional attack success, conventional attack attempt, targeted vulnerability attack, vulnerability scanning attack, vulnerability exploitation unknown attack, unknown attack, sensitive information leakage, and Webshell successful access.
5. The method of claim 4, wherein classifying the filtered and extracted malicious traffic according to attack behavior patterns comprises at least one of:
classifying the malicious traffic into conventional attack success traffic, conventional attack attempt traffic or conventional attack unmatched traffic by matching the traffic features of the malicious traffic after filtering and extraction processing with conventional attack tags in the knowledge tags;
classifying the conventional attack unmatched traffic into targeted vulnerability attack traffic, vulnerability scanning attack traffic, vulnerability exploitation unknown attack traffic, unknown attack traffic or vulnerability attack unmatched traffic by matching the traffic features of the conventional attack unmatched traffic with the Web fingerprint label and the vulnerability attack label in the knowledge tags;
classifying the vulnerability attack unmatched traffic into sensitive information leakage traffic or sensitive file label unmatched traffic by matching the traffic features of the vulnerability attack unmatched traffic with the sensitive file label in the knowledge tags;
and classifying the sensitive file label unmatched traffic into Webshell successful access traffic or Webshell label unmatched traffic by matching the traffic features of the sensitive file label unmatched traffic with the Webshell label in the knowledge tags.
6. An apparatus for classifying malicious traffic based on an HTTP protocol, comprising:
a feature extraction module configured to extract traffic features of malicious traffic based on HTTP in response to receiving the malicious traffic;
a filtering and extraction processing module configured to filter and extract the malicious traffic by matching the traffic features with filtering tags in a malicious traffic tag filtering library,
the flow characteristics include at least one of: request mode, uniform resource locator URL, Internet protocol IP address, request parameter, status code, user agent UA, response text,
the filter label includes at least one of: web application protection system WAF label, UA common label, UA special label, malicious IP label,
filtering out the malicious traffic in response to determining that any of the traffic characteristics match a WAF label of the filter labels;
filtering the malicious traffic in response to determining that the UA in the traffic signature matches a UA common label in the filter labels;
in response to determining that the UA in the traffic feature matches a UA special label in the filter labels, extracting the malicious traffic;
extracting the malicious traffic in response to determining that an IP address in the traffic feature matches a malicious IP tag in the filter tags;
a tag attaching module configured to attach a corresponding knowledge tag in a malicious traffic tag knowledge base to the filtered and extracted malicious traffic based on the traffic features of the filtered and extracted malicious traffic, the malicious traffic tag knowledge base and a predetermined feature-tag association rule;
a classification module configured to classify the filtered and extracted malicious traffic according to an attack behavior pattern based on the attached knowledge tag,
wherein the malicious traffic label filtering library and the malicious traffic label knowledge library are constructed in advance based on an HTTP response mechanism and a security protection mode.
7. An associated apparatus comprising a memory, a processor and a computer program stored on the memory and executable by the processor, wherein the processor implements the method of any one of claims 1 to 5 when executing the computer program.
CN202110139581.7A 2021-02-01 2021-02-01 HTTP-based malicious traffic classification method and related equipment Active CN113014549B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110139581.7A CN113014549B (en) 2021-02-01 2021-02-01 HTTP-based malicious traffic classification method and related equipment

Publications (2)

Publication Number Publication Date
CN113014549A CN113014549A (en) 2021-06-22
CN113014549B true CN113014549B (en) 2022-04-08

Family

ID=76384807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110139581.7A Active CN113014549B (en) 2021-02-01 2021-02-01 HTTP-based malicious traffic classification method and related equipment

Country Status (1)

Country Link
CN (1) CN113014549B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant