WO2021169293A1 - Attack behavior detection method and apparatus, and attack detection device - Google Patents

Attack behavior detection method and apparatus, and attack detection device

Info

Publication number
WO2021169293A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample
detection
behavior
training
initial
Prior art date
Application number
PCT/CN2020/118782
Other languages
English (en)
French (fr)
Inventor
唐玉宾
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP20922229.8A (EP4060958B1)
Publication of WO2021169293A1
Priority to US17/867,976 (US20220368706A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/02 Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433 Vulnerability analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/145 Countermeasures against malicious traffic the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]

Definitions

  • This application relates to the field of network security technology, and in particular to an attack detection method, device, and attack detection equipment.
  • EK: exploit kit.
  • When a host visits a malicious website containing an EK, the EK uses vulnerability information about the host's Internet environment to select corresponding malware with which to attack the host.
  • If an EK attack is detected in time, the user can be reminded to take timely measures against it, so as to minimize user losses.
  • When a host visits a website, the host can collect the website's script code, parse the script code, and generate a signature of the script code. The host then compares the generated signature with the signatures in a stored signature library to determine whether an EK attack occurs while the host accesses the website.
  • the signature in the stored signature library is generated by a signature algorithm based on the malicious code of the known EK.
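  • The signature comparison described here can be sketched as follows. This is only an illustration: the source does not name the signature algorithm, so SHA-256 is assumed, and `signature_of`, `KNOWN_EK_SIGNATURES`, and the sample script strings are hypothetical.

```python
import hashlib


def signature_of(script_code: str) -> str:
    """Assumed signature algorithm: SHA-256 over the UTF-8 script text."""
    return hashlib.sha256(script_code.encode("utf-8")).hexdigest()


# Hypothetical signature library built from known EK malicious code.
KNOWN_EK_SIGNATURES = {signature_of("evil_payload();")}


def is_known_ek_script(script_code: str) -> bool:
    """Return True if the script's signature appears in the stored library."""
    return signature_of(script_code) in KNOWN_EK_SIGNATURES
```

  • An exact-match lookup of this kind only recognizes code that is already in the library, and it requires the host to collect and parse the script code.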
  • However, the collected script code usually contains the user's private data, so there is a risk of infringing on the user's privacy.
  • In addition, parsing the script code consumes a large amount of processor and memory resources, which degrades the performance of the host.
  • the present application provides an attack detection method, device, and attack detection equipment, which can reduce the risk of infringing on user privacy in related technologies and improve the accuracy of attack detection without consuming host resources.
  • The technical solutions are as follows:
  • In a first aspect, an attack detection method is provided, which includes:
  • obtaining hypertext transfer protocol (HTTP) packet stream data transmitted by a host within a reference time period, where the HTTP packet stream data includes data in one or more HTTP packets, the one or more HTTP packets belong to a first data stream, and the reference time period is the period that precedes the current time and lies within a reference duration of the current time;
  • determining, according to the HTTP packet stream data, multiple initial probability values through multiple behavior detection models, where the multiple behavior detection models are respectively used to describe different stages of the EK's attack behavior trajectory, and each initial probability value is the probability value output by one of the multiple behavior detection models;
  • determining a comprehensive probability value according to the multiple initial probability values, where the comprehensive probability value indicates the possibility that the host is attacked by an EK while transmitting the first data stream; and
  • if the comprehensive probability value is greater than a preset probability threshold, determining that an EK attack behavior exists in the process of the host transmitting the first data stream.
  • The EK's attack behavior trajectory includes multiple different stages.
  • A host transmits HTTP packet stream (data stream) data, and the HTTP packet stream data transmitted by a host attacked by an EK carries some behavioral characteristics of the EK attack.
  • The attack detection device can therefore obtain the HTTP packet stream data transmitted by the host within the reference time period, and then analyze and process that data to detect whether the host is attacked by an EK, that is, whether an EK attack behavior exists in the process of the host transmitting the data stream.
  • The HTTP packet stream data includes data in one or more HTTP packets that belong to the first data stream. The first data stream is the data stream formed by HTTP packets transmitted between the host and another device, and the reference time period is the period that precedes the current time and lies within the reference duration of the current time.
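  • As a minimal sketch of selecting the packets of one data stream within the reference time period: the `HttpPacket` record, its `flow_id` field, and the function name are hypothetical and are not defined in the source.

```python
from dataclasses import dataclass


@dataclass
class HttpPacket:
    flow_id: str      # hypothetical identifier of the data stream
    timestamp: float  # capture time in seconds
    headers: dict     # parsed HTTP header fields


def packets_in_reference_period(packets, flow_id, now, reference_duration):
    """Keep the packets of one data stream captured within the reference period."""
    start = now - reference_duration
    return [
        p for p in packets
        if p.flow_id == flow_id and start <= p.timestamp <= now
    ]
```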
  • The attack detection device can preprocess the acquired HTTP packet stream data to obtain the input of each of the multiple behavior detection models, and process the corresponding input through each behavior detection model to obtain the corresponding initial probability value, thereby obtaining multiple initial probability values.
  • the multiple behavior detection models are respectively used to describe different stages of the EK's attack behavior trajectory.
  • Determining multiple initial probability values through multiple behavior detection models according to the HTTP packet flow data includes: selecting a behavior detection model from the multiple behavior detection models, and performing the following operations according to the selected behavior detection model, until the operations have been performed according to each of the multiple behavior detection models:
  • determining, according to the HTTP packet flow data, the feature vector corresponding to the selected behavior detection model; and inputting the feature vector into the selected behavior detection model to obtain the initial probability value output by the selected behavior detection model.
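  • The per-model loop above can be sketched as follows. The source does not specify the model type, so each stage is represented here by a hypothetical pair of callables: a feature extractor and a model that maps a feature vector to a probability.

```python
from typing import Callable, Dict, List, Tuple

FeatureExtractor = Callable[[list], List[float]]
Model = Callable[[List[float]], float]


def initial_probabilities(
    stream_data: list,
    stages: Dict[str, Tuple[FeatureExtractor, Model]],
) -> Dict[str, float]:
    """Run every behavior detection model on its own feature vector."""
    probabilities = {}
    for name, (extract, model) in stages.items():
        feature_vector = extract(stream_data)        # model-specific features
        probabilities[name] = model(feature_vector)  # initial probability value
    return probabilities
```

  • Keeping the extractor next to its model reflects the point that each stage uses a different feature vector derived from the same HTTP packet flow data.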
  • The multiple behavior detection models are respectively used to describe different stages in the EK attack behavior trajectory, that is, one behavior detection model describes one stage, and the behavioral characteristics of EK attack behavior differ from stage to stage. Therefore, the feature vectors input to the behavior detection models also differ, that is, the feature vector determined for each behavior detection model according to the host's HTTP packet flow data is different.
  • The EK's attack behavior trajectory includes a redirection stage, an attack object screening stage, a vulnerability exploitation stage, and a malware download stage.
  • The multiple behavior detection models can describe at least any two of these four stages, that is, they include at least two models respectively used to describe any two of the four stages. Optionally, the multiple behavior detection models include at least two of the following models: a redirection detection model, an attack object screening detection model, a vulnerability exploitation detection model, and a malware download detection model.
  • The redirection detection model is used to describe the redirection stage in the EK's attack behavior trajectory. At this stage, the EK attack redirects the web page that the user is browsing.
  • The attack object screening detection model is used to describe the attack object screening stage in the EK's attack behavior trajectory. At this stage, the EK screens attack objects based on information carried in the HTTP messages transmitted by the host, such as the operating system and browser version.
  • The vulnerability exploitation detection model is used to describe the vulnerability exploitation stage in the EK's attack behavior trajectory. At this stage, the EK analyzes the vulnerabilities on the host and downloads a vulnerability file to the host. For example, a low version of the Flash plug-in on the host may contain vulnerabilities, in which case the EK downloads a Flash vulnerability file to the host and uses it to attack the host.
  • The malware download detection model is used to describe the malware download stage in the EK's attack behavior trajectory. At this stage, the EK downloads malware to the host, such as Trojan horse software or ransomware.
  • the attack detection device can obtain one or more features included in the feature vector corresponding to each behavior detection model from the HTTP packet stream data.
  • the data in an HTTP message includes multiple fields, each field represents a type of information, and each feature included in the feature vector can be obtained from these fields respectively.
  • The behavioral characteristics of EK attack behavior can be divided into public features and unique features. A public feature is a feature shared by all of the behavior detection models or by some of them, and a unique feature is a feature specific to one behavior detection model. That is, the feature vector corresponding to each behavior detection model includes one or more features, some of which are public features and the rest of which are unique features.
  • In the redirection stage, the HTTP message transmitted by the host may carry the Location field, which is used to redirect the web page browsed by the host.
  • The feature vector corresponding to the redirection detection model includes the behavioral characteristics of the redirection stage, such as the HTTP message code, the length of the uniform resource locator (URL) field, and whether the Location field is carried.
  • Among these, the HTTP message code is a public feature of the redirection detection model and the vulnerability exploitation detection model, the length of the URL field is a public feature of all four models, and whether the HTTP message stream data carries the Location field is a unique feature of the redirection detection model.
  • The feature vector corresponding to the attack object screening detection model includes the behavioral characteristics of the attack object screening stage, such as the length of the URL field and the operating system type.
  • The operating system type is a unique feature of the attack object screening detection model.
  • The vulnerability exploitation detection model is used to describe the vulnerability exploitation stage in the EK's attack behavior trajectory, and the fields of the HTTP messages transmitted by the host at this stage may be changed, for example by adding fields, tampering with data, or encrypting fields.
  • The feature vector corresponding to the vulnerability exploitation detection model includes the behavioral characteristics of the vulnerability exploitation stage, such as the message code of the HTTP message, the length of the URL field, whether the URL field contains a Base64-encoded substring, and whether the X-Flash-Version field is carried.
  • Whether the URL field contains a Base64-encoded substring and whether the X-Flash-Version field is carried are unique features of the vulnerability exploitation detection model.
  • The feature vector corresponding to the malware download detection model includes the behavioral characteristics of the malware download stage, for example, the message code of the HTTP message, the length of the URL field, the Content-Type field, and the Content-Length field of the HTTP message. Note that HTTP messages include HTTP request messages and HTTP response messages, and the HTTP response message carries the Content-Type and Content-Length fields.
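  • The stage-specific feature vectors described above can be sketched as follows. Several details are assumptions rather than part of the source: the message layout (a dict with `status`, `url`, and `headers`), reading the "message code" as the HTTP status code, the regular expression used as a Base64 heuristic, and the encoding of Content-Type as a single flag. Header names are assumed to be already normalized.

```python
import re

# Assumed heuristic: a run of 16 or more Base64 characters, optional padding.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def redirection_features(msg: dict) -> list:
    """Message code, URL length, and whether the Location field is carried."""
    headers = msg.get("headers", {})
    return [
        float(msg.get("status", 0)),
        float(len(msg.get("url", ""))),
        1.0 if "Location" in headers else 0.0,
    ]


def exploit_features(msg: dict) -> list:
    """Message code, URL length, Base64 substring flag, X-Flash-Version flag."""
    headers = msg.get("headers", {})
    url = msg.get("url", "")
    return [
        float(msg.get("status", 0)),
        float(len(url)),
        1.0 if BASE64_RUN.search(url) else 0.0,
        1.0 if "X-Flash-Version" in headers else 0.0,
    ]


def malware_download_features(msg: dict) -> list:
    """Message code, URL length, Content-Type flag (assumed), Content-Length."""
    headers = msg.get("headers", {})
    is_binary = headers.get("Content-Type", "") == "application/octet-stream"
    return [
        float(msg.get("status", 0)),
        float(len(msg.get("url", ""))),
        1.0 if is_binary else 0.0,
        float(headers.get("Content-Length", 0)),
    ]
```

  • The URL length appears in every extractor because it is a public feature, while the Location, Base64, and X-Flash-Version flags each appear in only one extractor as unique features.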
  • After determining the feature vector corresponding to a model, the attack detection device inputs the feature vector into that model and uses the probability value output by the model as an initial probability value. Performing this operation on all four models yields four initial probability values.
  • Optionally, before determining multiple initial probability values through multiple behavior detection models according to the HTTP packet flow data, the method further includes: filtering the HTTP packet flow data according to a filtering rule set. In this case, determining multiple initial probability values through multiple behavior detection models according to the HTTP packet flow data includes: determining the multiple initial probability values through the multiple behavior detection models according to the HTTP packet flow data remaining after filtering. That is, the attack detection device can first filter out the data in HTTP messages that clearly do not need to be detected, and then determine multiple initial probability values through the multiple behavior detection models.
  • The filtering rule set includes but is not limited to the following rules:
  • a first filter rule, whose matching item is a reference type set containing one or more operating system types, where the reference type set includes the types of operating systems whose probability of being attacked by an EK is less than a reference probability threshold, and whose action is: filter out. The first filter rule is used to filter out the data in a first target HTTP message, which is an HTTP message that carries an operating system type included in the reference type set; and/or
  • a second filter rule, whose matching item is one or more intranet addresses, and whose action is: filter out. The second filter rule is used to filter out the data in a second target HTTP message, which is an HTTP message whose carried destination address is an intranet address; and/or
  • a third filter rule, whose matching item is a reference domain name set containing one or more domain names, where the reference domain name set includes domain names whose access frequency is greater than a frequency threshold, and whose action is: filter out. The third filter rule is used to filter out the data in a third target HTTP message, which is an HTTP message that carries a domain name included in the reference domain name set.
  • That is, the attack detection device can filter out the data in HTTP messages whose carried operating system type is a low-risk operating system, and/or whose carried destination address is an intranet address, and/or whose carried domain name is a frequently visited domain name.
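  • The three filter rules can be sketched as a single predicate. The set contents, the message keys (`os`, `dst_ip`, `host`), and the use of `ipaddress`'s `is_private` property as the test for an intranet address are all assumptions made for illustration.

```python
import ipaddress

LOW_RISK_OS = {"ExampleOS"}            # hypothetical reference type set
POPULAR_DOMAINS = {"www.example.com"}  # hypothetical reference domain name set


def is_intranet(address: str) -> bool:
    """Assumption: treat addresses that ipaddress reports as private as intranet."""
    try:
        return ipaddress.ip_address(address).is_private
    except ValueError:
        return False  # missing or malformed address: do not match the rule


def should_filter_out(msg: dict) -> bool:
    """True if any of the three filter rules matches the HTTP message."""
    return (
        msg.get("os") in LOW_RISK_OS
        or is_intranet(msg.get("dst_ip", ""))
        or msg.get("host") in POPULAR_DOMAINS
    )


def apply_filter_rules(messages: list) -> list:
    """Return the HTTP messages that remain after filtering."""
    return [m for m in messages if not should_filter_out(m)]
```

  • Only the messages that survive `apply_filter_rules` would be passed on to feature extraction, which reduces the amount of data the behavior detection models have to process.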
  • The attack detection device can comprehensively process the multiple initial probability values to determine a comprehensive probability value, which indicates the possibility that the host is attacked by an EK while transmitting the first data stream.
  • Determining a comprehensive probability value based on the multiple initial probability values includes: determining multiple cross features based on the multiple initial probability values, where a cross feature is obtained by multiplying two different initial probability values among the multiple initial probability values; generating a cross feature vector based on the multiple cross features; and inputting the cross feature vector into a correlation analysis model to obtain the comprehensive probability value output by the correlation analysis model, where the correlation analysis model is used to comprehensively analyze the multiple different stages in the EK's attack behavior trajectory.
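  • A minimal sketch of building a cross feature vector from the initial probability values, assuming every unordered pair of different values is used (the source leaves the choice of pairs open):

```python
from itertools import combinations


def cross_feature_vector(initial_probabilities: list) -> list:
    """Multiply every pair of two different initial probability values."""
    return [a * b for a, b in combinations(initial_probabilities, 2)]
```

  • With four initial probability values this yields six cross features, one per pair of stages.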
  • Optionally, the method further includes: performing vulnerability file detection and malware detection on the HTTP packet stream data to obtain a vulnerability file detection result and a malware detection result. In this case, determining multiple cross features according to the multiple initial probability values includes: determining the multiple cross features according to the multiple initial probability values, the vulnerability file detection result, and the malware detection result, where a cross feature is obtained by multiplying two different items of data among the multiple initial probability values, the vulnerability file detection result, and the malware detection result.
  • The vulnerability file detection and malware detection performed on the HTTP packet stream data are based on an intrusion prevention system (IPS).
  • The IPS can analyze the fields and characters included in the HTTP packet stream data to obtain the vulnerability file detection result and the malware detection result.
  • Determining the multiple cross features according to the multiple initial probability values, the vulnerability file detection result, and the malware detection result includes: generating a probability matrix according to the multiple initial probability values, the vulnerability file detection result, and the malware detection result, where the probability matrix has X rows and X columns, X is the total number of the multiple initial probability values, the vulnerability file detection result, and the malware detection result, the X rows and the X columns each correspond to the multiple initial probability values, the vulnerability file detection result, and the malware detection result, and each element of the probability matrix is obtained by multiplying the two items of data that cross at that element; and selecting multiple elements from the probability matrix according to a cross feature selection strategy, and using the selected elements as the multiple cross features.
  • the cross feature selection strategy is a strategy determined based on experience to filter out redundant features.
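  • The probability matrix and the selection step can be sketched as follows. The source says only that the selection strategy is experience-based and removes redundant features; keeping the elements above the diagonal is an assumed strategy chosen here because the matrix is symmetric and the diagonal holds products of a value with itself.

```python
def probability_matrix(values: list) -> list:
    """X-by-X matrix whose element (i, j) is values[i] * values[j]."""
    return [[a * b for b in values] for a in values]


def select_upper_triangle(matrix: list) -> list:
    """Assumed selection strategy: keep only elements above the diagonal."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]
```

  • Here `values` would hold the initial probability values followed by the vulnerability file detection result and the malware detection result, each assumed to be encoded as a number.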
  • The attack detection device can generate the cross feature vector as described above, input it into the correlation analysis model, and obtain the comprehensive probability value output by the model. If the comprehensive probability value is greater than the preset probability threshold, the attack detection device determines that an EK attack behavior exists in the process of the host transmitting the first data stream, that is, the host is attacked by the EK.
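  • As a sketch of the final decision: the source does not state what kind of model the correlation analysis model is, so a logistic function of a weighted sum stands in for it here, and the weights, bias, and threshold are hypothetical parameters.

```python
import math


def comprehensive_probability(cross_features, weights, bias):
    """Stand-in correlation analysis model: logistic function of a weighted sum."""
    z = bias + sum(w * x for w, x in zip(weights, cross_features))
    return 1.0 / (1.0 + math.exp(-z))


def ek_attack_detected(cross_features, weights, bias, threshold):
    """Report an EK attack only if the probability exceeds the preset threshold."""
    return comprehensive_probability(cross_features, weights, bias) > threshold
```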
  • The multiple behavior detection models are determined in advance based on training samples. That is, before determining multiple initial probability values through the multiple behavior detection models according to the HTTP packet flow data, the method further includes: obtaining multiple training samples and the sample label corresponding to each of the multiple training samples, where a training sample includes the data in one or more sample HTTP messages belonging to a second data stream, and a sample label indicates whether the corresponding training sample is a positive training sample or a negative training sample. A positive training sample is HTTP packet stream data that has not been attacked by an EK, and a negative training sample is HTTP packet stream data that has been attacked by an EK; and
  • training multiple initial detection models according to the multiple training samples and the sample label corresponding to each of the multiple training samples, to obtain the multiple behavior detection models, where the multiple initial detection models respectively correspond to different stages in the EK's attack behavior trajectory.
  • Obtaining multiple training samples includes: obtaining multiple pieces of sample HTTP packet stream data, where sample HTTP packet stream data refers to the data in the HTTP packets of a second data stream that fall within the reference time period before the current time; filtering each piece of sample HTTP packet stream data according to the filtering rule set; and determining the multiple pieces of sample HTTP packet stream data remaining after filtering as the multiple training samples.
  • the attack detection device can perform preprocessing operations on the obtained HTTP packets according to the definition of the data stream and the reference duration to obtain multiple training samples.
  • Training multiple initial detection models to obtain the multiple behavior detection models includes: selecting an initial detection model from the multiple initial detection models, and performing the following operations according to the selected initial detection model, until the operations have been performed according to each of the multiple initial detection models:
  • determining the sample feature set corresponding to the selected initial detection model according to the sample HTTP messages included in each of the multiple training samples, where the sample feature set includes multiple sample feature vectors in one-to-one correspondence with the multiple training samples; and
  • inputting the multiple sample feature vectors into the selected initial detection model, and training the selected initial detection model so that its output for each sample feature vector is the sample label of the corresponding training sample, thereby obtaining one behavior detection model.
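  • Training one behavior detection model from its sample feature set can be sketched as follows. The source does not name the learning algorithm, so plain logistic regression fitted by gradient descent is assumed. The label encoding is also an assumption: 1.0 marks a stream attacked by an EK (a negative training sample in the source's terminology) and 0.0 marks an unattacked stream, so that the output reads as an attack probability.

```python
import math


def predict(weights, bias, x):
    """Probability output by the trained model for one feature vector."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(sample_vectors, labels, epochs=500, lr=0.5):
    """Fit weights and bias by per-sample gradient descent on the log loss."""
    weights = [0.0] * len(sample_vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(sample_vectors, labels):
            error = predict(weights, bias, x) - y  # gradient of log loss w.r.t. z
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias
```

  • The same routine would be run once per stage, each time with the sample feature set of that stage, to obtain the multiple behavior detection models.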
  • The method further includes: determining a sample cross feature set according to the multiple behavior detection models and the sample feature set corresponding to each of the multiple behavior detection models, where the sample cross feature set includes multiple sample cross feature vectors in one-to-one correspondence with the multiple training samples; and inputting the multiple sample cross feature vectors into an initial analysis model, and training the initial analysis model so that its output for each sample cross feature vector is the sample label of the corresponding training sample, thereby obtaining the correlation analysis model.
  • the sample feature vectors used to train each initial detection model are also different.
  • the attack detection device can determine the sample feature set corresponding to the corresponding initial detection model based on the sample HTTP messages included in the multiple training samples and the behavior features included in the feature vector corresponding to each behavior detection model.
  • Optionally, the method further includes: performing vulnerability file detection and malware detection on each of the multiple training samples to obtain the vulnerability file detection result and malware detection result corresponding to each training sample. In this case, determining the sample cross feature set according to the multiple behavior detection models and the sample feature set corresponding to each of the multiple behavior detection models includes: determining the sample cross feature set according to the multiple behavior detection models, the sample feature set corresponding to each of the multiple behavior detection models, and the vulnerability file detection result and malware detection result corresponding to each of the multiple training samples. That is, the attack detection device can perform vulnerability file detection and malware detection on each training sample based on the IPS.
  • Determining the sample cross feature set includes: selecting a training sample from the multiple training samples, and performing the following processing on the selected training sample, until each of the multiple training samples has been processed:
  • inputting the sample feature vectors corresponding to the selected training sample, taken from the sample feature sets corresponding to the multiple behavior detection models, into the respective behavior detection models to obtain the sample probability value output by each model, thereby obtaining multiple sample probability values; determining multiple sample cross features according to the multiple sample probability values and the vulnerability file detection result and malware detection result corresponding to the selected training sample, where a sample cross feature is obtained by multiplying two different items of data among the multiple sample probability values, the vulnerability file detection result, and the malware detection result corresponding to the selected training sample; and generating a sample cross feature vector according to the multiple sample cross features.
  • For the implementation of determining the multiple sample cross features, refer to the above description of determining multiple cross features; details are not repeated here.
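  • Building the sample cross feature set can be sketched as follows. The sample layout (a dict holding per-stage feature vectors under `features` plus numeric `vuln_result` and `malware_result` entries) and the use of all unordered pairs as cross features are assumptions made for illustration.

```python
from itertools import combinations


def sample_cross_feature_set(samples: list, stage_models: dict) -> list:
    """One sample cross feature vector per training sample."""
    cross_set = []
    for sample in samples:
        # Sample probability value output by each trained behavior detection model.
        values = [
            model(sample["features"][name])
            for name, model in stage_models.items()
        ]
        # Append the IPS-based detection results for the same sample.
        values += [sample["vuln_result"], sample["malware_result"]]
        cross_set.append([a * b for a, b in combinations(values, 2)])
    return cross_set
```

  • The resulting vectors, paired with the sample labels, would form the training input of the initial analysis model.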
  • In a second aspect, an attack behavior detection device is provided, which has the function of implementing the behavior of the attack behavior detection method in the first aspect.
  • The attack behavior detection device includes one or more modules, which are used to implement the attack behavior detection method provided in the first aspect.
  • That is, an attack behavior detection device is provided, which includes:
  • a first obtaining module, used to obtain HTTP packet stream data transmitted by a host within a reference time period, where the HTTP packet stream data includes data in one or more HTTP packets, the one or more HTTP packets belong to a first data stream, and the reference time period is the period that precedes the current time and lies within a reference duration of the current time;
  • a first determining module, used to determine multiple initial probability values through multiple behavior detection models based on the HTTP packet stream data, where the multiple behavior detection models are respectively used to describe different stages of the EK's attack behavior trajectory, and each initial probability value is the probability value output by one of the multiple behavior detection models;
  • a second determining module, used to determine a comprehensive probability value according to the multiple initial probability values, where the comprehensive probability value indicates the possibility that the host is attacked by the EK in the process of transmitting the first data stream; and
  • a third determining module, configured to determine, if the comprehensive probability value is greater than a preset probability threshold, that an EK attack behavior exists in the process of the host transmitting the first data stream.
  • the first determining module is specifically configured to:
  • the multiple behavior detection models include at least two of the following models: a redirection detection model, an attack object screening detection model, a vulnerability exploitation detection model, and a malware download detection model.
  • the second determining module includes:
  • the first determining unit is configured to determine multiple cross features according to the multiple initial probability values, where the cross feature refers to a product obtained by multiplying two different initial probability values among the multiple initial probability values;
  • a generating unit configured to generate a cross feature vector according to the multiple cross features
  • the comprehensive analysis unit is used to input the cross feature vector into the correlation analysis model to obtain the comprehensive probability value output by the correlation analysis model.
  • the correlation analysis model is used to comprehensively analyze multiple different stages in the EK's attack behavior trajectory.
  • the second determining module further includes:
  • the second determining unit is used to perform vulnerability file detection and malware detection on the HTTP packet stream data to obtain the vulnerability file detection result and the malware detection result;
  • the first determining unit is specifically used for:
  • A cross feature is obtained by multiplying two different items of data among the multiple initial probability values, the vulnerability file detection result, and the malware detection result.
  • the first determining unit is specifically configured to:
  • a probability matrix is generated.
  • the probability matrix is a matrix with X rows and X columns.
  • X is multiple initial probability values, vulnerability file detection results and malware detection.
  • the total number of results, X rows and X columns correspond to multiple initial probability values, vulnerability file detection results, and malware detection results.
  • the elements in the probability matrix are obtained by multiplying the two intersecting data;
  • multiple elements are selected from the probability matrix, and the multiple elements selected are used as multiple cross features.
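The probability matrix and cross feature selection described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name `cross_features`, the sample values, the use of 0/1 for the two detection results, and the choice of taking the elements above the main diagonal are all assumptions; the text only requires that each cross feature be the product of two different data items.

```python
from itertools import combinations


def cross_features(values):
    """Build the X-by-X probability matrix and select the elements above
    the main diagonal as cross features (an assumed selection rule)."""
    x = len(values)
    # Element (i, j) is the product of the i-th and j-th data items.
    matrix = [[values[i] * values[j] for j in range(x)] for i in range(x)]
    # Each unordered pair of different items appears once above the diagonal.
    features = [matrix[i][j] for i, j in combinations(range(x), 2)]
    return matrix, features


# Hypothetical inputs: four initial probability values followed by the
# vulnerability file detection result and the malware detection result.
data = [0.9, 0.8, 0.5, 0.25, 1, 0]
matrix, features = cross_features(data)
print(len(matrix), len(features))
```

With X = 6 data items this selection yields X(X-1)/2 = 15 cross features, which together form the cross feature vector passed to the correlation analysis model.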
  • the device further includes:
  • the first filtering unit is configured to filter the HTTP packet flow data according to the filtering rule set
  • the first determining module is specifically used for:
  • multiple behavior detection models are used to determine multiple initial probability values.
  • the filtering rule set includes but is not limited to the following rules:
  • the first filter rule: the matching item of the first filter rule is a reference type set containing one or more operating system types, where the reference type set includes the types of operating systems whose probability of being attacked by EK is less than a reference probability threshold, and the action of the first filter rule is: filter out.
  • the first filter rule is used to filter out the data in a first-destination HTTP packet.
  • the first-destination HTTP packet refers to an HTTP packet whose operating system type is included in the reference type set; and/or
  • the second filter rule: the matching item of the second filter rule is one or more intranet addresses, and the action of the second filter rule is: filter out.
  • the second filter rule is used to filter out the data in a second-destination HTTP message.
  • the second-destination HTTP message refers to an HTTP message whose destination address is an intranet address;
  • the third filter rule: the matching item of the third filter rule is a reference domain name set containing one or more domain names.
  • the reference domain name set includes domain names whose access frequency is greater than the frequency threshold.
  • the action of the third filter rule is: filter out.
  • the third filter rule is used to filter out the data in a third-destination HTTP message.
  • the third-destination HTTP message refers to an HTTP message whose domain name is included in the reference domain name set.
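The three filter rules above can be sketched as a simple predicate over HTTP messages. This is a hedged illustration only: the `HttpMessage` type, the rule contents (`LOW_RISK_OS_TYPES`, `INTRANET_NETWORKS`, `HIGH_FREQUENCY_DOMAINS`) and the function names are hypothetical and are not taken from the application.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class HttpMessage:
    os_type: str   # operating system type carried in the message
    dst_ip: str    # destination IP address
    domain: str    # requested domain name


# Hypothetical rule contents; real sets would come from configuration.
LOW_RISK_OS_TYPES = {"Linux", "iOS"}
INTRANET_NETWORKS = [ipaddress.ip_network("10.0.0.0/8"),
                     ipaddress.ip_network("192.168.0.0/16")]
HIGH_FREQUENCY_DOMAINS = {"example.com"}


def is_filtered_out(msg):
    """Return True if any of the three filter rules matches the message."""
    if msg.os_type in LOW_RISK_OS_TYPES:                 # first filter rule
        return True
    dst = ipaddress.ip_address(msg.dst_ip)
    if any(dst in net for net in INTRANET_NETWORKS):     # second filter rule
        return True
    return msg.domain in HIGH_FREQUENCY_DOMAINS          # third filter rule


def apply_filter_rules(messages):
    """Keep only the messages that no rule filters out."""
    return [m for m in messages if not is_filtered_out(m)]


messages = [
    HttpMessage("Windows", "203.0.113.5", "unknown.example.net"),   # kept
    HttpMessage("Linux", "203.0.113.5", "unknown.example.net"),     # rule 1
    HttpMessage("Windows", "192.168.1.20", "unknown.example.net"),  # rule 2
    HttpMessage("Windows", "203.0.113.5", "example.com"),           # rule 3
]
remaining = apply_filter_rules(messages)
print(len(remaining))
```

Only the data remaining after filtering would then be passed to the behavior detection models, which reduces the volume of data to analyze.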
  • the device further includes:
  • the second acquisition module is used to acquire a plurality of training samples and a sample label corresponding to each training sample in the plurality of training samples, the training samples include data in one or more sample HTTP messages belonging to the second data stream,
  • the sample label is used to indicate whether the corresponding training sample is a positive training sample or a negative training sample.
  • the positive training sample refers to HTTP packet stream data that has not been attacked by EK.
  • the negative training sample refers to HTTP packet stream data that has been attacked by EK.
  • the first training module is used to train multiple initial detection models according to the multiple training samples and the sample label corresponding to each training sample in the multiple training samples to obtain the multiple behavior detection models.
  • the multiple initial detection models respectively correspond to different stages in the EK's attack behavior trajectory.
  • the second acquisition module includes:
  • the acquiring unit is configured to acquire multiple sample HTTP packet stream data, where the sample HTTP packet stream data refers to the data in the HTTP packet in the second data stream within a reference time period before the current time;
  • the second filtering unit is configured to filter each sample HTTP packet flow data among the multiple sample HTTP packet flow data according to the filtering rule set;
  • the third determining unit is configured to determine the multiple sample HTTP packet stream data remaining after filtering as multiple training samples.
  • the first training module is specifically used for:
  • the multiple sample feature vectors are respectively input to the selected initial detection model, and the selected initial detection model is trained so that its output is the sample label corresponding to the corresponding training sample in the multiple training samples, thereby obtaining one behavior detection model.
  • the device further includes:
  • the second acquisition module is configured to acquire a plurality of training samples and a sample label corresponding to each training sample in the plurality of training samples, where the training samples include data in one or more sample HTTP messages belonging to the second data stream,
  • the sample label is used to indicate whether the corresponding training sample is a positive training sample or a negative training sample, the positive training sample refers to HTTP packet stream data that has not been attacked by EK, and the negative training sample refers to HTTP packet stream data that has been attacked by EK;
  • the first training module is configured to train multiple initial detection models according to the multiple training samples and the sample label corresponding to each of the multiple training samples to obtain the multiple behavior detection models,
  • the multiple initial detection models respectively correspond to different stages in the EK's attack behavior trajectory;
  • the third determining module is used to determine a sample cross feature set based on the multiple behavior detection models and the sample feature set corresponding to each behavior detection model in the multiple behavior detection models.
  • the sample cross feature set includes multiple sample cross feature vectors in one-to-one correspondence with the multiple training samples;
  • the second training module is used to input the multiple sample cross feature vectors into the initial analysis model, and train the initial analysis model so that its output is the sample label corresponding to the corresponding training sample in the multiple training samples, thereby obtaining the correlation analysis model.
  • the device further includes:
  • the fourth determining module is used to perform vulnerability file detection and malware detection on the multiple training samples, respectively, to obtain the vulnerability file detection result and the malware detection result corresponding to each training sample in the multiple training samples;
  • the third determining module is used to:
  • determine the sample cross feature set according to the multiple behavior detection models, the sample feature set corresponding to each behavior detection model in the plurality of behavior detection models, and the vulnerability file detection result and the malware detection result corresponding to each training sample in the plurality of training samples.
  • the third determining module is specifically configured to:
  • the sample feature vectors corresponding to the selected training sample, taken from the sample feature sets corresponding to the multiple behavior detection models, are respectively input into the multiple behavior detection models, and the sample probability values respectively output by the multiple behavior detection models are obtained, thereby obtaining multiple sample probability values;
  • a sample cross feature is obtained by multiplying two different data items among the multiple sample probability values and the vulnerability file detection result and the malware detection result corresponding to the selected training sample;
  • a sample cross feature vector is generated according to the multiple sample cross features.
  • in a third aspect, an attack detection device is provided; the attack detection device includes a processor and a memory, and the memory is used to store a program for executing the attack behavior detection method provided in the first aspect above, and to store the data involved in implementing the attack behavior detection method provided in the first aspect.
  • the processor is configured to execute a program stored in the memory.
  • the attack detection device may further include a communication bus, and the communication bus is used to establish a connection between the processor and the memory.
  • a computer-readable storage medium stores instructions that, when run on a computer, cause the computer to execute the attack behavior detection method described in the first aspect.
  • a computer program product containing instructions which when running on a computer, causes the computer to execute the attack behavior detection method described in the first aspect.
  • this solution obtains the HTTP packet flow data of the host within a period of time and processes it through multiple behavior detection models to determine multiple initial probability values.
  • the multiple behavior detection models are respectively used to describe the multiple different stages. Therefore, this solution can completely describe the EK's attack behavior trajectory.
  • the multiple initial probability values can be comprehensively processed to obtain a comprehensive probability value; that is, this solution can comprehensively analyze the behavior patterns of EK attacks at the various stages and more accurately determine the probability that the host is attacked by EK in the process of transmitting the data stream, that is, detect EK attack behavior more accurately. It can be seen that this solution can quickly and accurately detect EK attacks without seriously consuming the resources of the host itself.
  • because the HTTP packet stream data obtained in this solution contains only regular data specified by the network protocol, the risk of infringing user privacy is very low compared with methods that obtain script code for parsing.
  • FIG. 1 is a system architecture diagram involved in an attack detection method provided by an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of an attack detection device provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of an attack detection method provided by an embodiment of the present application.
  • FIG. 4 is a flowchart of filtering HTTP packet flow data according to a filtering rule set according to an embodiment of the present application.
  • FIG. 5 is a flowchart of another attack behavior detection method provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of a method for determining multiple behavior detection models provided by an embodiment of the present application.
  • FIG. 7 is a flowchart of a method for determining an association analysis model provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of an attack behavior detection device provided by an embodiment of the present application.
  • FIG. 1 is a system architecture diagram involved in an attack detection method provided by an embodiment of the present application.
  • the system architecture includes a host 101, an HTTP proxy device 102, a firewall 103 and an attack detection device 104.
  • the host 101 can connect and communicate with the HTTP proxy device 102 in a wireless or wired manner.
  • the HTTP proxy device 102 can connect and communicate with the firewall 103 in a wireless or wired manner.
  • the HTTP proxy device 102 can also communicate with the attack detection device 104 in a wireless or wired manner.
  • the host 101 is used to transmit (send or receive) HTTP packet stream data.
  • the HTTP proxy device 102 is used to proxy the host 101 to obtain some information, that is, after the HTTP packet stream data sent by the host 101 reaches the HTTP proxy device 102, the HTTP proxy device 102 can obtain corresponding information from the external network and return it to the host 101.
  • the firewall 103 is used to protect the host 101.
  • the attack detection device 104 is configured to obtain, from the HTTP proxy device 102, the HTTP packet stream data transmitted by the host 101 in the reference time period, and process the HTTP packet stream data according to the technical solution provided in the embodiment of the present application to determine whether the host 101 is attacked by EK. The attack detection device 104 is also used to determine the multiple behavior detection models and the correlation analysis model provided by this solution and deploy them on itself.
  • the attack detection device 104 is any third-party device.
  • the attack detection device 104 is a cybersecurity intelligence system (CIS), referred to as a CIS device for short.
  • the attack detection device 104 is a bypass device of a forwarding device such as a firewall, a switch, or a router.
  • the system architecture also includes a forwarding device that is used to forward the packet flow data transmitted by the host 101.
  • the attack detection device 104 is configured to obtain the HTTP packet stream data transmitted by the host 101 from the forwarding device.
  • the attack detection device 104 adopts a cloud deployment solution, that is, the attack detection device 104 is deployed on the Internet.
  • the attack detection device 104 provides an EK attack behavior detection service to other devices that provide packet flow data.
  • the devices that provide packet flow data here include, but are not limited to, hosts, forwarding devices such as firewalls, switches, and routers, or third-party servers.
  • the device that provides packet flow data provides the packet flow data to the attack detection device 104 through a website user interface (Web UI) and receives the detection result output by the attack detection device 104, such as whether there is EK attack behavior in the provided packet flow data.
  • the multiple behavior detection models and the correlation analysis model in the embodiments of the present application can also be determined by other computer equipment through training on training samples. In this case, the trained behavior detection models and correlation analysis model only need to be deployed on the attack detection device 104.
  • the attack detection device 104 is further configured to send alarm information to the host 101 after determining that the host 101 is attacked by EK, and the alarm information is used to indicate that the host 101 is attacked by EK, so as to prompt the user or the host 101 to take countermeasures in time.
  • the system architecture also includes a network management device. Both the host 101 and the attack detection device 104 can communicate with the network management device in a wireless or wired manner.
  • the attack detection device 104 is also used to report the alarm information to the network management device after determining that the host 101 is attacked by EK, and the network management device can take countermeasures based on the reported alarm information, such as forwarding the alarm information to the host 101.
  • the system architecture includes multiple hosts 101, the multiple hosts 101 are in a local area network, and the multiple hosts 101 can communicate with the HTTP proxy device 102 in a wireless or wired manner.
  • the attack detection device 104 is used to detect whether the multiple hosts 101 are attacked by EKs.
  • the system architecture also includes a forwarding device, the forwarding device can forward HTTP packet stream data transmitted by the multiple hosts 101, and the attack detection device 104 is used to detect whether each host 101 of the multiple hosts 101 is attacked by EK.
  • any host 101 is a desktop computer, a tablet computer, a notebook computer, a mobile phone, a smart TV, a smart speaker, etc., which is not limited in the embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of an attack detection device according to an embodiment of the present application.
  • the attack detection device is the attack detection device 104 shown in FIG. 1, and the attack detection device includes one or more processors 201, a communication bus 202, a memory 203 and one or more communication interfaces 204.
  • the processor 201 is a general-purpose central processing unit (CPU), a network processor (NP), a microprocessor, or one or more integrated circuits for implementing the solution of the application, for example, an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
  • the above-mentioned PLD is a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • the communication bus 202 is used to transfer information between the aforementioned components.
  • the communication bus 202 is divided into an address bus, a data bus, a control bus, and the like.
  • only one thick line is used in the figure, but it does not mean that there is only one bus or one type of bus.
  • the memory 203 is read-only memory (ROM), random access memory (RAM), electrically erasable programmable read-only memory (EEPROM), an optical disc (including compact disc read-only memory (CD-ROM), compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited thereto.
  • the memory 203 exists independently and is connected to the processor 201 through the communication bus 202, or the memory 203 and the processor 201 are integrated together.
  • the communication interface 204 uses any transceiver-like apparatus to communicate with other devices or a communication network.
  • the communication interface 204 includes a wired communication interface, and optionally, a wireless communication interface.
  • the wired communication interface is, for example, an Ethernet interface.
  • the Ethernet interface is an optical interface, an electrical interface, or a combination thereof.
  • the wireless communication interface is a wireless local area network (WLAN) interface, a cellular network communication interface, or a combination thereof.
  • the attack detection device includes multiple processors, such as the processor 201 and the processor 205 as shown in FIG. 2. Each of these processors is a single-core processor or a multi-core processor.
  • the processor herein refers to one or more devices, circuits, and/or processing cores for processing data (such as computer program instructions).
  • the attack detection device further includes an output device 206 and an input device 207.
  • the output device 206 communicates with the processor 201 and can display information in a variety of ways.
  • the output device 206 is a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector, a printer, or the like.
  • the input device 207 communicates with the processor 201 and can receive user input in a variety of ways.
  • the input device 207 is a mouse, a keyboard, a touch screen device, a sensor device, or the like.
  • the memory 203 is used to store the program code 210 for executing the solution of the present application, and the processor 201 can execute the program code 210 stored in the memory 203.
  • the program code includes one or more software modules, and the attack detection device can implement the attack behavior detection method provided in the embodiment of FIG. 3 below through the processor 201 and the program code 210 in the memory 203.
  • FIG. 3 is a flowchart of an attack detection method provided by an embodiment of the present application. The method is applied to an attack detection device.
  • the following description takes the case in which the attack detection device is the CIS device in FIG. 1 as an example. Referring to FIG. 3, the method includes the following steps.
  • Step 301: Obtain HTTP packet stream data transmitted by the host in a reference time period.
  • the EK attack behavior trajectory includes multiple different stages.
  • HTTP packet stream (data stream) data is transmitted, and the HTTP packet stream data transmitted by a host attacked by EK carries some behavioral characteristics of the EK attack.
  • the attack detection device can obtain the HTTP packet stream data transmitted by the host within the reference time period, and then analyze and process the HTTP packet stream data to detect whether the host is attacked by EK, that is, whether there is EK attack behavior in the process of the host transmitting the data stream.
  • the HTTP packet flow data includes data in one or more HTTP packets, the one or more HTTP packets belong to the first data flow, and the first data flow is the data flow formed by the HTTP packets transmitted between the host and a certain device.
  • the reference time period is the time period that precedes the current time and lies within the reference duration of the current time.
  • the reference duration is a preset duration, such as 20 minutes, 30 minutes, and so on.
  • the definition of a data stream is a two-tuple, and the two-tuple includes a source IP address and a destination IP address; that is, multiple HTTP packets with the same source IP address and the same destination IP address belong to the same data stream.
  • the definition of the data stream is any other definition, which is not limited in the embodiment of the present application.
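As a minimal sketch of the two-tuple definition and the reference time period described above, the following groups packets by (source IP address, destination IP address) and keeps only those within the reference duration before the current time. The packet dictionary layout, the field names `ts`, `src`, `dst`, and the function name `group_flows` are assumptions made for illustration.

```python
from collections import defaultdict


def group_flows(packets, now, reference_duration):
    """Group packets by (source IP, destination IP) and keep only those
    whose timestamp lies within the reference duration before `now`."""
    flows = defaultdict(list)
    for pkt in packets:
        if now - reference_duration <= pkt["ts"] <= now:
            flows[(pkt["src"], pkt["dst"])].append(pkt)
    return dict(flows)


# Hypothetical packets; timestamps are in seconds.
packets = [
    {"ts": 100,  "src": "198.51.100.7", "dst": "203.0.113.5"},  # too old
    {"ts": 2000, "src": "198.51.100.7", "dst": "203.0.113.5"},
    {"ts": 3000, "src": "198.51.100.7", "dst": "203.0.113.5"},
    {"ts": 3500, "src": "198.51.100.7", "dst": "203.0.113.9"},
]
# Reference duration of 30 minutes (1800 seconds) before the current time.
flows = group_flows(packets, now=3600, reference_duration=1800)
print(sorted(flows))
```

Each value in `flows` then plays the role of the HTTP packet stream data of one data stream within the reference time period.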
  • the HTTP packet stream data transmitted by the host is received by the HTTP proxy device, and the attack detection device sends the HTTP proxy device an acquisition request for obtaining the HTTP packet stream data.
  • according to the acquisition request, the HTTP proxy device sends the HTTP packet stream data transmitted by the host within the reference time period to the attack detection device.
  • when the HTTP proxy device receives only the HTTP packet stream data transmitted by the host, the attack detection device can directly obtain the HTTP packet stream data transmitted by the host from the HTTP proxy device.
  • when the HTTP proxy device serves multiple hosts, it receives the HTTP packet stream data transmitted by each host; that is, the HTTP packet stream data received by the HTTP proxy device comes from multiple hosts, and the attack detection device is used to detect whether each of the multiple hosts is attacked by EK.
  • the attack detection device can obtain the HTTP packets received by the HTTP proxy device in real time and process them according to the definition of the data flow and the reference time period to obtain the HTTP packet stream data transmitted by each host in the reference time period.
  • for the HTTP packet stream data of each host, the attack detection device can detect, according to the attack detection method provided in the embodiment of this application, whether there is EK attack behavior in the process of the corresponding host transmitting HTTP packet stream data.
  • Step 302: According to the HTTP packet flow data, multiple behavior detection models are used to determine multiple initial probability values.
  • the attack detection device can perform data preprocessing on the acquired HTTP packet stream data to obtain the input of each behavior detection model in the multiple behavior detection models, and process the corresponding input through each behavior detection model to obtain the corresponding initial probability value, thereby obtaining multiple initial probability values.
  • the multiple behavior detection models are respectively used to describe different stages of the EK's attack behavior trajectory.
  • the attack detection device can select one behavior detection model from the multiple behavior detection models and perform the following operations on the selected behavior detection model, until the operations have been performed on each of the multiple behavior detection models:
  • determine the feature vector corresponding to the selected behavior detection model according to the HTTP packet flow data, input the feature vector into the selected behavior detection model, and obtain the initial probability value output by the selected behavior detection model.
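The per-model loop described above can be sketched as follows. The models here are stand-in callables, not trained classifiers, and the feature extractors, the dictionary layout of `flow_data`, and all function names are hypothetical; the sketch only shows that each model receives its own feature vector and contributes one initial probability value.

```python
def initial_probabilities(flow_data, detectors):
    """detectors: list of (extract_features, model) pairs, one per stage.
    Each model is any callable mapping a feature vector to a probability."""
    probabilities = []
    for extract_features, model in detectors:
        feature_vector = extract_features(flow_data)   # model-specific input
        probabilities.append(model(feature_vector))    # initial probability
    return probabilities


def redirection_features(flow_data):
    # Hypothetical single feature: does any message carry a Location field?
    return [int(any("Location" in m["headers"] for m in flow_data))]


def download_features(flow_data):
    # Hypothetical single feature: is any payload a generic binary download?
    return [int(any(m["headers"].get("Content-Type") == "application/octet-stream"
                    for m in flow_data))]


def stub_model(feature_vector):
    # Stand-in for a trained behavior detection model.
    return 0.9 if feature_vector[0] else 0.1


flow_data = [{"headers": {"Location": "http://example.com/landing"}}]
detectors = [(redirection_features, stub_model),
             (download_features, stub_model)]
probs = initial_probabilities(flow_data, detectors)
print(probs)
```

In practice each stand-in would be replaced by a trained model, for example one exposing a probability output, and the resulting list would feed the cross feature computation.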
  • the multiple behavior detection models are respectively used to describe different stages in the EK attack behavior trajectory, that is, one behavior detection model can be used to describe one stage in the EK attack behavior trajectory, and the behavioral characteristics of EK attack behavior differ from stage to stage.
  • therefore, the feature vectors input to the behavior detection models are also different, that is, the feature vectors corresponding to the behavior detection models, determined according to the HTTP packet flow data of the host, are also different.
  • the multiple behavior detection models include at least two of the following models: a redirection detection model, an attack object screening detection model, a vulnerability exploitation detection model, and a malware download detection model.
  • the attack behavior trajectory of the EK includes the redirection phase, the attack object screening phase, the vulnerability exploitation phase, and the malware download phase.
  • the multiple behavior detection models can at least describe any two of the four stages, that is, the multiple behavior detection models include at least two models for describing any two of the four stages. .
  • the redirection detection model is used to describe the redirection stage in the EK attack behavior trajectory.
  • the EK attack redirects the webpage that the user is browsing.
  • the attack object screening detection model is used to describe the attack object screening stage in the EK's attack behavior trajectory.
  • the EK will filter the attack objects based on the information such as the operating system and browser version carried in the HTTP message transmitted by the host.
  • the exploit detection model is used to describe the exploit stage in the trajectory of EK's attack behavior.
  • the EK will analyze the vulnerabilities in the host and download the vulnerability files to the host. For example, there may be vulnerabilities in the low version of the Flash plug-in on the host.
  • EK will download a Flash vulnerability file to the host, and blast the host.
  • the malware download detection model is used to describe the malware download stage in the EK's attack behavior trajectory.
  • EK will download malware to the host, such as Trojan horse software, ransomware, etc.
  • the behavior detection model in the embodiment of the present application is a learning model based on a machine learning algorithm, for example, a learning model based on a random forest algorithm, a learning model based on a support vector machine (SVM) algorithm, or a learning model based on the XGBoost algorithm.
  • the algorithms adopted by the behavior detection models may be the same or different, which is not limited in the embodiment of the present application.
  • the behavior detection model A and the behavior detection model B are both learning models based on the random forest algorithm, or the behavior detection model A is a learning model based on the random forest algorithm, and the behavior detection model B is a learning model based on the SVM algorithm.
  • the attack detection device can obtain one or more features included in the feature vector corresponding to each behavior detection model from the HTTP packet stream data.
  • the data in an HTTP message includes multiple fields, each field represents a type of information, and each feature included in the feature vector can be obtained from these fields respectively.
  • the various behavior characteristics of EK attack behavior can be divided into public characteristics and unique characteristics.
  • the public characteristics are features that are common to all behavior detection models or common to some behavior detection models, and the unique characteristics are features unique to a certain behavior detection model. That is, the feature vector corresponding to each behavior detection model includes one or more features, of which some are public features and the others are unique features.
  • the HTTP message transmitted by the host may carry the Location field, and the Location field is used to redirect the web page being browsed on the host.
  • the feature vector corresponding to the redirection detection model includes the behavioral characteristics of the redirection stage, such as the message code of the HTTP message, the length of the URL field, and whether the Location field is carried or not.
  • the HTTP message code is the public feature of the redirection detection model and the vulnerability exploitation detection model
  • the length of the URL field is the public feature of the above four models
  • whether the HTTP message stream data carries the Location field is a unique feature of the redirection detection model.
  • the HTTP packet message code may be a message code vector determined according to the message code carried in the data in each HTTP packet, where a message code is, for example, 200, 400, or 404.
  • the length of the URL field is the total length or the average length or the longest length of all URL fields carried in the data in the one or more HTTP messages.
  • for whether the HTTP message stream data carries the Location field, '0' may be used to indicate that the field is not carried and '1' to indicate that it is carried, or other characters may be used, which is not limited in the embodiment of the present application.
  • the embodiment of the present application may adopt a one-hot encoding method to determine the message code vector.
  • for example, if the message codes used in the embodiment of the present application include Type 1, Type 2, and Type 3, the message code vector can be initialized to '0, 0, 0'; that is, the total number of elements of the initialized message code vector equals the total number of message code types used, and each element corresponds to one type and is initialized to '0'. If the one or more message codes carried in the HTTP packet stream data include Type 1 and Type 2 but not Type 3, the message code vector is determined to be '1, 1, 0'.
  • as another example, assume that the message codes used in the embodiment of the present application include 400 and 302, that the HTTP packet stream data includes data in three packets HTTP1, HTTP2, and HTTP3 whose message codes are 200, 400, and 404 respectively, that the average length of all URL fields carried in the data in these three messages is 60, and that the data in the HTTP1 message carries the Location field. In this case, the feature vector of the HTTP message flow data is [0, 1, 60, 1].
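The worked example above can be reproduced with a short sketch. The element order of the message code vector is not stated in the text; the order (302, 400) used below is an assumption chosen so that the result matches the vector [0, 1, 60, 1]. The message dictionary layout, the URL strings, and the function names are likewise hypothetical.

```python
def message_code_vector(codes_seen, code_types):
    """One element per message code type: 1 if that code was seen, else 0."""
    return [1 if code in codes_seen else 0 for code in code_types]


def redirection_feature_vector(messages, code_types):
    """Message code vector, average URL length, and Location flag."""
    codes_seen = {m["code"] for m in messages}
    url_lengths = [len(m["url"]) for m in messages]
    average_length = sum(url_lengths) / len(url_lengths)
    has_location = int(any("Location" in m["headers"] for m in messages))
    return message_code_vector(codes_seen, code_types) + [average_length,
                                                          has_location]


# Hypothetical messages HTTP1, HTTP2, HTTP3 with URL lengths 50, 60, 70.
messages = [
    {"code": 200, "url": "/" + "a" * 49,
     "headers": {"Location": "http://example.com/"}},
    {"code": 400, "url": "/" + "b" * 59, "headers": {}},
    {"code": 404, "url": "/" + "c" * 69, "headers": {}},
]
# Assumed element order of the message code vector: (302, 400).
vector = redirection_feature_vector(messages, code_types=(302, 400))
print(vector)
```

The total or the longest URL length could be substituted for the average by replacing the `average_length` expression with `sum(url_lengths)` or `max(url_lengths)`.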
  • the feature vector corresponding to the attack object screening detection model includes the behavior characteristics of the attack object screening phase, such as the length of the URL field, the operating system type, etc.
  • the operating system type is a unique feature of the attack target screening model.
  • the exploit detection model is used to describe the exploit stage in the EK's attack behavior trajectory, and the fields of the HTTP message transmitted by the host at this stage may be changed, including adding fields, tampering with data, encrypting fields, etc.
  • the feature vector corresponding to the exploit detection model includes the behavioral characteristics of the exploit stage, such as the message code of the HTTP message, the length of the URL field, whether the URL field contains a Base64-encoded substring, whether the X-Flash-Version field is carried, etc.
  • whether the URL field contains the Base64 mode encoding substring and whether it carries the X-Flash-Version field is a unique feature of the vulnerability exploitation detection model.
  • the URL field contains a Base64 mode encoding substring
  • the URL is encrypted, and the host may be attacked by EK.
  • the URL field of an HTTP message contains the Base64 encoding mode substring
  • the HTTP message stream data contains the X-Flash-Version field.
  • the feature vector corresponding to the malware download detection model includes the behavior characteristics of the malware download phase, for example, the message code of the HTTP message, the length of the URL field, and the Content-Type (content type) and Content-Length (content length) fields of the HTTP message. It should be noted that HTTP messages include HTTP request messages and HTTP response messages, and the HTTP response message carries the Content-Type and Content-Length fields.
  • the features included in the feature vector corresponding to each model described above can be extended according to actual conditions, for example with whether certain special characters or special fields are included, the number of fields included in the HTTP message header, and the length of the fields included in the header.
  • the number of fields included in the HTTP message header is the largest number of fields included in the headers of the one or more HTTP messages
  • the length of the fields included in the message header is the average length of the fields included in the HTTP message header. Which features are public features and which features are unique features can be specified in advance.
  • among the features described above, the message code of the HTTP message is denoted as X1
  • the length of the URL field is denoted as X2
  • the Location field is denoted as X3
  • whether the URL field contains a Base64 mode encoding substring is denoted as X4
  • whether the X-Flash-Version field is carried is denoted as X5
  • the operating system type is denoted as X6.
  • the feature vector corresponding to the redirection detection model is [X1, X2, X3]
  • the feature vector corresponding to the attack object screening detection model is [X1, X2, X6]
  • the feature vector corresponding to the vulnerability exploitation detection model is [X1, X2, X4, X5]
  • the feature vector corresponding to the malicious file download detection model is [X1, X2].
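  • The mapping from the features X1 to X6 to the four per-stage feature vectors can be sketched as a lookup table. The dictionary keys and function names below are hypothetical; only the feature subsets come from the text.

```python
# Feature subsets per behavior detection model, as listed in the text.
MODEL_FEATURES = {
    "redirection": ("X1", "X2", "X3"),
    "attack_object_screening": ("X1", "X2", "X6"),
    "exploit": ("X1", "X2", "X4", "X5"),
    "malware_download": ("X1", "X2"),
}


def model_inputs(features):
    """Project one dict of extracted features onto each model's feature vector."""
    return {
        model: [features[name] for name in names]
        for model, names in MODEL_FEATURES.items()
    }


def public_features():
    """Features shared by every model (the 'public' features in this sketch)."""
    return set.intersection(*(set(names) for names in MODEL_FEATURES.values()))


# Hypothetical extracted values, used only to show the projection.
extracted = {"X1": 1, "X2": 60, "X3": 1, "X4": 0, "X5": 0, "X6": 2}
inputs = model_inputs(extracted)
```

  • Under this table, X1 and X2 are shared by all four models, while X3, X6, and the pair X4 and X5 each appear in a single model.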
  • After determining the feature vector corresponding to a model, the attack detection device inputs the feature vector to the corresponding model, and uses the probability value output by the model as an initial probability value. This operation is performed for all four models to obtain four initial probability values.
  • the four initial probability values are denoted as E1, E2, E3, and E4.
  • Before determining multiple initial probability values through multiple behavior detection models based on the HTTP packet flow data of the host, the attack detection device can also filter the obtained HTTP packet flow data according to the filtering rule set, so as to filter out the data in HTTP messages that obviously do not need to be detected. After that, the attack detection device can determine the multiple initial probability values through the multiple behavior detection models based on the HTTP packet stream data remaining after filtering.
  • the filtering rule set includes but is not limited to the following rules:
  • the first filter rule: the matching item of the first filter rule is that the type of operating system is included in the reference type set, where the reference type set includes the types of operating systems whose probability of being attacked by the EK is less than the reference probability threshold, and the action of the first filter rule is: filter out.
  • the first filtering rule is used to filter out the data in the first target HTTP message.
  • the first target HTTP message refers to an HTTP message in which the carried operating system type is included in the reference type set; and/or
  • the second filter rule: the matching item of the second filter rule is that the destination address is an intranet address, and the action of the second filter rule is: filter out.
  • the second filtering rule is used to filter out the data in the second target HTTP message, where the second target HTTP message refers to an HTTP message whose carried destination address is an intranet address; and/or
  • the third filter rule: the matching item of the third filter rule is that the domain name is included in the reference domain name set, where the reference domain name set includes domain names whose access frequency is greater than the frequency threshold, and the action of the third filter rule is: filter out.
  • the third filtering rule is used to filter out the data in the third target HTTP message.
  • the third target HTTP message refers to an HTTP message whose carried domain name is included in the reference domain name set.
  • that is, the attack detection device can filter out the data in HTTP messages whose carried operating system type is a low-risk operating system, and/or whose carried destination address is an intranet address, and/or whose carried domain name is a domain name with a high access frequency.
  • the User-Agent (user agent) field carried in the data in the HTTP message contains the type of operating system, and the attack detection device can obtain the type of operating system carried in this field.
  • each rule included in the filtering rule set is set based on experience or based on statistical data.
  • the filtering rule set can be dynamically expanded. The principle is to reduce, as far as possible, the number of HTTP packets that must be examined for EK attacks, so as to avoid unnecessary data processing.
  • when the filtering rule set includes multiple rules and the attack detection device decides according to these rules whether to filter out the data in a certain HTTP packet, it can determine a filtering order according to the respective importance of these rules, or use any order as the filtering order.
  • the filter rule set includes a first filter rule, a second filter rule, and a third filter rule
  • the filter order is the first filter rule, the second filter rule, and the third filter rule.
  • the attack detection device can first determine whether the type of operating system carried in the data in the HTTP message is included in the reference type set. If it is, the data in the HTTP message is filtered out. If it is not, the device determines whether the destination address carried in the data in the HTTP message is an intranet address. If it is, the data in the HTTP message is filtered out. If it is not, the device determines whether the domain name carried in the data in the HTTP message is included in the reference domain name set. If it is, the data in the HTTP message is filtered out; if it is not, the data in the HTTP message is retained.
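  • The ordered rule check described above might look like the following sketch. The set contents, the packet keys (`os_type`, `dst_ip`, `domain`), and the function names are hypothetical, and treating "intranet address" as a private-range IP address via `ipaddress` is an assumption rather than something the text specifies.

```python
import ipaddress

LOW_RISK_OS = {"Linux"}            # hypothetical reference type set
POPULAR_DOMAINS = {"example.com"}  # hypothetical reference domain name set


def low_risk_os(pkt):
    """First rule: the operating system type is in the reference type set."""
    return pkt["os_type"] in LOW_RISK_OS


def intranet_destination(pkt):
    """Second rule: the destination is an intranet address (assumed: private range)."""
    return ipaddress.ip_address(pkt["dst_ip"]).is_private


def popular_domain(pkt):
    """Third rule: the domain name is in the reference domain name set."""
    return pkt["domain"] in POPULAR_DOMAINS


# Rules are evaluated in this order; any() stops at the first match.
FILTER_RULES = (low_risk_os, intranet_destination, popular_domain)


def keep_packet(pkt, rules=FILTER_RULES):
    """A packet is retained only if no rule matches it."""
    return not any(rule(pkt) for rule in rules)


def filter_stream(packets, rules=FILTER_RULES):
    return [pkt for pkt in packets if keep_packet(pkt, rules)]


stream = [
    {"os_type": "Linux", "dst_ip": "8.8.8.8", "domain": "a.invalid"},
    {"os_type": "Windows", "dst_ip": "10.0.0.5", "domain": "b.invalid"},
    {"os_type": "Windows", "dst_ip": "8.8.8.8", "domain": "example.com"},
    {"os_type": "Windows", "dst_ip": "8.8.8.8", "domain": "c.invalid"},
]
remaining = filter_stream(stream)
```

  • Keeping the rules in a tuple makes the filtering order explicit and lets the rule set be extended without changing the checking logic.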
  • the attack detection device uses multiple behavior detection models to determine multiple initial probability values based on the filtered HTTP packet flow data.
  • for the implementation of determining the multiple initial probability values, refer to the aforementioned related introduction; it is not repeated here.
  • Step 303 Determine a comprehensive probability value according to the multiple initial probability values.
  • the attack detection device can perform comprehensive processing on the multiple initial probability values to determine a comprehensive probability value, which is used to indicate the possibility that the host was attacked by the EK in the process of transmitting the first data stream.
  • the attack detection device can determine multiple cross features based on the multiple initial probability values, where the cross feature refers to the multiplication of two different initial probability values among the multiple initial probability values.
  • the attack detection device can generate a cross feature vector based on the multiple cross features, and input the cross feature vector into the correlation analysis model to obtain the comprehensive probability value output by the correlation analysis model. The correlation analysis model is used to comprehensively analyze multiple different stages in the EK's attack behavior trajectory.
  • the association analysis model is any trained machine learning model, such as a logistic regression model, a random forest model, and so on.
  • multiple cross features include F12, F13, F14, F23, F24, and F34 in Table 1 below.
  • the cross feature vector is [F12, F13, F14, F23, F24, F34], or [F12, F13, F14, F23, F34, F24], etc., that is,
  • the order of the elements in the cross feature vector can be any defined order.
  • the attack detection device can input the cross feature vector into the correlation analysis model to obtain the output comprehensive probability value P.
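  • As a minimal sketch of the cross feature construction, each cross feature Fij can be computed as the product Ei × Ej for i < j. The function name and the example probability values are hypothetical; the naming F12 to F34 follows the text.

```python
from itertools import combinations


def cross_features(values):
    """Return {"Fij": Ei * Ej} for every pair i < j (1-based indices)."""
    indices = range(1, len(values) + 1)
    return {
        f"F{i}{j}": values[i - 1] * values[j - 1]
        for i, j in combinations(indices, 2)
    }


# Hypothetical initial probability values E1..E4.
initial = [0.5, 0.25, 1.0, 0.0]
crosses = cross_features(initial)
cross_vector = list(crosses.values())  # one fixed, defined element order
```

  • Four initial probability values give six cross features; the dictionary preserves one defined order, which is then used consistently for training and detection.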
  • the attack detection device can also perform weighted calculation on the multiple initial probability values to obtain the comprehensive probability value.
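  • The weighted alternative can be sketched as a normalized weighted sum. The text does not specify the weights or whether they are normalized, so both the normalization and the example weights below are assumptions.

```python
def weighted_probability(initial_probabilities, weights):
    """Weighted average of the initial probability values (weights are hypothetical)."""
    if len(initial_probabilities) != len(weights):
        raise ValueError("one weight is required per initial probability value")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(p * w for p, w in zip(initial_probabilities, weights))
    return weighted_sum / total


# Equal weights reduce to the plain average of E1..E4.
comprehensive = weighted_probability([1.0, 0.5, 0.0, 0.5], [1, 1, 1, 1])
```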
  • the attack detection device can also perform vulnerability file detection and malware detection on the obtained HTTP packet stream data before determining multiple cross-features based on multiple initial probability values, to obtain the vulnerability file detection result and the malware detection result .
  • the method of performing vulnerability file detection and malware detection on HTTP packet stream data is a detection method based on IPS.
  • IPS can analyze the fields and characters included in the HTTP packet stream data to obtain the detection results of the vulnerable files and the malware detection results.
  • the vulnerability file detection result and the malware detection result are both 0 or 1.
  • a vulnerability file detection result of 0 indicates that the vulnerability file has not been downloaded
  • a value of 1 indicates that the vulnerability file has been downloaded
  • a malware detection result of 0 indicates that no malware has been downloaded.
  • a value of 1 indicates that malware has been downloaded.
  • the vulnerability file detection result and the malware detection result are denoted as E5 and E6, respectively.
  • the attack detection device can determine the multiple cross features according to the multiple initial probability values, the vulnerability file detection result, and the malware detection result.
  • in this case, a cross feature is obtained by multiplying two different data among the multiple initial probability values, the vulnerability file detection result, and the malware detection result.
  • the cross-features include F 12 , F 13, and F 13 in Table 2 below.
  • the attack detection device can generate a probability matrix based on the multiple initial probability values, the vulnerability file detection result, and the malware detection result, then select multiple elements from the probability matrix according to the cross feature selection strategy, and use the selected elements as the multiple cross features.
  • the probability matrix is a matrix with X rows and X columns, where X is the total number of the multiple initial probability values, the vulnerability file detection result, and the malware detection result, and both the X rows and the X columns correspond to the multiple initial probability values, the vulnerability file detection result, and the malware detection result.
  • the elements in the probability matrix are obtained by multiplying the two crossed data.
  • the cross feature selection strategy is a strategy determined based on experience to filter out redundant features.
  • the probability matrix D is a matrix with 6 rows and 6 columns, that is, X is equal to 6.
  • the element x in the probability matrix D is a redundant feature.
  • the redundant features include the features obtained by multiplying a piece of data by itself and the features that, based on experience, can be ignored for improving the accuracy of attack detection. Elements F1 to F14 are the multiple cross features selected according to the cross feature selection strategy.
  • the attack detection device can generate cross feature vectors according to the aforementioned related introduction, input the cross feature vectors into the correlation analysis model, and output a comprehensive probability value.
  • the cross feature vector is [F 13 , F 14 ,..., F 56 ].
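  • The probability matrix and the selection step can be sketched as follows. Treating the lower triangle as redundant (because multiplication is commutative) and ignoring the pair (1, 2) are assumptions; the ignored pair is chosen only so that the sketch yields 14 features starting at F13 and ending at F56, like the vector in the text.

```python
def probability_matrix(values):
    """X-by-X matrix whose element (i, j) is values[i] * values[j]."""
    return [[a * b for b in values] for a in values]


def select_cross_features(matrix, ignored_pairs=frozenset()):
    """Keep upper-triangle elements, skipping the diagonal and ignored pairs.

    ignored_pairs holds 1-based (i, j) index pairs; it stands in for the
    experience-based part of the cross feature selection strategy.
    """
    size = len(matrix)
    selected = {}
    for i in range(size):
        for j in range(i + 1, size):
            if (i + 1, j + 1) in ignored_pairs:
                continue
            selected[f"F{i + 1}{j + 1}"] = matrix[i][j]
    return selected


# Hypothetical E1..E4 plus the 0/1 detection results E5 and E6.
values = [0.5, 0.5, 1.0, 0.0, 1, 0]
D = probability_matrix(values)
selected = select_cross_features(D, ignored_pairs={(1, 2)})
cross_vector = list(selected.values())
```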
  • Step 304 If the comprehensive probability value is greater than the preset probability threshold, it is determined that there is an EK attack behavior in the process of the host transmitting the first data stream.
  • if the comprehensive probability value is greater than the preset probability threshold, the attack detection device determines that there is an EK attack behavior in the process of the host transmitting the first data stream, that is, the host is attacked by the EK.
  • the attack detection device determines that there is an EK attack behavior in the process of the host transmitting the first data stream, that is, the host is attacked by the EK. If the comprehensive probability value is 60%, the attack detection device determines that there is no EK attack behavior in the process of the host transmitting the first data stream, that is, the host has not been attacked by the EK within the reference time period.
  • Fig. 5 is a flowchart of another method for detecting an attack provided by an embodiment of the present application.
  • the attack detection device can obtain the HTTP packet stream data transmitted by the host within the reference time period, and filter out the data in the HTTP packet that obviously does not need to be detected according to the filtering rule set.
  • the attack detection device inputs the remaining HTTP packet flow data after filtering into the redirection detection model, the attack object screening detection model, the vulnerability exploitation detection model, and the malware download detection model.
  • the HTTP packet flow data is processed through these four models. After processing, the initial probability values E1, E2, E3, and E4 are obtained.
  • the attack detection device can input the HTTP packet stream data remaining after filtering into the IPS, and obtain the vulnerability file detection result E5 and the malware detection result E6 after the IPS detection. After that, the attack detection device inputs the four initial probability values, the vulnerability file detection result, and the malware detection result into the correlation analysis model, and determines the detection result according to the comprehensive probability value P output by the correlation analysis model. If the comprehensive probability value P is greater than the preset probability threshold, the detection result is that the host is attacked by the EK within the reference time period. If the comprehensive probability value P does not exceed the preset probability threshold, the detection result is that the host is not attacked by the EK within the reference time period.
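  • The flow of Fig. 5 after filtering can be summarized in a short sketch. All callables passed in (`stage_models`, `ips_detect`, `association_model`) are hypothetical stand-ins for the trained models and the IPS, and the averaging model in the usage lines is only a placeholder, not the correlation analysis model of the text.

```python
def detect_ek_attack(stream, stage_models, ips_detect, association_model, threshold):
    """Return (attacked, P) for one filtered HTTP packet stream.

    stage_models: four callables returning E1..E4 for the stream.
    ips_detect: callable returning the 0/1 results (E5, E6).
    association_model: callable mapping [E1..E6] to the comprehensive value P.
    """
    initial = [model(stream) for model in stage_models]
    e5, e6 = ips_detect(stream)
    p = association_model(initial + [e5, e6])
    # Strictly greater than the threshold means an EK attack is reported.
    return p > threshold, p


# Placeholder components, used only to exercise the control flow.
stage_models = [lambda s: 0.9] * 4
ips_detect = lambda s: (1, 0)
average = lambda values: sum(values) / len(values)

attacked, p = detect_ek_attack([], stage_models, ips_detect, average, threshold=0.5)
```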
  • after determining that there is an EK attack behavior in the process of the host transmitting the first data stream, the attack detection device can report the detection result to the network management device, and the network management device can take countermeasures according to the detection result.
  • the attack detection device in the embodiment of the present application uses multiple behavior detection models to determine whether the host has an EK attack behavior in the process of transmitting the first data stream.
  • the multiple attack behavior detection models are multiple models determined in advance based on training samples.
  • next, a method for determining the multiple behavior detection models provided in the embodiments of the present application is introduced. The method is applied to the attack detection device or to other computer equipment; the following description takes application to the attack detection device as an example. That is, before determining a first probability value set including a plurality of first probability values through multiple behavior detection models according to the HTTP packet flow data, referring to FIG. 6, the attack behavior detection method further includes step 401 and step 402.
  • Step 401 Obtain a plurality of training samples, and a sample label corresponding to each training sample in the plurality of training samples.
  • the attack detection device can obtain the HTTP packet transmitted by each sample host in the network from the HTTP proxy device, so as to determine a plurality of training samples.
  • the training sample includes the data in one or more sample HTTP messages belonging to the second data stream.
  • the sample label is used to indicate whether the corresponding training sample is a positive training sample or a negative training sample.
  • a positive training sample refers to HTTP packet flow data that has not been attacked by the EK.
  • a negative training sample refers to HTTP packet flow data attacked by the EK. It should be noted that the second data streams to which different training samples belong may be the same or different.
  • the attack detection device can obtain multiple sample HTTP packet stream data, and the sample HTTP packet stream data refers to the data in the HTTP packet in the second data stream within a reference time period before the current time. After that, the attack detection device can filter each sample HTTP packet flow data in the multiple sample HTTP packet flow data according to the filtering rule set, and determine the remaining multiple sample HTTP packet flow data after filtering as multiple Training samples.
  • the attack detection device can perform a preprocessing operation on the obtained HTTP packet according to the definition of the data flow and the reference duration to obtain multiple training samples.
  • the attack detection device performs a preprocessing operation on the obtained HTTP packet to obtain an event list as follows, and the sample HTTP packet flow data in each row of Table 4 is a training sample.
  • T1 in Table 4 is the reference time length, and the definition of the data stream is a two-tuple.
  • Each sample HTTP packet stream data includes data in one or more sample HTTP packets, and the data in the one or more sample HTTP packets can be arranged in sequence according to the time order of transmission.
  • the attack detection device can filter the data in one or more sample HTTP packets included in each sample HTTP packet flow data according to the aforementioned filtering rule set, and treat the filtered sample HTTP packet flow data as multiple Training samples.
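  • The preprocessing into per-stream samples can be sketched as grouping packets by a two-tuple within the reference time period. The text does not say which two fields form the two-tuple, so the `(src, dst)` key, the packet keys, and the numeric timestamps below are assumptions.

```python
from collections import defaultdict


def build_samples(packets, now, reference_duration):
    """Group packets from [now - reference_duration, now] by an assumed two-tuple."""
    start = now - reference_duration
    streams = defaultdict(list)
    for pkt in packets:
        if start <= pkt["time"] <= now:
            streams[(pkt["src"], pkt["dst"])].append(pkt)
    # Within each stream, order the packets by transmission time.
    return {
        key: sorted(pkts, key=lambda p: p["time"])
        for key, pkts in streams.items()
    }


packets = [
    {"time": 95, "src": "host-a", "dst": "site-1"},
    {"time": 90, "src": "host-a", "dst": "site-1"},
    {"time": 97, "src": "host-b", "dst": "site-2"},
    {"time": 10, "src": "host-a", "dst": "site-1"},  # outside the reference period
]
samples = build_samples(packets, now=100, reference_duration=20)
```

  • Each value of the resulting dictionary corresponds to one row of the event list, that is, one candidate training sample before filtering.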
  • the attack detection device can determine whether each training sample is a positive training sample or a negative training sample according to the actual situation to determine the sample label corresponding to each training sample.
  • the label of the positive training sample can be '1'
  • the label of the negative training sample can be '0'.
  • the negative training sample may be known HTTP packet stream data attacked by EK, including real data and/or simulated data.
  • the simulated data refers to HTTP packet stream data generated by simulating an EK attack behavior.
  • Step 402 According to the multiple training samples and the sample label corresponding to each training sample in the multiple training samples, train multiple initial detection models to obtain multiple behavior detection models.
  • the attack detection device can separately train each of the multiple initial detection models to obtain multiple behavior detection models.
  • multiple initial detection models respectively correspond to different stages in the EK's attack behavior trajectory, that is, the initial detection models selected according to the behavior characteristics of different stages in the EK's attack behavior trajectory.
  • the attack detection device can select an initial detection model from the multiple initial detection models and perform the following operations on the selected initial detection model, until the operations have been performed on each of the multiple initial detection models: determine the sample feature set corresponding to the selected initial detection model according to the sample HTTP messages included in each training sample in the multiple training samples,
  • where the sample feature set includes multiple sample feature vectors in one-to-one correspondence with the multiple training samples;
  • input the multiple sample feature vectors into the selected initial detection model, and train the selected initial detection model so that its output for each sample feature vector is the sample label of the corresponding training sample, thereby obtaining a behavior detection model.
  • the sample feature vectors used to train each initial detection model are also different.
  • the attack detection device can determine the sample feature set corresponding to the corresponding initial detection model based on the sample HTTP messages included in the multiple training samples and the behavior features included in the feature vector corresponding to each behavior detection model.
  • multiple behavior detection models are still used, including a redirection detection model, an attack object screening detection model, a vulnerability exploitation detection model, and a malware download detection model.
  • the feature vector corresponding to the redirection detection model is [X1, X2, X3]
  • the feature vector corresponding to the attack object screening detection model can be [X1, X2, X6]
  • the feature vector corresponding to the exploit detection model can be [X1, X2, X4, X5]
  • the feature vector corresponding to the malicious file download detection model can be [X1, X2].
  • the sample feature set of the initial detection model corresponding to the redirection detection model can be:
  • where n is the total number of sample feature vectors included in the sample feature set.
  • After the attack detection device determines the sample feature set corresponding to the selected initial detection model, it can input the multiple sample feature vectors included in the sample feature set into the selected initial detection model to train it, so that the output of the selected initial detection model is the sample label corresponding to the corresponding training sample in the multiple training samples. That is, in the embodiment of the present application, the process of training multiple initial detection models is a process of supervised learning.
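  • The supervised training of one initial detection model can be illustrated with a small logistic regression trained by stochastic gradient descent. The text does not fix the model type for the behavior detection models, so the choice of logistic regression, the hyperparameters, and the four sample feature vectors [X1, X2, X3] below are assumptions for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_initial_model(sample_vectors, sample_labels, epochs=1000, learning_rate=0.1):
    """Fit weights and a bias so the output approaches each sample label (0 or 1)."""
    weights = [0.0] * len(sample_vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(sample_vectors, sample_labels):
            predicted = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = predicted - y  # gradient of the log loss w.r.t. the logit
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def predict_probability(model, x):
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical sample feature set: [code flag, scaled URL length, Location flag].
samples = [[1, 0.9, 1], [1, 0.8, 1], [0, 0.2, 0], [0, 0.1, 0]]
labels = [1, 1, 0, 0]  # sample labels as given by the training data
model = train_initial_model(samples, labels)
```

  • The same routine would be run once per initial detection model, each time with the sample feature set built from that model's own feature vector.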
  • the association analysis model in the above attack detection model is also a model determined in advance based on training samples, that is, after multiple behavior detection models are obtained, referring to FIG. 7, the attack behavior detection method further includes step 403 and step 404.
  • Step 403 Determine the sample cross feature set according to the multiple behavior detection models and the sample feature set corresponding to each behavior detection model in the multiple behavior detection models.
  • the attack detection device can select a training sample from the multiple training samples and perform the following processing on the selected training sample, until each training sample in the multiple training samples has been processed: from the sample feature sets corresponding to the multiple behavior detection models, input the sample feature vectors corresponding to the selected training sample into the respective behavior detection models to obtain the sample probability values respectively output by the multiple behavior detection models, thereby obtaining multiple sample probability values; determine multiple sample cross features according to the multiple sample probability values, where a sample cross feature refers to the product obtained by multiplying two different data among the multiple sample probability values; and generate a sample cross feature vector according to the multiple sample cross features.
  • the attack detection device can determine the sample cross feature set according to the multiple sample cross feature vectors in one-to-one correspondence with the multiple training samples, that is, the sample cross feature set includes multiple sample cross feature vectors in one-to-one correspondence with the multiple training samples.
  • the attack detection device can also perform vulnerability file detection and malware detection on the multiple training samples, respectively, to obtain the vulnerability file detection result and the malware detection result corresponding to each training sample in the multiple training samples.
  • the attack detection device can determine the sample cross feature set according to the multiple behavior detection models, the sample feature set corresponding to each behavior detection model in the multiple behavior detection models, and the vulnerability file detection result and the malware detection result corresponding to each training sample in the multiple training samples.
  • the attack detection device can perform vulnerability file detection and malware detection on each training sample according to the IPS.
  • the attack detection device can select a training sample from the multiple training samples, and perform the following processing on the selected training sample until each training sample in the multiple training samples is processed:
  • from the sample feature sets corresponding to the multiple behavior detection models, the sample feature vectors corresponding to the selected training sample are input into the respective behavior detection models, and the sample probability values respectively output by the multiple behavior detection models are obtained, thereby obtaining multiple sample probability values; multiple sample cross features are determined according to the multiple sample probability values and the vulnerability file detection result and the malware detection result corresponding to the selected training sample.
  • a sample cross feature refers to the product obtained by multiplying two different data among the multiple sample probability values, the vulnerability file detection result corresponding to the selected training sample, and the malware detection result corresponding to the selected training sample;
  • a sample cross feature vector is generated according to the multiple sample cross features.
  • for the implementation of determining the multiple sample cross features, refer to the above description of determining multiple cross features; it is not repeated here.
  • Step 404 Input the multiple sample cross feature vectors into the initial analysis model, and train the initial analysis model so that the output of the initial analysis model is the sample label corresponding to the corresponding training sample in the multiple training samples, thereby obtaining Association analysis model.
  • the attack detection device can input the multiple sample cross feature vectors into the initial analysis model, and train the initial analysis model so that its outputs are respectively the sample labels corresponding to the corresponding training samples in the multiple training samples, so as to obtain the association analysis model. That is, the process of training to obtain the association analysis model in the embodiment of the present application is a process of supervised learning.
  • the association analysis model is a model determined according to any machine learning algorithm, which is not limited in the embodiment of the present application. If the association analysis model is a logistic regression model, training the model also yields the weight corresponding to each sample cross feature in the sample cross feature vector. The weight is used to characterize the importance of each cross feature. If the association analysis model determines that the host is attacked by the EK, the alarm information can carry the reason why the host is determined to be attacked by the EK.
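  • One way the learned weights could feed the alarm information is to rank the cross features by their contribution to the decision. Using weight × feature value as the contribution is an assumption, as are the function name and the example numbers; the text only states that the weights characterize the importance of each cross feature.

```python
def top_contributions(weights, cross_features, k=3):
    """Rank cross features by |weight * value| and return the k largest."""
    contributions = {
        name: weights[name] * value for name, value in cross_features.items()
    }
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


# Hypothetical logistic regression weights and cross feature values.
weights = {"F12": 2.0, "F13": 0.5, "F34": -1.0}
features = {"F12": 0.5, "F13": 1.0, "F34": 0.25}
reasons = top_contributions(weights, features, k=2)
```

  • The returned names could then be mapped back to the stages they combine, for example F12 to the first and second behavior detection models, when composing the alarm text.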
  • this solution can obtain the HTTP packet flow data of the host within a period of time and process it through multiple behavior detection models to determine multiple initial probability values. Because the multiple behavior detection models are respectively used to describe the multiple different stages, this solution can completely describe the EK's attack behavior trajectory.
  • the multiple initial probability values can be comprehensively processed to obtain a comprehensive probability value, that is, this solution can comprehensively analyze the behavior patterns of EK attacks at the various stages and more accurately determine the probability that the host is attacked by the EK in the process of transmitting the data stream, that is, detect the EK's attack behavior more accurately.
  • this solution can quickly and accurately detect EK attacks without seriously consuming the resources of the host itself.
  • because the HTTP packet stream data obtained in this solution only contains regular data specified by the network protocol, the risk of infringing user privacy in this solution is very low compared with methods that obtain script code for parsing.
  • FIG. 8 is a schematic structural diagram of an attack detection device provided by an embodiment of the present application.
  • the attack detection device 800 can be implemented as part or all of an attack detection device by software, hardware, or a combination of the two.
  • the attack detection device may be the attack detection equipment shown in Figure 1. Referring to FIG. 8, the device includes: a first obtaining module 801, a first determining module 802, a second determining module 803, and a third determining module 804.
  • the first obtaining module 801 is configured to obtain HTTP packet stream data transmitted by the host within a reference time period.
  • the HTTP packet stream data includes data in one or more HTTP packets, the one or more HTTP packets belong to the first data stream, and the reference time period is the time period that precedes the current time and extends back from the current time by the reference duration;
  • the first determining module 802 is used to determine multiple initial probability values through multiple behavior detection models according to the HTTP packet flow data.
  • the multiple behavior detection models are used to describe different stages of the EK's attack behavior trajectory.
  • each initial probability value refers to the probability value output by one behavior detection model among the multiple behavior detection models;
  • the second determining module 803 is configured to determine a comprehensive probability value according to the multiple initial probability values, and the comprehensive probability value is used to indicate the possibility of the host being attacked by the EK in the process of transmitting the first data stream;
  • the third determining module 804 is configured to determine, if the comprehensive probability value is greater than the preset probability threshold, that there is an EK attack behavior in the process of the host transmitting the first data stream.
  • the first determining module 802 is specifically configured to:
  • the multiple behavior detection models include at least two of the following models: a redirection detection model, an attack object screening detection model, a vulnerability exploitation detection model, and a malware download detection model.
  • the second determining module 803 includes:
  • the first determining unit is configured to determine multiple cross features according to the multiple initial probability values, where the cross feature refers to a product obtained by multiplying two different initial probability values among the multiple initial probability values;
  • a generating unit configured to generate a cross feature vector according to the multiple cross features
  • the comprehensive analysis unit is used to input the cross feature vector into the correlation analysis model to obtain the comprehensive probability value output by the correlation analysis model.
  • the correlation analysis model is used to comprehensively analyze multiple different stages in the EK's attack behavior trajectory.
  • the second determining module 803 further includes:
  • the second determining unit is used to perform vulnerability file detection and malware detection on the HTTP packet stream data to obtain the vulnerability file detection result and the malware detection result;
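  • the first determining unit is specifically configured to:
  • determine the multiple cross features according to the multiple initial probability values, the vulnerability file detection result, and the malware detection result, where a cross feature is the product of two different items among the multiple initial probability values, the vulnerability file detection result, and the malware detection result.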
  • the first determining unit is specifically configured to:
  • generate a probability matrix according to the multiple initial probability values, the vulnerability file detection result, and the malware detection result, where the probability matrix has X rows and X columns, X is the total number of initial probability values, vulnerability file detection results, and malware detection results, the X rows and the X columns each correspond to the multiple initial probability values, the vulnerability file detection result, and the malware detection result, and each element of the probability matrix is the product of the two items that intersect at that position;
  • select multiple elements from the probability matrix according to a cross feature selection strategy, and use the selected elements as the multiple cross features.
  • the device 800 further includes:
  • the first filtering unit is configured to filter the HTTP packet stream data according to a filtering rule set;
  • the first determining module is specifically configured to:
  • determine the multiple initial probability values through the multiple behavior detection models according to the HTTP packet stream data remaining after filtering.
  • the filtering rule set includes but is not limited to the following rules:
  • a first filtering rule, whose matching item is a reference type set containing one or more operating system types, where the reference type set includes operating system types whose probability of being attacked by the EK is less than a reference probability threshold, and whose action is: filter out. The first filtering rule is used to filter out the data in a first target HTTP packet, that is, an HTTP packet whose carried operating system type is included in the reference type set; and/or
  • a second filtering rule, whose matching item is one or more intranet addresses, and whose action is: filter out. The second filtering rule is used to filter out the data in a second target HTTP packet, that is, an HTTP packet whose carried destination address is an intranet address; and/or
  • a third filtering rule, whose matching item is a reference domain name set containing one or more domain names, where the reference domain name set includes domain names whose access frequency is greater than a frequency threshold, and whose action is: filter out. The third filtering rule is used to filter out the data in a third target HTTP packet, that is, an HTTP packet whose carried domain name is included in the reference domain name set.
  • the device 800 further includes:
  • the second acquisition module is configured to acquire multiple training samples and the sample label corresponding to each of the multiple training samples, where a training sample includes data in one or more sample HTTP packets belonging to a second data stream, and a sample label indicates whether the corresponding training sample is a positive training sample or a negative training sample;
  • a positive training sample is HTTP packet stream data that is not attacked by the EK, and a negative training sample is HTTP packet stream data that is attacked by the EK;
  • the first training module is configured to train multiple initial detection models according to the multiple training samples and the sample label corresponding to each of them, to obtain the multiple behavior detection models, where the multiple initial detection models respectively correspond to different stages of the EK's attack behavior trajectory.
  • the second acquisition module includes:
  • the acquiring unit is configured to acquire multiple sample HTTP packet stream data, where the sample HTTP packet stream data refers to the data in the HTTP packet in the second data stream within a reference time period before the current time;
  • the second filtering unit is configured to filter each sample HTTP packet flow data among the multiple sample HTTP packet flow data according to the filtering rule set;
  • the third determining unit is configured to determine the multiple sample HTTP packet stream data remaining after filtering as multiple training samples.
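  • the first training module is specifically configured to:
  • select one initial detection model from the multiple initial detection models and perform the following operations for the selected model, until the operations have been performed for every one of the multiple initial detection models: determine, according to the sample HTTP packets included in each of the multiple training samples, the sample feature set corresponding to the selected initial detection model, where the sample feature set includes multiple sample feature vectors in one-to-one correspondence with the multiple training samples; and input the multiple sample feature vectors into the selected initial detection model and train it so that its output for each sample feature vector is the sample label of the corresponding training sample, thereby obtaining one behavior detection model.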
  • the device 800 further includes:
  • the second acquisition module is configured to acquire multiple training samples and the sample label corresponding to each of the multiple training samples, where a training sample includes data in one or more sample HTTP packets belonging to a second data stream, and a sample label indicates whether the corresponding training sample is a positive training sample or a negative training sample;
  • a positive training sample is HTTP packet stream data that is not attacked by the EK, and a negative training sample is HTTP packet stream data that is attacked by the EK;
  • the first training module is configured to train multiple initial detection models according to the multiple training samples and the sample label corresponding to each of them, to obtain the multiple behavior detection models, where the multiple initial detection models respectively correspond to different stages of the EK's attack behavior trajectory;
  • the third determining module is configured to determine a sample cross feature set according to the multiple behavior detection models and the sample feature set corresponding to each of them, where the sample cross feature set includes multiple sample cross feature vectors in one-to-one correspondence with the multiple training samples;
  • the second training module is configured to input the multiple sample cross feature vectors into an initial analysis model and train it so that its output for each sample cross feature vector is the sample label of the corresponding training sample, thereby obtaining the correlation analysis model.
  • the device 800 further includes:
  • the fourth determining module is used to perform vulnerability file detection and malware detection on the multiple training samples, respectively, to obtain the vulnerability file detection result and the malware detection result corresponding to each training sample in the multiple training samples;
  • the third determining module is configured to:
  • determine the sample cross feature set according to the multiple behavior detection models, the sample feature set corresponding to each of the multiple behavior detection models, and the vulnerability file detection result and malware detection result corresponding to each of the multiple training samples.
  • the third determining module is specifically configured to:
  • select one training sample from the multiple training samples and perform the following processing on it, until every one of the multiple training samples has been processed:
  • input the sample feature vectors that correspond to the selected training sample in the sample feature sets of the multiple behavior detection models into the respective behavior detection models, and obtain the sample probability value output by each model, thereby obtaining multiple sample probability values;
  • determine multiple sample cross features, where a sample cross feature is the product of two different items among the multiple sample probability values and the vulnerability file detection result and malware detection result corresponding to the selected training sample;
  • generate one sample cross feature vector according to the multiple sample cross features.
  • because the EK's attack behavior trajectory includes multiple different stages, this solution obtains the host's HTTP packet stream data over a period of time
  • and processes it through multiple behavior detection models to determine multiple initial probability values; because the multiple behavior detection models respectively describe the multiple different stages, this solution can describe the EK's attack behavior trajectory completely.
  • the multiple initial probability values can then be processed together to obtain a comprehensive probability value; that is, this solution can comprehensively analyze the behavior patterns of the EK attack at each stage and determine more accurately
  • the probability that the host is attacked by the EK while transmitting the data stream, that is, detect EK attack behavior more accurately.
  • this solution can therefore detect EK attacks quickly and accurately without heavily consuming the host's own resources.
  • because the HTTP packet stream data obtained in this solution contains only regular data specified by the network protocol, the risk of infringing user privacy is very low compared with methods that obtain script code for parsing.
  • when the attack behavior detection device provided in the above embodiment detects an attack, the division into the above functional modules is used only as an example. In practical applications, the above functions can be assigned to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
  • the attack behavior detection device provided in the foregoing embodiment and the attack behavior detection method embodiment belong to the same concept. For the specific implementation process, see the method embodiment; details are not repeated here.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from a website, computer, server, or data center.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a tape), an optical medium (for example, a digital versatile disc (DVD)), a semiconductor medium (for example, a solid state disk (SSD)), or the like.
  • the computer-readable storage medium mentioned in this application may be a non-volatile storage medium, in other words, it may be a non-transitory storage medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

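This application discloses an attack behavior detection method, an apparatus, and an attack detection device, and belongs to the field of network security technologies. The method includes: obtaining HTTP packet stream data transmitted by a host within a reference time period; determining multiple initial probability values through multiple behavior detection models; determining a comprehensive probability value according to the multiple initial probability values; and, if the comprehensive probability value is greater than a preset probability threshold, determining that attack behavior of an exploit kit (EK) is detected. Because the multiple behavior detection models respectively describe different stages of the EK's attack behavior trajectory, the solution can describe the trajectory completely and combine the initial probability values of the stages to detect EK attack behavior more accurately. In addition, the solution does not heavily consume the host's own resources, and because the obtained data contains only regular data specified by the network protocol, the risk of infringing user privacy is very low compared with methods that obtain and parse script code.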

Description

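Attack behavior detection method, apparatus, and attack detection device
This application claims priority to Chinese Patent Application No. 202010123839.X, filed on February 27, 2020 and entitled "Attack behavior detection method, apparatus, and attack detection device", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of network security technologies, and in particular, to an attack behavior detection method, an apparatus, and an attack detection device.
Background
Currently, malicious actors can use an exploit kit (EK) to spread malware and thereby attack hosts such as user terminals. An EK is a set of tools, and can also be regarded as an attack technique that spreads malware through downloads. When a host visits a malicious website that contains an EK, the EK uses vulnerability information in the host's browsing environment to select corresponding malware with which to attack the host. If the EK's attack behavior can be detected in time, the user can be prompted to take countermeasures promptly, which minimizes the user's losses.
In the related art, while a host is visiting a website, it can collect and inspect the website's script code, parse the script code, and generate signatures of the script code. The host then compares the generated signatures with the signatures in a stored signature library to determine whether EK attack behavior exists while the host visits the website. The signatures in the stored signature library are generated by a signature algorithm from the malicious code of known EKs.
However, the collected script code usually contains the user's private data, which creates a risk of infringing user privacy. In addition, parsing script code consumes substantial processor and memory resources, which degrades the host's performance.
Summary
This application provides an attack behavior detection method, an apparatus, and an attack detection device, which can reduce the risk of infringing user privacy that exists in the related art and improve the accuracy of attack behavior detection without consuming the host's resources. The technical solutions are as follows: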
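According to a first aspect, an attack behavior detection method is provided. The method includes:
obtaining Hypertext Transfer Protocol (HTTP) packet stream data transmitted by a host within a reference time period, where the HTTP packet stream data includes data in one or more HTTP packets, the one or more HTTP packets belong to a first data stream, and the reference time period is the period that precedes the current time and extends back from it by a reference duration; determining multiple initial probability values through multiple behavior detection models according to the HTTP packet stream data, where the multiple behavior detection models respectively describe different stages of the EK's attack behavior trajectory, and an initial probability value is the probability value output by one of the multiple behavior detection models; determining a comprehensive probability value according to the multiple initial probability values, where the comprehensive probability value indicates the likelihood that the host is attacked by the EK while transmitting the first data stream; and, if the comprehensive probability value is greater than a preset probability threshold, determining that EK attack behavior exists while the host transmits the first data stream.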
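The method of the first aspect can be read as a small pipeline: per-stage models produce initial probability values, these are combined, and the result is compared with a threshold. The following Python sketch is illustrative only. The `StageModel` callables, the `combine` function, the stand-in models, and the threshold value 0.5 are all assumptions; the application does not give numeric values, and its actual combining step is a trained correlation analysis model rather than a mean.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical types: a stage model maps HTTP packet stream data to a probability.
HttpFlowData = List[Dict[str, str]]
StageModel = Callable[[HttpFlowData], float]


def detect_ek_attack(
    flow_data: HttpFlowData,
    stage_models: Sequence[StageModel],
    combine: Callable[[List[float]], float],
    threshold: float = 0.5,  # assumed value; the application only says "preset"
) -> bool:
    """Return True if the comprehensive probability exceeds the threshold."""
    # One initial probability value per behavior detection model (per stage).
    initial_probs = [model(flow_data) for model in stage_models]
    # Combine the per-stage values into one comprehensive probability value.
    comprehensive = combine(initial_probs)
    return comprehensive > threshold


# Stand-in models that return fixed probabilities, for illustration only.
stand_in_models = [lambda d: 0.9, lambda d: 0.8, lambda d: 0.7, lambda d: 0.95]


def mean(probs: List[float]) -> float:
    """Placeholder for the combining step (not the trained model)."""
    return sum(probs) / len(probs)
```

Keeping the stage models and the combiner as plain callables makes it easy to swap in trained models later without changing the decision logic.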
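In this application, the EK attack behavior trajectory includes multiple different stages. While running, a host transmits HTTP packet stream (data stream) data, and the HTTP packet stream data transmitted by a host under EK attack carries some behavioral features of the EK attack. On this basis, the attack detection device can obtain the HTTP packet stream data transmitted by the host within the reference time period and then analyze it to detect whether the host is attacked by the EK, that is, whether EK attack behavior exists while the host transmits the data stream.
If a host is attacked by an EK, HTTP packets continue to be transmitted between the host and the EK's attacking device for some time. The solution therefore needs to obtain HTTP packet stream data over a period of time, that is, within the reference time period. The HTTP packet stream data includes data in one or more HTTP packets that belong to the first data stream, where the first data stream is the data stream formed by the HTTP packets transmitted between the host and some device, and the reference time period is the period that precedes the current time and extends back from it by the reference duration.
In this application, the attack detection device can preprocess the obtained HTTP packet stream data to obtain the input of each of the multiple behavior detection models, and process each input with the corresponding behavior detection model to obtain the corresponding initial probability value, thereby obtaining multiple initial probability values. The multiple behavior detection models respectively describe different stages of the EK's attack behavior trajectory.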
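Optionally, determining multiple initial probability values through multiple behavior detection models according to the HTTP packet stream data includes: selecting one behavior detection model from the multiple behavior detection models and performing the following operations for the selected model, until the operations have been performed for every one of the multiple behavior detection models:
determining, according to the HTTP packet stream data, the feature vector corresponding to the selected behavior detection model; and inputting the feature vector into the selected behavior detection model to obtain the initial probability value that it outputs.
In this application, the multiple behavior detection models respectively describe different stages of the EK attack behavior trajectory, that is, one behavior detection model describes one stage. Because the behavioral features of EK attack behavior differ between stages, the feature vectors input into the individual behavior detection models also differ; in other words, the feature vectors determined from the host's HTTP packet stream data differ from one behavior detection model to another.
In this application, the EK's attack behavior trajectory includes stages such as a redirection stage, an attack object screening stage, a vulnerability exploitation stage, and a malware download stage. On this basis, the multiple behavior detection models can describe at least any two of these four stages, that is, they include at least two models that respectively describe any two of the four stages. In other words, optionally, the multiple behavior detection models include at least two of the following models: a redirection detection model, an attack object screening detection model, a vulnerability exploitation detection model, and a malware download detection model.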
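The redirection detection model describes the redirection stage of the EK attack behavior trajectory, in which the EK attack redirects, for example, the web page that the user is browsing. The attack object screening detection model describes the attack object screening stage, in which the EK screens attack objects according to information carried in the HTTP packets transmitted by the host, such as the operating system and browser version. The vulnerability exploitation detection model describes the vulnerability exploitation stage, in which the EK analyzes the vulnerabilities present on the host and downloads a vulnerability file to the host. For example, an old version of the Flash plug-in on the host may contain a vulnerability, and the EK downloads a Flash vulnerability file to the host to compromise it. The malware download detection model describes the malware download stage, in which the EK downloads malware, such as a Trojan or ransomware, to the host.
In the embodiments of this application, the attack detection device can obtain, from the HTTP packet stream data, the one or more features included in the feature vector corresponding to each behavior detection model. The data in one HTTP packet includes multiple fields, each field represents one kind of information, and the features in a feature vector can each be obtained from these fields.
It should be noted that the behavioral features of EK attack behavior can be divided into common features and unique features. A common feature is shared by all of the behavior detection models or by some of them, and a unique feature belongs to a single behavior detection model only. In other words, the feature vector corresponding to each behavior detection model includes one or more features, some of which are common features and the rest of which are unique features.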
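For the redirection detection model: because it describes the redirection stage of the EK's attack behavior trajectory, the HTTP packets transmitted by the host may carry a Location field, which redirects the web page that the host is browsing. On this basis, the feature vector corresponding to the redirection detection model includes behavioral features of the redirection stage, such as the HTTP message code, the length of the uniform resource locator (URL) field, and whether a Location field is carried. The HTTP message code is a common feature of the redirection detection model and the vulnerability exploitation detection model, the length of the URL field is a common feature of all four models, and whether the HTTP packet stream data carries a Location field is a unique feature of the redirection detection model.
For the attack object screening detection model: because it describes the attack object screening stage of the EK's attack behavior trajectory, its feature vector includes behavioral features of that stage, such as the length of the URL field and the operating system type. Optionally, the operating system type is a unique feature of the attack object screening detection model.
For the vulnerability exploitation detection model: because it describes the vulnerability exploitation stage, in which the fields of the HTTP packets transmitted by the host may be altered (fields added, data tampered with, fields encrypted, and so on), its feature vector includes behavioral features of that stage, such as the HTTP message code, the length of the URL field, whether the URL field contains a Base64-encoded substring, and whether an X-Flash-Version field is carried. Optionally, the last two are unique features of the vulnerability exploitation detection model.
For the malware download detection model: because it describes the malware download stage of the EK's attack behavior trajectory, its feature vector includes behavioral features of that stage, such as the HTTP message code, the length of the URL field, and the Content-Type and Content-Length fields of the HTTP packets. It should be noted that HTTP packets include HTTP request packets and HTTP response packets, and HTTP response packets carry the Content-Type and Content-Length fields.
After determining the feature vector corresponding to a model, the attack detection device inputs the feature vector into that model and takes the probability value output by the model as one initial probability value. Performing this operation for all four models yields four initial probability values.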
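Optionally, before the multiple initial probability values are determined through the multiple behavior detection models according to the HTTP packet stream data, the method further includes: filtering the HTTP packet stream data according to a filtering rule set. In that case, determining the multiple initial probability values includes: determining the multiple initial probability values through the multiple behavior detection models according to the HTTP packet stream data remaining after filtering. In other words, the attack detection device can first filter out the data in HTTP packets that clearly do not need to be inspected, and then determine the multiple initial probability values through the multiple behavior detection models.
Optionally, the filtering rule set includes but is not limited to the following rules:
a first filtering rule, whose matching item is a reference type set containing one or more operating system types, where the reference type set includes operating system types whose probability of being attacked by the EK is less than a reference probability threshold, and whose action is: filter out. The first filtering rule is used to filter out the data in a first target HTTP packet, that is, an HTTP packet whose carried operating system type is included in the reference type set; and/or
a second filtering rule, whose matching item is one or more intranet addresses, and whose action is: filter out. The second filtering rule is used to filter out the data in a second target HTTP packet, that is, an HTTP packet whose carried destination address is an intranet address; and/or
a third filtering rule, whose matching item is a reference domain name set containing one or more domain names, where the reference domain name set includes domain names whose access frequency is greater than a frequency threshold, and whose action is: filter out. The third filtering rule is used to filter out the data in a third target HTTP packet, that is, an HTTP packet whose carried domain name is included in the reference domain name set.
In other words, after obtaining the HTTP packet stream data transmitted by the host within the reference time period, the attack detection device can filter out the data in HTTP packets that carry a low-risk operating system type, and/or a destination address that is an intranet address, and/or a domain name that is accessed with high frequency.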
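In this application, a single initial probability value alone is often not enough to determine whether EK attack behavior exists, because any one factor taken in isolation tends to correlate only weakly with the EK. Therefore, after determining the multiple initial probability values through the multiple behavior detection models, the attack detection device can process them together to determine one comprehensive probability value, which indicates the likelihood that the host is attacked by the EK while transmitting the first data stream.
Optionally, determining the comprehensive probability value according to the multiple initial probability values includes: determining multiple cross features according to the multiple initial probability values, where a cross feature is the product of two different initial probability values among the multiple initial probability values; generating a cross feature vector according to the multiple cross features; and inputting the cross feature vector into a correlation analysis model to obtain the comprehensive probability value output by the correlation analysis model, where the correlation analysis model is used to comprehensively analyze multiple different stages of the EK's attack behavior trajectory.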
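The cross features described above are pairwise products of initial probability values. A minimal Python sketch follows, assuming that "two different initial probability values" means the outputs of two different models (so every unordered pair is used once). The function name and the sample values are hypothetical, and the correlation analysis model that would consume the vector is not shown.

```python
from itertools import combinations
from typing import List, Sequence


def cross_feature_vector(initial_probs: Sequence[float]) -> List[float]:
    """Multiply every pair of initial probability values from two different models."""
    return [a * b for a, b in combinations(initial_probs, 2)]


# Four stage probabilities (E1..E4) yield C(4, 2) = 6 cross features.
features = cross_feature_vector([0.5, 0.25, 1.0, 0.0])
```

A product is large only when both stages look suspicious, which is one way to read why the application combines stages instead of relying on a single value.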
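Optionally, before the multiple cross features are determined according to the multiple initial probability values, the method further includes: performing vulnerability file detection and malware detection on the HTTP packet stream data to obtain a vulnerability file detection result and a malware detection result. In that case, determining the multiple cross features includes: determining the multiple cross features according to the multiple initial probability values, the vulnerability file detection result, and the malware detection result, where a cross feature is the product of two different items among the multiple initial probability values, the vulnerability file detection result, and the malware detection result.
Vulnerability file detection and malware detection on the HTTP packet stream data are performed by means of an intrusion prevention system (IPS). The IPS can analyze the fields, characters, and so on included in the HTTP packet stream data to obtain the vulnerability file detection result and the malware detection result.
Optionally, determining the multiple cross features according to the multiple initial probability values, the vulnerability file detection result, and the malware detection result includes: generating a probability matrix according to the multiple initial probability values, the vulnerability file detection result, and the malware detection result, where the probability matrix has X rows and X columns, X is the total number of initial probability values, vulnerability file detection results, and malware detection results, the X rows and the X columns each correspond to the multiple initial probability values, the vulnerability file detection result, and the malware detection result, and each element of the probability matrix is the product of the two items that intersect at that position; and selecting multiple elements from the probability matrix according to a cross feature selection strategy and using the selected elements as the multiple cross features. It should be noted that, in this application, the cross feature selection strategy is a strategy determined from experience so as to screen out redundant features.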
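The probability matrix and the selection step can be sketched as follows. This is a hedged illustration: the application says only that the selection strategy is chosen from experience to remove redundant features, so keeping the elements above the diagonal is an assumption made here, as is encoding the two IPS results as 0/1 values.

```python
from typing import List, Sequence


def probability_matrix(values: Sequence[float]) -> List[List[float]]:
    """Build the X-by-X matrix whose (i, j) element is values[i] * values[j]."""
    return [[vi * vj for vj in values] for vi in values]


def select_upper_triangle(matrix: List[List[float]]) -> List[float]:
    """Assumed selection strategy: keep only elements above the diagonal.

    The matrix is symmetric, so elements below the diagonal are redundant, and
    each diagonal element is a value multiplied by itself rather than by a
    different item.
    """
    size = len(matrix)
    return [matrix[i][j] for i in range(size) for j in range(i + 1, size)]


# E1..E4 plus the two IPS results (encoded here as 0/1, an assumption): X = 6.
values = [0.5, 0.25, 1.0, 0.5, 1.0, 0.0]
matrix = probability_matrix(values)
cross_features = select_upper_triangle(matrix)
```

With X = 6 this selection yields 15 cross features; a different experience-based strategy could keep fewer.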
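After the multiple cross features are selected according to the cross feature selection strategy, the attack detection device can generate the cross feature vector as described above and input it into the correlation analysis model, which outputs the comprehensive probability value. If the comprehensive probability value is greater than the preset probability threshold, the attack detection device determines that EK attack behavior exists while the host transmits the first data stream, that is, the host is attacked by the EK.
It should be noted that the multiple behavior detection models are determined in advance from training samples. That is, before the multiple initial probability values are determined through the multiple behavior detection models according to the HTTP packet stream data, the method further includes: obtaining multiple training samples and the sample label corresponding to each of the training samples, where a training sample includes data in one or more sample HTTP packets that belong to a second data stream, a sample label indicates whether the corresponding training sample is a positive training sample or a negative training sample, a positive training sample is HTTP packet stream data that is not attacked by the EK, and a negative training sample is HTTP packet stream data that is attacked by the EK; and training multiple initial detection models according to the multiple training samples and their sample labels to obtain the multiple behavior detection models, where the multiple initial detection models respectively correspond to different stages of the EK's attack behavior trajectory.
Optionally, obtaining multiple training samples includes: obtaining multiple pieces of sample HTTP packet stream data, where sample HTTP packet stream data is the data in the HTTP packets of the second data stream that fall within the reference duration before the current time; filtering each piece of sample HTTP packet stream data according to the filtering rule set; and determining the pieces of sample HTTP packet stream data remaining after filtering as the multiple training samples.
It should be noted that, in this application, the attack detection device can preprocess the obtained HTTP packets according to the definition of a data stream and the reference duration to obtain the multiple training samples.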
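Optionally, training multiple initial detection models according to the multiple training samples and their sample labels to obtain the multiple behavior detection models includes: selecting one initial detection model from the multiple initial detection models and performing the following operations for the selected model, until the operations have been performed for every one of the multiple initial detection models:
determining, according to the sample HTTP packets included in each of the multiple training samples, the sample feature set corresponding to the selected initial detection model, where the sample feature set includes multiple sample feature vectors in one-to-one correspondence with the multiple training samples; and inputting the multiple sample feature vectors into the selected initial detection model and training it so that its output for each sample feature vector is the sample label of the corresponding training sample, thereby obtaining one behavior detection model.
Optionally, after the multiple behavior detection models are obtained by training the multiple initial detection models, the method further includes: determining a sample cross feature set according to the multiple behavior detection models and the sample feature set corresponding to each of them, where the sample cross feature set includes multiple sample cross feature vectors in one-to-one correspondence with the multiple training samples; and inputting the multiple sample cross feature vectors into an initial analysis model and training it so that its output for each sample cross feature vector is the sample label of the corresponding training sample, thereby obtaining the correlation analysis model.
In this application, because the multiple behavior detection models respectively describe different stages of the EK's attack behavior trajectory, the sample feature vectors used to train the individual initial detection models also differ. The attack detection device can determine the sample feature set corresponding to each initial detection model according to the sample HTTP packets included in the multiple training samples and the behavioral features included in the feature vector corresponding to each behavior detection model.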
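Optionally, before the sample cross feature set is determined according to the multiple behavior detection models and the sample feature set corresponding to each of them, the method further includes: performing vulnerability file detection and malware detection on each of the multiple training samples to obtain the vulnerability file detection result and the malware detection result corresponding to each training sample. In that case, determining the sample cross feature set includes: determining the sample cross feature set according to the multiple behavior detection models, the sample feature set corresponding to each of them, and the vulnerability file detection result and malware detection result corresponding to each training sample. In other words, the attack detection device can perform vulnerability file detection and malware detection on each training sample by means of the IPS.
Optionally, determining the sample cross feature set according to the multiple behavior detection models, the sample feature set corresponding to each of them, and the vulnerability file detection result and malware detection result corresponding to each training sample includes: selecting one training sample from the multiple training samples and performing the following processing on the selected training sample, until every one of the multiple training samples has been processed:
inputting the sample feature vectors that correspond to the selected training sample in the sample feature sets of the multiple behavior detection models into the respective behavior detection models to obtain the sample probability value output by each model, thereby obtaining multiple sample probability values; determining multiple sample cross features according to the multiple sample probability values and the vulnerability file detection result and malware detection result corresponding to the selected training sample, where a sample cross feature is the product of two different items among the multiple sample probability values and the vulnerability file detection result and malware detection result corresponding to the selected training sample; and generating one sample cross feature vector according to the multiple sample cross features.
It should be noted that, for how the multiple sample cross features are determined according to the multiple sample probability values and the vulnerability file detection result and malware detection result corresponding to the selected training sample, reference may be made to the foregoing description of determining the multiple cross features. Details are not repeated here.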
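According to a second aspect, an attack behavior detection apparatus is provided. The apparatus has the function of implementing the behavior of the attack behavior detection method in the first aspect. The apparatus includes one or more modules, which are used to implement the attack behavior detection method provided in the first aspect.
In other words, this application provides an attack behavior detection apparatus. The apparatus includes:
a first obtaining module, configured to obtain HTTP packet stream data transmitted by a host within a reference time period, where the HTTP packet stream data includes data in one or more HTTP packets, the one or more HTTP packets belong to a first data stream, and the reference time period is the period that precedes the current time and extends back from it by a reference duration;
a first determining module, configured to determine multiple initial probability values through multiple behavior detection models according to the HTTP packet stream data, where the multiple behavior detection models respectively describe different stages of the EK's attack behavior trajectory, and an initial probability value is the probability value output by one of the multiple behavior detection models;
a second determining module, configured to determine a comprehensive probability value according to the multiple initial probability values, where the comprehensive probability value indicates the likelihood that the host is attacked by the EK while transmitting the first data stream; and
a third determining module, configured to determine, if the comprehensive probability value is greater than a preset probability threshold, that EK attack behavior exists while the host transmits the first data stream.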
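Optionally, the first determining module is specifically configured to:
select one behavior detection model from the multiple behavior detection models and perform the following operations for the selected model, until the operations have been performed for every one of the multiple behavior detection models:
determine, according to the HTTP packet stream data, the feature vector corresponding to the selected behavior detection model; and
input the feature vector into the selected behavior detection model to obtain the initial probability value that it outputs.
Optionally, the multiple behavior detection models include at least two of the following models: a redirection detection model, an attack object screening detection model, a vulnerability exploitation detection model, and a malware download detection model.
Optionally, the second determining module includes:
a first determining unit, configured to determine multiple cross features according to the multiple initial probability values, where a cross feature is the product of two different initial probability values among the multiple initial probability values;
a generating unit, configured to generate a cross feature vector according to the multiple cross features; and
a comprehensive analysis unit, configured to input the cross feature vector into a correlation analysis model to obtain the comprehensive probability value output by the correlation analysis model, where the correlation analysis model is used to comprehensively analyze multiple different stages of the EK's attack behavior trajectory.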
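Optionally, the second determining module further includes:
a second determining unit, configured to perform vulnerability file detection and malware detection on the HTTP packet stream data to obtain a vulnerability file detection result and a malware detection result;
and the first determining unit is specifically configured to:
determine the multiple cross features according to the multiple initial probability values, the vulnerability file detection result, and the malware detection result, where a cross feature is the product of two different items among the multiple initial probability values, the vulnerability file detection result, and the malware detection result.
Optionally, the first determining unit is specifically configured to:
generate a probability matrix according to the multiple initial probability values, the vulnerability file detection result, and the malware detection result, where the probability matrix has X rows and X columns, X is the total number of initial probability values, vulnerability file detection results, and malware detection results, the X rows and the X columns each correspond to the multiple initial probability values, the vulnerability file detection result, and the malware detection result, and each element of the probability matrix is the product of the two items that intersect at that position; and
select multiple elements from the probability matrix according to a cross feature selection strategy, and use the selected elements as the multiple cross features.
Optionally, the apparatus further includes:
a first filtering unit, configured to filter the HTTP packet stream data according to a filtering rule set;
and the first determining module is specifically configured to:
determine the multiple initial probability values through the multiple behavior detection models according to the HTTP packet stream data remaining after filtering.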
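Optionally, the filtering rule set includes but is not limited to the following rules:
a first filtering rule, whose matching item is a reference type set containing one or more operating system types, where the reference type set includes operating system types whose probability of being attacked by the EK is less than a reference probability threshold, and whose action is: filter out. The first filtering rule is used to filter out the data in a first target HTTP packet, that is, an HTTP packet whose carried operating system type is included in the reference type set; and/or
a second filtering rule, whose matching item is one or more intranet addresses, and whose action is: filter out. The second filtering rule is used to filter out the data in a second target HTTP packet, that is, an HTTP packet whose carried destination address is an intranet address; and/or
a third filtering rule, whose matching item is a reference domain name set containing one or more domain names, where the reference domain name set includes domain names whose access frequency is greater than a frequency threshold, and whose action is: filter out. The third filtering rule is used to filter out the data in a third target HTTP packet, that is, an HTTP packet whose carried domain name is included in the reference domain name set.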
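Optionally, the apparatus further includes:
a second obtaining module, configured to obtain multiple training samples and the sample label corresponding to each of the training samples, where a training sample includes data in one or more sample HTTP packets that belong to a second data stream, a sample label indicates whether the corresponding training sample is a positive training sample or a negative training sample, a positive training sample is HTTP packet stream data that is not attacked by the EK, and a negative training sample is HTTP packet stream data that is attacked by the EK; and
a first training module, configured to train multiple initial detection models according to the multiple training samples and their sample labels to obtain the multiple behavior detection models, where the multiple initial detection models respectively correspond to different stages of the EK's attack behavior trajectory.
Optionally, the second obtaining module includes:
an obtaining unit, configured to obtain multiple pieces of sample HTTP packet stream data, where sample HTTP packet stream data is the data in the HTTP packets of the second data stream that fall within the reference duration before the current time;
a second filtering unit, configured to filter each piece of sample HTTP packet stream data according to the filtering rule set; and
a third determining unit, configured to determine the pieces of sample HTTP packet stream data remaining after filtering as the multiple training samples.
Optionally, the first training module is specifically configured to:
select one initial detection model from the multiple initial detection models and perform the following operations for the selected model, until the operations have been performed for every one of the multiple initial detection models:
determine, according to the sample HTTP packets included in each of the multiple training samples, the sample feature set corresponding to the selected initial detection model, where the sample feature set includes multiple sample feature vectors in one-to-one correspondence with the multiple training samples; and
input the multiple sample feature vectors into the selected initial detection model and train it so that its output for each sample feature vector is the sample label of the corresponding training sample, thereby obtaining one behavior detection model.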
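Optionally, the apparatus further includes:
a second obtaining module, configured to obtain multiple training samples and the sample label corresponding to each of the training samples, where a training sample includes data in one or more sample HTTP packets that belong to a second data stream, a sample label indicates whether the corresponding training sample is a positive training sample or a negative training sample, a positive training sample is HTTP packet stream data that is not attacked by the EK, and a negative training sample is HTTP packet stream data that is attacked by the EK;
a first training module, configured to train multiple initial detection models according to the multiple training samples and their sample labels to obtain the multiple behavior detection models, where the multiple initial detection models respectively correspond to different stages of the EK's attack behavior trajectory;
a third determining module, configured to determine a sample cross feature set according to the multiple behavior detection models and the sample feature set corresponding to each of them, where the sample cross feature set includes multiple sample cross feature vectors in one-to-one correspondence with the multiple training samples; and
a second training module, configured to input the multiple sample cross feature vectors into an initial analysis model and train it so that its output for each sample cross feature vector is the sample label of the corresponding training sample, thereby obtaining the correlation analysis model.
Optionally, the apparatus further includes:
a fourth determining module, configured to perform vulnerability file detection and malware detection on each of the multiple training samples to obtain the vulnerability file detection result and the malware detection result corresponding to each training sample;
and the third determining module is configured to:
determine the sample cross feature set according to the multiple behavior detection models, the sample feature set corresponding to each of them, and the vulnerability file detection result and malware detection result corresponding to each training sample.
Optionally, the third determining module is specifically configured to:
select one training sample from the multiple training samples and perform the following processing on the selected training sample, until every one of the multiple training samples has been processed:
input the sample feature vectors that correspond to the selected training sample in the sample feature sets of the multiple behavior detection models into the respective behavior detection models to obtain the sample probability value output by each model, thereby obtaining multiple sample probability values;
determine multiple sample cross features according to the multiple sample probability values and the vulnerability file detection result and malware detection result corresponding to the selected training sample, where a sample cross feature is the product of two different items among the multiple sample probability values and the vulnerability file detection result and malware detection result corresponding to the selected training sample; and
generate one sample cross feature vector according to the multiple sample cross features.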
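According to a third aspect, an attack detection device is provided. The attack detection device includes a processor and a memory. The memory is configured to store a program for performing the attack behavior detection method provided in the first aspect, and to store the data involved in implementing that method. The processor is configured to execute the program stored in the memory. The attack detection device may further include a communication bus, which is used to establish a connection between the processor and the memory.
According to a fourth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores instructions that, when run on a computer, cause the computer to perform the attack behavior detection method according to the first aspect.
According to a fifth aspect, a computer program product containing instructions is provided. When the computer program product runs on a computer, the computer is caused to perform the attack behavior detection method according to the first aspect.
The technical effects obtained in the second, third, fourth, and fifth aspects are similar to those obtained by the corresponding technical means in the first aspect. Details are not repeated here.
The technical solutions provided in this application bring at least the following beneficial effects:
Because the EK's attack behavior trajectory includes multiple different stages, the solution obtains the host's HTTP packet stream data over a period of time and processes it through multiple behavior detection models to determine multiple initial probability values. Because the multiple behavior detection models respectively describe the multiple different stages, the solution can describe the EK's attack behavior trajectory completely. After the multiple initial probability values are determined, they can be processed together to obtain a comprehensive probability value. In other words, the solution can comprehensively analyze the behavior patterns of the EK attack at each stage and determine more accurately the probability that the host is attacked by the EK while transmitting the data stream, that is, detect EK attack behavior more accurately. The solution can therefore detect EK attack behavior quickly and accurately without heavily consuming the host's own resources. In addition, because the HTTP packet stream data obtained in the solution contains only regular data specified by the network protocol, the risk of infringing user privacy is very low compared with methods that obtain script code for parsing.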
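Brief Description of Drawings
FIG. 1 is a diagram of a system architecture involved in an attack behavior detection method according to an embodiment of this application;
FIG. 2 is a schematic structural diagram of an attack detection device according to an embodiment of this application;
FIG. 3 is a flowchart of an attack behavior detection method according to an embodiment of this application;
FIG. 4 is a flowchart of filtering HTTP packet stream data according to a filtering rule set according to an embodiment of this application;
FIG. 5 is a flowchart of another attack behavior detection method according to an embodiment of this application;
FIG. 6 is a flowchart of a method for determining multiple behavior detection models according to an embodiment of this application;
FIG. 7 is a flowchart of a method for determining a correlation analysis model according to an embodiment of this application;
FIG. 8 is a schematic structural diagram of an attack behavior detection apparatus according to an embodiment of this application.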
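Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the following further describes the implementations of this application in detail with reference to the accompanying drawings.
FIG. 1 is a diagram of a system architecture involved in an attack behavior detection method according to an embodiment of this application. Referring to FIG. 1, the system architecture includes a host 101, an HTTP proxy device 102, a firewall 103, and an attack detection device 104. The host 101 can communicate with the HTTP proxy device 102 over a wireless or wired connection. The HTTP proxy device 102 can communicate with the firewall 103 over a wireless or wired connection. The HTTP proxy device 102 can also communicate with the attack detection device 104 over a wireless or wired connection.
The host 101 is configured to transmit (send or receive) HTTP packet stream data. The HTTP proxy device 102 is configured to obtain information on behalf of the host 101: after the HTTP packet stream data sent by the host 101 reaches the HTTP proxy device 102, the HTTP proxy device 102 can obtain the corresponding information from the external network and return it to the host 101. The firewall 103 is configured to protect the host 101. The attack detection device 104 is configured to obtain, from the HTTP proxy device 102, the HTTP packet stream data transmitted by the host 101 within the reference time period, and to process it according to the technical solutions provided in the embodiments of this application to determine whether the host 101 is attacked by the EK. The attack detection device 104 is further configured to determine the multiple behavior detection models and the correlation analysis model provided in the solution and to deploy them on itself.
Optionally, the attack detection device 104 is any third-party device. For example, the attack detection device 104 is a cybersecurity intelligence system (CIS), referred to as a CIS device for short.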
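In some other embodiments, the attack detection device 104 is a bypass device of a forwarding device such as a firewall, a switch, or a router. In this scenario, the system architecture further includes the forwarding device, which forwards the packet stream data transmitted by the host 101. The attack detection device 104 obtains, from the forwarding device, the HTTP packet stream data transmitted by the host 101.
In some other embodiments, the attack detection device 104 uses a cloud deployment, that is, it is deployed on the Internet. The attack detection device 104 provides an EK attack behavior detection service to other devices that supply packet stream data. Such devices include but are not limited to hosts, forwarding devices such as firewalls, switches, and routers, and third-party servers. Optionally, a device that supplies packet stream data submits it to the attack detection device 104 through a website user interface (Web UI) and receives the detection result output by the attack detection device 104, for example, whether EK attack behavior exists in the submitted packet stream data.
Optionally, the multiple behavior detection models and the correlation analysis model in the embodiments of this application can also be trained from training samples by another computer device, in which case the trained models only need to be deployed on the attack detection device 104.
Optionally, in the embodiments of this application, the attack detection device 104 is further configured to send alarm information to the host 101 after determining that the host 101 is attacked by the EK. The alarm information indicates that the host 101 is attacked by the EK, so as to prompt the user or the host 101 to take countermeasures in time. Alternatively, the system architecture further includes a network management device. Both the host 101 and the attack detection device 104 can communicate with the network management device over a wireless or wired connection. After determining that the host 101 is attacked by the EK, the attack detection device 104 reports alarm information to the network management device, and the network management device can take countermeasures according to the reported alarm information, for example, forward the alarm information to the host 101.
In some other embodiments, the system architecture includes multiple hosts 101 that are located in one local area network and can all communicate with the HTTP proxy device 102 over a wireless or wired connection. In this scenario, the attack detection device 104 is configured to detect whether the multiple hosts 101 are attacked by the EK. In a scenario in which the system architecture further includes a forwarding device, the forwarding device can forward the HTTP packet stream data transmitted by the multiple hosts 101, and the attack detection device 104 is configured to detect whether each of the multiple hosts 101 is attacked by the EK.
Optionally, in the embodiments of this application, any host 101 is a desktop computer, a tablet computer, a notebook computer, a mobile phone, a smart TV, a smart speaker, or the like. This is not limited in the embodiments of this application.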
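Refer to FIG. 2, which is a schematic structural diagram of an attack detection device according to an embodiment of this application. Optionally, the attack detection device is the attack detection device 104 shown in FIG. 1. The attack detection device includes one or more processors 201, a communication bus 202, a memory 203, and one or more communication interfaces 204.
The processor 201 is a general-purpose central processing unit (CPU), a network processor (NP), a microprocessor, or one or more integrated circuits for implementing the solutions of this application, for example, an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. Optionally, the PLD is a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
The communication bus 202 is used to transfer information between the foregoing components. Optionally, the communication bus 202 is divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is used in the figure, but this does not mean that there is only one bus or one type of bus.
Optionally, the memory 203 is a read-only memory (ROM), a random access memory (RAM), an electrically erasable programmable read-only memory (EEPROM), an optical disc (including a compact disc read-only memory (CD-ROM), a compact disc, a laser disc, a digital versatile disc, a Blu-ray disc, and the like), a magnetic disk storage medium or another magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 203 exists independently and is connected to the processor 201 through the communication bus 202, or the memory 203 is integrated with the processor 201.
The communication interface 204 uses any transceiver-type apparatus to communicate with other devices or a communication network. The communication interface 204 includes a wired communication interface and, optionally, a wireless communication interface. The wired communication interface is, for example, an Ethernet interface. Optionally, the Ethernet interface is an optical interface, an electrical interface, or a combination thereof. The wireless communication interface is a wireless local area network (WLAN) interface, a cellular network communication interface, a combination thereof, or the like.
Optionally, in some embodiments, the attack detection device includes multiple processors, such as the processor 201 and the processor 205 shown in FIG. 2. Each of these processors is a single-core processor or a multi-core processor. Optionally, a processor here refers to one or more devices, circuits, and/or processing cores for processing data (such as computer program instructions).
In a specific implementation, as an embodiment, the attack detection device further includes an output device 206 and an input device 207. The output device 206 communicates with the processor 201 and can display information in multiple ways. For example, the output device 206 is a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector, or a printer. The input device 207 communicates with the processor 201 and can receive user input in multiple ways. For example, the input device 207 is a mouse, a keyboard, a touchscreen device, or a sensing device.
In some embodiments, the memory 203 is configured to store program code 210 for executing the solutions of this application, and the processor 201 can execute the program code 210 stored in the memory 203. The program code includes one or more software modules. The attack detection device can implement the attack behavior detection method provided in the embodiment of FIG. 3 below by means of the processor 201 and the program code 210 in the memory 203.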
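FIG. 3 is a flowchart of an attack behavior detection method according to an embodiment of this application. The method is applied to an attack detection device and is described by using an example in which the attack detection device is the CIS device in FIG. 1. Referring to FIG. 3, the method includes the following steps.
Step 301: Obtain the HTTP packet stream data transmitted by the host within the reference time period.
In the embodiments of this application, the EK attack behavior trajectory includes multiple different stages. While running, a host transmits HTTP packet stream (data stream) data, and the HTTP packet stream data transmitted by a host under EK attack carries some behavioral features of the EK attack. On this basis, the attack detection device can obtain the HTTP packet stream data transmitted by the host within the reference time period and then analyze it to detect whether the host is attacked by the EK, that is, whether EK attack behavior exists while the host transmits the data stream.
If a host is attacked by an EK, HTTP packets continue to be transmitted between the host and the EK's attacking device for some time. The solution therefore needs to obtain HTTP packet stream data over a period of time, that is, within the reference time period. The HTTP packet stream data includes data in one or more HTTP packets that belong to the first data stream, where the first data stream is the data stream formed by the HTTP packets transmitted between the host and some device, and the reference time period is the period that precedes the current time and extends back from it by the reference duration. Optionally, the reference duration is a preset duration, for example, 20 minutes or 30 minutes.
It should be noted that, in the embodiments of this application, a data stream is defined by a 2-tuple consisting of the source IP address and the destination IP address, that is, multiple HTTP packets with the same source IP address and the same destination IP address belong to the same data stream. Alternatively, a data stream is defined in any other way. This is not limited in the embodiments of this application.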
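The 2-tuple definition of a data stream and the reference time period can be illustrated with a short grouping routine. This Python sketch rests on assumptions: the `HttpPacket` record, its field names, and the use of epoch seconds for timestamps are hypothetical and only serve to show the grouping.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class HttpPacket:
    timestamp: float  # seconds since the epoch (assumed representation)
    src_ip: str
    dst_ip: str


FlowKey = Tuple[str, str]  # (source IP address, destination IP address)


def group_flows(
    packets: Iterable[HttpPacket], now: float, reference_duration: float
) -> Dict[FlowKey, List[HttpPacket]]:
    """Group packets seen in [now - reference_duration, now] by their 2-tuple."""
    start = now - reference_duration
    flows: Dict[FlowKey, List[HttpPacket]] = defaultdict(list)
    for packet in packets:
        if start <= packet.timestamp <= now:
            flows[(packet.src_ip, packet.dst_ip)].append(packet)
    return dict(flows)


# A 20-minute reference duration (1200 s), one of the values the text mentions.
packets = [
    HttpPacket(100.0, "10.0.0.5", "203.0.113.7"),    # too old, dropped
    HttpPacket(1500.0, "10.0.0.5", "203.0.113.7"),
    HttpPacket(1600.0, "10.0.0.5", "198.51.100.9"),
]
flows = group_flows(packets, now=2000.0, reference_duration=1200.0)
```

Each value in `flows` is then one piece of HTTP packet stream data that can be passed to the detection steps.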
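In addition, the HTTP packet stream data transmitted by the host is received by the HTTP proxy device. The attack detection device sends the HTTP proxy device a request to obtain the HTTP packet stream data, and the HTTP proxy device can, according to that request, send the HTTP packet stream data transmitted by the host within the reference time period to the attack detection device.
It should be noted that, in a scenario in which the system architecture includes only one host, the HTTP proxy device receives HTTP packet stream data from that host only, and the attack detection device can obtain the host's HTTP packet stream data directly from the HTTP proxy device. In a scenario in which the system architecture includes multiple hosts, the HTTP proxy device serves the multiple hosts and receives the HTTP packet stream data transmitted by each of them, that is, the HTTP packet stream data received by the HTTP proxy device comes from multiple hosts, and the attack detection device is configured to detect whether each of the multiple hosts is attacked by the EK. In this scenario, the attack detection device can obtain, in real time, the HTTP packets received by the HTTP proxy device and process them according to the definition of a data stream and the reference duration to obtain the HTTP packet stream data transmitted by each host within the reference time period. For each host's HTTP packet stream data, the attack detection device can use the attack detection method provided in the embodiments of this application to detect whether EK attack behavior exists while that host transmits the HTTP packet stream data.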
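Step 302: Determine multiple initial probability values through multiple behavior detection models according to the HTTP packet stream data.
In the embodiments of this application, the attack detection device can preprocess the obtained HTTP packet stream data to obtain the input of each of the multiple behavior detection models, and process each input with the corresponding behavior detection model to obtain the corresponding initial probability value, thereby obtaining multiple initial probability values. The multiple behavior detection models respectively describe different stages of the EK's attack behavior trajectory.
Optionally, the attack detection device can select one behavior detection model from the multiple behavior detection models and perform the following operations for the selected model, until the operations have been performed for every one of the multiple behavior detection models: determine, according to the HTTP packet stream data, the feature vector corresponding to the selected behavior detection model, and input the feature vector into the selected behavior detection model to obtain the initial probability value that it outputs.
In the embodiments of this application, the multiple behavior detection models respectively describe different stages of the EK attack behavior trajectory, that is, one behavior detection model describes one stage. Because the behavioral features of EK attack behavior differ between stages, the feature vectors input into the individual behavior detection models also differ; in other words, the feature vectors determined from the host's HTTP packet stream data differ from one behavior detection model to another.
Optionally, the multiple behavior detection models include at least two of the following models: a redirection detection model, an attack object screening detection model, a vulnerability exploitation detection model, and a malware download detection model.
In the embodiments of this application, the EK's attack behavior trajectory includes stages such as a redirection stage, an attack object screening stage, a vulnerability exploitation stage, and a malware download stage. On this basis, the multiple behavior detection models can describe at least any two of these four stages, that is, they include at least two models that respectively describe any two of the four stages.
The redirection detection model describes the redirection stage of the EK attack behavior trajectory, in which the EK attack redirects, for example, the web page that the user is browsing. The attack object screening detection model describes the attack object screening stage, in which the EK screens attack objects according to information carried in the HTTP packets transmitted by the host, such as the operating system and browser version. The vulnerability exploitation detection model describes the vulnerability exploitation stage, in which the EK analyzes the vulnerabilities present on the host and downloads a vulnerability file to the host. For example, an old version of the Flash plug-in on the host may contain a vulnerability, and the EK downloads a Flash vulnerability file to the host to compromise it. The malware download detection model describes the malware download stage, in which the EK downloads malware, such as a Trojan or ransomware, to the host.
Optionally, the behavior detection models in the embodiments of this application are learning models based on machine learning algorithms, for example, a learning model based on the random forest algorithm, a learning model based on the support vector machine (SVM) algorithm, or a learning model based on the XGBoost algorithm. The algorithms used by the individual behavior detection models may be the same or different. This is not limited in the embodiments of this application. For example, behavior detection model A and behavior detection model B are both learning models based on the random forest algorithm, or behavior detection model A is a learning model based on the random forest algorithm and behavior detection model B is a learning model based on the SVM algorithm.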
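In the embodiments of this application, the attack detection device can obtain, from the HTTP packet stream data, the one or more features included in the feature vector corresponding to each behavior detection model. The data in one HTTP packet includes multiple fields, each field represents one kind of information, and the features in a feature vector can each be obtained from these fields.
It should be noted that the behavioral features of EK attack behavior can be divided into common features and unique features. A common feature is shared by all of the behavior detection models or by some of them, and a unique feature belongs to a single behavior detection model only. In other words, the feature vector corresponding to each behavior detection model includes one or more features, some of which are common features and the rest of which are unique features.
For the redirection detection model: because it describes the redirection stage of the EK's attack behavior trajectory, the HTTP packets transmitted by the host may carry a Location field, which redirects the web page that the host is browsing. On this basis, the feature vector corresponding to the redirection detection model includes behavioral features of the redirection stage, such as the HTTP message code, the length of the URL field, and whether a Location field is carried. The HTTP message code is a common feature of the redirection detection model and the vulnerability exploitation detection model, the length of the URL field is a common feature of all four models, and whether the HTTP packet stream data carries a Location field is a unique feature of the redirection detection model.
Optionally, because the HTTP packet stream data includes data in one or more HTTP packets, the HTTP message code can be a message code vector determined from the message code (status code) carried in the data of each HTTP packet, where message codes are of types such as 200, 400, and 404. The length of the URL field is the total length, the average length, or the maximum length of all URL fields carried in the data of the one or more HTTP packets. In addition, as long as the data in one HTTP packet carries a Location field, the HTTP packet stream data is determined to carry a Location field. Optionally, '0' indicates that the field is not carried and '1' indicates that it is carried, or other character codes are used. This is not limited in the embodiments of this application.
The embodiments of this application can determine the message code vector by means of one-hot encoding. Assume that the message codes used in the embodiments include type 1, type 2, and type 3. The message code vector can be initialized to '0,0,0', that is, the total number of elements in the initialized message code vector equals the total number of message code types used, each element corresponds to one type, and each element is initialized to '0'. If the one or more message codes include type 1 and type 2 but not type 3, the message code vector is determined to be '1,1,0'.
For example, assume that the message codes used in the embodiments of this application include 400 and 302, and that the HTTP packet stream data includes data in three packets, HTTP1, HTTP2, and HTTP3. The message codes carried in the three packets are 200, 400, and 404 respectively, the average length of all URL fields carried in the data of the three packets is 60, and the data in the HTTP1 packet carries a Location field. The feature vector of the HTTP packet stream data is then [0,1,60,1], which corresponds to ordering the elements of the message code vector as 302 and then 400.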
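The worked example above can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the packet dictionaries and their keys (`status`, `url`, `location`) are hypothetical, the URL feature uses the average length (one of the three options the text allows), and the message code types are ordered as (302, 400), which is the ordering that yields the vector [0,1,60,1] given in the text.

```python
from typing import Dict, List, Sequence


def message_code_vector(observed: Sequence[int], code_types: Sequence[int]) -> List[int]:
    """Set one position per adopted code type to 1 if that code was observed."""
    seen = set(observed)
    return [1 if code in seen else 0 for code in code_types]


def redirect_feature_vector(packets: List[Dict], code_types: Sequence[int]) -> list:
    """Message code vector, then average URL length, then the Location flag."""
    codes = [p["status"] for p in packets if "status" in p]
    urls = [p["url"] for p in packets if "url" in p]
    avg_url_len = sum(len(u) for u in urls) / len(urls) if urls else 0.0
    # The flow carries a Location field as soon as any one packet carries it.
    has_location = 1 if any("location" in p for p in packets) else 0
    return message_code_vector(codes, code_types) + [avg_url_len, has_location]


# Three packets with codes 200, 400, 404 and URL lengths averaging 60.
packets = [
    {"status": 200, "url": "u" * 50, "location": "http://redirect.example/"},
    {"status": 400, "url": "u" * 60},
    {"status": 404, "url": "u" * 70},
]
vector = redirect_feature_vector(packets, code_types=(302, 400))
```

Codes that are not in the adopted set (200 and 404 here) simply leave the vector unchanged.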
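For the attack object screening detection model: because it describes the attack object screening stage of the EK's attack behavior trajectory, its feature vector includes behavioral features of that stage, such as the length of the URL field and the operating system type. Optionally, the operating system type is a unique feature of the attack object screening detection model.
For the vulnerability exploitation detection model: because it describes the vulnerability exploitation stage, in which the fields of the HTTP packets transmitted by the host may be altered (fields added, data tampered with, fields encrypted, and so on), its feature vector includes behavioral features of that stage, such as the HTTP message code, the length of the URL field, whether the URL field contains a Base64-encoded substring, and whether an X-Flash-Version field is carried. Optionally, the last two are unique features of the vulnerability exploitation detection model.
It should be noted that if the URL field contains a Base64-encoded substring, this indicates that the URL is encrypted, and the host may be under EK attack. As long as the URL field of one HTTP packet contains a Base64-encoded substring, the URL field of the HTTP packet stream data is determined to contain a Base64-encoded substring. Likewise, as long as one HTTP packet carries an X-Flash-Version field, the HTTP packet stream data is determined to contain an X-Flash-Version field.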
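The application does not say how a Base64-encoded substring is recognized. The following Python sketch shows one possible heuristic, offered purely as an assumption: a token of at least 16 characters drawn from the Base64 alphabet (slash excluded, so that URL path separators do not merge tokens) that decodes without error counts as a match. The minimum length and the function names are hypothetical, and long plain words can produce false positives.

```python
import base64
import binascii
import re
from typing import Iterable

# Assumed heuristic: a run of at least 16 Base64 characters, optional padding.
_CANDIDATE = re.compile(r"[A-Za-z0-9+_-]{16,}={0,2}")


def url_has_base64_substring(url: str) -> bool:
    """Return True if some long token in the URL decodes as valid Base64."""
    for token in _CANDIDATE.findall(url):
        stripped = token.rstrip("=")
        padded = stripped + "=" * (-len(stripped) % 4)
        try:
            base64.b64decode(padded, altchars=b"-_", validate=True)
            return True
        except (binascii.Error, ValueError):
            continue
    return False


def flow_has_base64_url(urls: Iterable[str]) -> int:
    """Flow-level flag: 1 if any packet's URL contains such a substring."""
    return 1 if any(url_has_base64_substring(u) for u in urls) else 0
```

The flow-level function mirrors the rule in the text: one matching packet is enough to set the feature for the whole HTTP packet stream data.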
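For the malware download detection model: because it describes the malware download stage of the EK's attack behavior trajectory, its feature vector includes behavioral features of that stage, such as the HTTP message code, the length of the URL field, and the Content-Type and Content-Length fields of the HTTP packets. It should be noted that HTTP packets include HTTP request packets and HTTP response packets, and HTTP response packets carry the Content-Type and Content-Length fields.
The features included in the feature vectors of the models described above can be extended as needed, for example, with whether certain special characters or special fields are present, the number of fields in the HTTP packet header, or the length of the fields in the packet header. Optionally, the number of fields in the HTTP packet header is the largest number of header fields among the one or more HTTP packets, and the length of the header fields is the average length of the fields in the HTTP packet header. Which features are common and which are unique can be specified in advance.
The following describes the feature vector obtained for each model, using an example in which the multiple behavior detection models include all four models. For ease of description, the features introduced above are denoted as follows: the HTTP message code as X1, the length of the URL field as X2, whether a Location field is carried as X3, whether the URL field contains a Base64-encoded substring as X4, whether an X-Flash-Version field is carried as X5, and the operating system type as X6. The feature vector corresponding to the redirection detection model is then [X1,X2,X3], that of the attack object screening detection model is [X1,X2,X6], that of the vulnerability exploitation detection model is [X1,X2,X4,X5], and that of the malware download detection model is [X1,X2].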
在确定一个模型对应的特征向量之后,攻击检测设备将该特征向量输入对应的模型,并将该模型输出的概率值作为一个初始概率值,对这四个模型均执行该操作,得到四个初始概率值,分别记为E1、E2、E3和E4。
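Optionally, before determining the plurality of initial probability values from the HTTP packet stream data of the host by using the plurality of behavior detection models, the attack detection device can also filter the obtained HTTP packet stream data according to a filtering rule set, to filter out the data in HTTP packets that clearly do not need to be detected. The attack detection device can then determine the plurality of initial probability values from the HTTP packet stream data that remains after filtering, by using the plurality of behavior detection models.
In the embodiments of this application, the filtering rule set includes but is not limited to the following rules:
A first filtering rule. The match item of the first filtering rule is: the operating system type is included in a reference type set, where the reference type set includes operating system types whose probability of being attacked by an EK is less than a reference probability threshold. The action of the first filtering rule is: filter out. The first filtering rule is used to filter out the data in first target HTTP packets, where a first target HTTP packet is an HTTP packet whose carried operating system type is included in the reference type set. And/or
A second filtering rule. The match item of the second filtering rule is: the destination address is an intranet address. The action of the second filtering rule is: filter out. The second filtering rule is used to filter out the data in second target HTTP packets, where a second target HTTP packet is an HTTP packet whose carried destination address is an intranet address. And/or
A third filtering rule. The match item of the third filtering rule is: the domain name is included in a reference domain name set, where the reference domain name set includes domain names whose access frequency is greater than a frequency threshold. The action of the third filtering rule is: filter out. The third filtering rule is used to filter out the data in third target HTTP packets, where a third target HTTP packet is an HTTP packet whose carried domain name is included in the reference domain name set.
In other words, after obtaining the HTTP packet stream data transmitted by the host within the reference time period, the attack detection device can filter out the data in HTTP packets whose carried operating system type is a low-risk operating system, and/or whose carried destination address is an intranet address, and/or whose carried domain name is a frequently accessed domain name.
It should be noted that the User-Agent field carried in the data of an HTTP packet contains the operating system type, and the attack detection device can obtain the carried operating system type from this field. Optionally, the rules in the filtering rule set are set based on experience or on statistical data. The filtering rule set can be extended dynamically; the principle is to reduce unnecessary data processing as much as possible without missing suspicious HTTP packets of an EK attack. In addition, if the filtering rule set includes a plurality of rules, when these rules are used to decide whether the data in an HTTP packet is to be filtered out, the attack detection device can determine a filtering order according to the respective importance of the rules, or can use any order as the filtering order.
For example, assume that the filtering rule set includes the first, second, and third filtering rules, applied in that order. Referring to FIG. 4, the attack detection device can first determine whether the operating system type carried in the data of an HTTP packet is included in the reference type set. If it is, the data in the HTTP packet is filtered out. If it is not, the device determines whether the destination address carried in the data of the HTTP packet is an intranet address. If it is, the data in the HTTP packet is filtered out. If it is not, the device determines whether the domain name carried in the data of the HTTP packet is included in the reference domain name set. If it is, the data in the HTTP packet is filtered out; if it is not, the data in the HTTP packet is retained.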
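The ordered rule chain described for FIG. 4 can be sketched in Python as below. The contents of `LOW_RISK_OS` and `POPULAR_DOMAINS`, the packet dictionary keys, and the use of `ipaddress`'s `is_private` property as an approximation of "intranet address" are all assumptions made for illustration; a deployment would define these sets and the intranet ranges itself.

```python
import ipaddress

LOW_RISK_OS = {"Linux", "Android"}       # hypothetical reference type set
POPULAR_DOMAINS = {"www.example.com"}    # hypothetical reference domain name set


def rule_low_risk_os(pkt) -> bool:
    """First rule: the carried operating system type is in the reference type set."""
    return pkt["os"] in LOW_RISK_OS


def rule_intranet_dst(pkt) -> bool:
    """Second rule: the destination address is an intranet (here: private) address."""
    return ipaddress.ip_address(pkt["dst_ip"]).is_private


def rule_popular_domain(pkt) -> bool:
    """Third rule: the carried domain name is in the reference domain name set."""
    return pkt["host"] in POPULAR_DOMAINS


# Filtering order: first, second, third rule.
FILTER_RULES = [rule_low_risk_os, rule_intranet_dst, rule_popular_domain]


def filter_stream(packets, rules=FILTER_RULES):
    """Keep only packets that match none of the rules; any() stops at the first match."""
    return [pkt for pkt in packets if not any(rule(pkt) for rule in rules)]


packets = [
    {"os": "Linux", "dst_ip": "8.8.8.8", "host": "kit.example.net"},
    {"os": "Windows", "dst_ip": "10.1.2.3", "host": "intranet.local"},
    {"os": "Windows", "dst_ip": "8.8.8.8", "host": "www.example.com"},
    {"os": "Windows", "dst_ip": "8.8.8.8", "host": "kit.example.net"},
]
kept = filter_stream(packets)
print(len(kept))  # 1
```

Keeping the rules in a list makes the filtering order explicit and lets the rule set be extended without touching `filter_stream`.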
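It should be noted that, after the HTTP packet stream data is filtered, the way in which the attack detection device determines the plurality of initial probability values from the filtered HTTP packet stream data by using the plurality of behavior detection models is as described above and is not repeated here.
Step 303: Determine a comprehensive probability value from the plurality of initial probability values.
In the embodiments of this application, a single initial probability value is often insufficient to determine whether EK attack behavior exists; that is, a single factor often has a low correlation with an EK. Therefore, after determining the plurality of initial probability values by using the plurality of behavior detection models, the attack detection device can process these values jointly to determine one comprehensive probability value, which indicates the likelihood that the host is attacked by an EK while transmitting the first data stream.
Optionally, the attack detection device can determine a plurality of cross features from the plurality of initial probability values, where a cross feature is the product of two different initial probability values. The attack detection device can generate a cross feature vector from the cross features and input it into a correlation analysis model to obtain the comprehensive probability value output by that model. The correlation analysis model is used to jointly analyze the different stages in the attack behavior track of an EK.
Optionally, the correlation analysis model is any trained machine learning model, for example, a logistic regression model or a random forest model.
For example, with the initial probability values E1, E2, E3, and E4, the cross features include F12, F13, F14, F23, F24, and F34 shown in Table 1 below. The cross feature vector is [F12,F13,F14,F23,F24,F34], or [F12,F13,F14,F23,F34,F24], and so on; that is, the elements of the cross feature vector may follow any defined order. The attack detection device can input the cross feature vector into the correlation analysis model to obtain the output comprehensive probability value P.
Table 1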
Figure PCTCN2020118782-appb-000001
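As a minimal sketch of the pairwise products described above, the following Python code builds the six cross features from four initial probability values. The numeric values are made up, and the helper name `cross_features` and the 1-based `(i, j)` keys standing for F12, F13, and so on are choices made here for illustration.

```python
from itertools import combinations


def cross_features(values):
    """Return {(i, j): values[i-1] * values[j-1]} for every pair i < j (1-based)."""
    return {
        (i + 1, j + 1): values[i] * values[j]
        for i, j in combinations(range(len(values)), 2)
    }


initial = [0.9, 0.8, 0.5, 0.2]            # E1..E4, illustrative numbers only
features = cross_features(initial)

# One possible fixed element order: F12, F13, F14, F23, F24, F34.
order = sorted(features)
cross_vector = [features[key] for key in order]
print(order)
```

Any element order works as long as the same order is used when the correlation analysis model is trained and when it is applied.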
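In addition to determining the comprehensive probability value through the correlation analysis model as described above, in some other embodiments the attack detection device can also obtain the comprehensive probability value by computing a weighted combination of the plurality of initial probability values.
Optionally, before determining the plurality of cross features from the plurality of initial probability values, the attack detection device can also perform vulnerability file detection and malware detection on the obtained HTTP packet stream data to obtain a vulnerability file detection result and a malware detection result.
Optionally, vulnerability file detection and malware detection on the HTTP packet stream data are performed by an IPS. The IPS can analyze the fields, characters, and so on in the HTTP packet stream data to obtain the vulnerability file detection result and the malware detection result. Each result is 0 or 1: a vulnerability file detection result of 0 indicates that no vulnerability file was downloaded and 1 indicates that a vulnerability file was downloaded; a malware detection result of 0 indicates that no malware was downloaded and 1 indicates that malware was downloaded. For ease of description, the vulnerability file detection result and the malware detection result are denoted E5 and E6 below.
After obtaining the plurality of initial probability values, the vulnerability file detection result, and the malware detection result, the attack detection device can determine the plurality of cross features from all of them. In this case, a cross feature is the product of two different values among the plurality of initial probability values, the vulnerability file detection result, and the malware detection result.
For example, assume that the initial probability values are E1, E2, E3, and E4, and the vulnerability file detection result and the malware detection result are E5 and E6. The cross features then include F12, F13, F14, F15, F16, F23, F24, F25, F26, F34, F35, F36, F45, F46, and F56 shown in Table 2 below, 15 cross features in total.
Table 2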
Figure PCTCN2020118782-appb-000002
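If every product of two different values among the initial probability values, the vulnerability file detection result, and the malware detection result were used as a cross feature, the number of cross features could be large, and some of them would be redundant: their contribution to detection accuracy is negligible, and removing them reduces the amount of computation. Accordingly, the attack detection device can generate a probability matrix from the initial probability values, the vulnerability file detection result, and the malware detection result, then select a plurality of elements from the probability matrix according to a cross feature selection policy and use the selected elements as the cross features. The probability matrix has X rows and X columns, where X is the total number of initial probability values, vulnerability file detection results, and malware detection results; the X rows and the X columns each correspond to the initial probability values, the vulnerability file detection result, and the malware detection result; and each element of the matrix is the product of the two values that intersect at that position.
Optionally, in the embodiments of this application, the cross feature selection policy is a policy determined from experience in order to remove redundant features.
For example, assume that the initial probability values are E1, E2, E3, and E4, output respectively by the redirection detection model, the attack object screening detection model, the vulnerability exploitation detection model, and the malware download detection model described above, and that the vulnerability file detection result and the malware detection result are E5 and E6. The generated probability matrix D is then as shown in Table 3 below:
Table 3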
Figure PCTCN2020118782-appb-000003
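The probability matrix D has 6 rows and 6 columns, that is, X equals 6. Elements marked x in D are redundant features, which include features obtained by multiplying a value by itself and features whose contribution to detection accuracy is determined from experience to be negligible. Elements F1 to F14 are the cross features selected according to the cross feature selection policy.
After selecting the cross features according to the cross feature selection policy, the attack detection device can generate the cross feature vector as described above, input it into the correlation analysis model, and obtain the output comprehensive probability value. For example, the cross feature vector is [F13,F14,…,F56].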
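The probability matrix and the selection step can be sketched as follows. The source states only that the selection policy is set from experience, so the policy used here is an assumption: drop the diagonal (a value multiplied by itself), keep one of each mirrored pair, and drop any pair listed in a hypothetical `excluded` set. The input numbers are made up.

```python
def probability_matrix(values):
    """X-by-X matrix whose element (r, c) is values[r] * values[c]."""
    n = len(values)
    return [[values[r] * values[c] for c in range(n)] for r in range(n)]


def select_cross_features(matrix, excluded=frozenset()):
    """Keep upper-triangle elements (row < column) whose 1-based pair is not excluded."""
    n = len(matrix)
    return {
        (r + 1, c + 1): matrix[r][c]
        for r in range(n)
        for c in range(r + 1, n)
        if (r + 1, c + 1) not in excluded
    }


values = [0.9, 0.8, 0.5, 0.2, 1, 0]       # E1..E4 plus E5, E6 (illustrative)
D = probability_matrix(values)

# Hypothetical policy: treat the (1, 2) product as redundant, leaving 14 features.
selected = select_cross_features(D, excluded={(1, 2)})
cross_vector = [selected[key] for key in sorted(selected)]
print(len(D), len(D[0]), len(cross_vector))  # 6 6 14
```

Because E5 and E6 are 0 or 1, a product such as E1 × E5 is either 0 or E1 itself, so these cross features act as gates on the stage probabilities.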
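Step 304: If the comprehensive probability value is greater than a preset probability threshold, determine that EK attack behavior exists while the host transmits the first data stream.
In the embodiments of this application, if the comprehensive probability value is greater than the preset probability threshold, the attack detection device determines that EK attack behavior exists while the host transmits the first data stream, that is, the host is attacked by an EK.
For example, assume that the preset probability threshold is 90%. If the comprehensive probability value is 98%, the attack detection device determines that EK attack behavior exists while the host transmits the first data stream, that is, the host is attacked by an EK. If the comprehensive probability value is 60%, the attack detection device determines that no EK attack behavior exists while the host transmits the first data stream, that is, the host is not attacked by an EK within the reference time period.
FIG. 5 is a flowchart of another attack behavior detection method according to an embodiment of this application. Referring to FIG. 5, the attack detection device can obtain the HTTP packet stream data transmitted by the host within the reference time period and, according to the filtering rule set, filter out the data in HTTP packets that clearly do not need to be detected. The attack detection device inputs the HTTP packet stream data remaining after filtering into the redirection detection model, the attack object screening detection model, the vulnerability exploitation detection model, and the malware download detection model, and each of the four models processes the HTTP packet stream data, yielding the initial probability values E1, E2, E3, and E4. At the same time, the attack detection device can input the HTTP packet stream data remaining after filtering into the IPS, which yields the vulnerability file detection result E5 and the malware detection result E6. The attack detection device then inputs the four initial probability values, the vulnerability file detection result, and the malware detection result into the correlation analysis model, and determines the detection result from the comprehensive probability value P output by the correlation analysis model. If P is greater than the preset probability threshold, the detection result is that the host is attacked by an EK within the reference time period; if P does not exceed the preset probability threshold, the detection result is that the host is not attacked by an EK within the reference time period.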
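The flow of FIG. 5 after filtering can be summarized in a small Python sketch. Every callable here is a stand-in so that the example runs: the stage models, the IPS check, and the correlation analysis model are replaced by fixed-value functions, and the name `detect_ek` and its parameters are hypothetical.

```python
def detect_ek(stream, stage_models, ips_check, correlation_model, threshold=0.9):
    """Return (P, attacked) for one filtered HTTP packet stream."""
    initial = [model(stream) for model in stage_models]   # E1..E4
    e5, e6 = ips_check(stream)                            # 0/1 detection results
    p = correlation_model(initial + [e5, e6])             # comprehensive value P
    return p, p > threshold                               # strictly greater than


# Stand-ins only; real models would be trained as the description explains.
stage_models = [lambda s: 0.95, lambda s: 0.90, lambda s: 0.97, lambda s: 0.99]
ips_check = lambda s: (1, 1)
high = lambda inputs: 0.98
low = lambda inputs: 0.60

print(detect_ek([], stage_models, ips_check, high))  # (0.98, True)
print(detect_ek([], stage_models, ips_check, low))   # (0.6, False)
```

With the 90% threshold from the example, a comprehensive probability value of exactly 0.9 does not exceed the threshold and is reported as not attacked.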
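Optionally, in the embodiments of this application, after determining that EK attack behavior exists while the host transmits the first data stream, the attack detection device can report the detection result to a network management device, and the network management device can take countermeasures according to the detection result.
The foregoing describes how the attack detection device uses the plurality of behavior detection models to determine whether EK attack behavior exists while the host transmits the first data stream. Optionally, the plurality of behavior detection models are determined in advance from training samples. The following describes a method provided in the embodiments of this application for determining the plurality of behavior detection models. The method is applied to the attack detection device or to another computer device; the description below uses the attack detection device as an example. That is, before the plurality of initial probability values are determined from the HTTP packet stream data by using the plurality of behavior detection models, referring to FIG. 6, the attack behavior detection method further includes step 401 and step 402.
Step 401: Obtain a plurality of training samples and a sample label corresponding to each of the training samples.
In the embodiments of this application, the attack detection device can obtain, from an HTTP proxy device, the HTTP packets transmitted by the sample hosts in the network, and thereby determine the plurality of training samples. A training sample includes the data in one or more sample HTTP packets that belong to a second data stream. The sample label indicates whether the corresponding training sample is a positive or a negative training sample: a positive training sample is HTTP packet stream data that has not been attacked by an EK, and a negative training sample is HTTP packet stream data that has been attacked by an EK. It should be noted that different training samples may belong to the same second data stream or to different second data streams.
Optionally, the attack detection device can obtain a plurality of pieces of sample HTTP packet stream data, where sample HTTP packet stream data is the data in the HTTP packets of the second data stream that fall within the reference duration before the current time. The attack detection device can then filter each piece of sample HTTP packet stream data according to the filtering rule set, and determine the sample HTTP packet stream data remaining after filtering as the plurality of training samples.
In the embodiments of this application, the attack detection device can preprocess the obtained HTTP packets according to the definition of a data stream and the reference duration, to obtain the plurality of training samples.
For example, preprocessing the obtained HTTP packets can yield the following event list, in which the sample HTTP packet stream data in each row of Table 4 is one training sample.
Table 4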
Reference duration   Source IP address   Destination IP address   Sample HTTP packet stream data
T1                   10.0.xx.xx          10.xx.xx.xx              [HTTP1, HTTP2, …, HTTPn]
T1                   10.1.xx.xx          10.xx.xx.xx              [HTTP1, HTTP2, …, HTTPm]
T1
T1                   10.xx.xx.xx         10.xx.xx.xx              [HTTP1, HTTP2, …, HTTPk]
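It should be noted that T1 in Table 4 is the reference duration and the data stream is defined by a 2-tuple (in Table 4, the source IP address and the destination IP address). Each piece of sample HTTP packet stream data includes the data in one or more sample HTTP packets, which may be arranged in the order in which they were transmitted. The attack detection device can filter the data in the one or more sample HTTP packets of each piece of sample HTTP packet stream data according to the filtering rule set described above, and use the filtered sample HTTP packet stream data as the plurality of training samples.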
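The preprocessing that produces the rows of Table 4 can be sketched in Python as below, assuming the 2-tuple is (source IP address, destination IP address) as the table columns suggest. The packet dictionary keys, the numeric timestamps, and the function name `build_samples` are hypothetical.

```python
from collections import defaultdict


def build_samples(packets, now, reference_duration):
    """Group packets from the last `reference_duration` time units by (src, dst).

    Returns {(src_ip, dst_ip): [packets in transmission order]}.
    """
    flows = defaultdict(list)
    for pkt in packets:
        if now - reference_duration <= pkt["ts"] <= now:
            flows[(pkt["src_ip"], pkt["dst_ip"])].append(pkt)
    return {
        key: sorted(group, key=lambda p: p["ts"])
        for key, group in flows.items()
    }


packets = [
    {"ts": 95, "src_ip": "10.0.0.1", "dst_ip": "10.9.9.9", "name": "HTTP2"},
    {"ts": 91, "src_ip": "10.0.0.1", "dst_ip": "10.9.9.9", "name": "HTTP1"},
    {"ts": 20, "src_ip": "10.0.0.1", "dst_ip": "10.9.9.9", "name": "old"},
    {"ts": 99, "src_ip": "10.1.0.2", "dst_ip": "10.9.9.9", "name": "HTTP1"},
]
samples = build_samples(packets, now=100, reference_duration=30)
for key, group in samples.items():
    print(key, [p["name"] for p in group])
```

Each dictionary entry corresponds to one row of the event list; the filtering rule set would then be applied to each group before it is used as a training sample.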
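In addition, the attack detection device can determine, according to the actual situation, whether each training sample is a positive or a negative training sample, and thereby determine the sample label of each training sample. The sample label of a positive training sample may be '1' and that of a negative training sample may be '0'. Negative training samples may be HTTP packet stream data known to have been attacked by an EK, including real data and/or simulated data, where simulated data is HTTP packet stream data generated by simulating EK attack behavior.
Step 402: Train a plurality of initial detection models according to the plurality of training samples and the sample label corresponding to each training sample, to obtain the plurality of behavior detection models.
In the embodiments of this application, after obtaining the plurality of training samples and the sample label of each training sample, the attack detection device can train each of the plurality of initial detection models separately to obtain the plurality of behavior detection models. The initial detection models correspond to different stages in the attack behavior track of an EK; that is, they are selected according to the behavior features of the different stages in that track.
The attack detection device can select one initial detection model from the plurality of initial detection models and perform the following operations for the selected model, until the operations have been performed for every initial detection model: determine, from the sample HTTP packets included in each of the training samples, the sample feature set corresponding to the selected initial detection model, where the sample feature set includes a plurality of sample feature vectors in one-to-one correspondence with the training samples; and input the sample feature vectors into the selected initial detection model to train it, so that its outputs are the sample labels of the corresponding training samples, thereby obtaining one behavior detection model.
In the embodiments of this application, because the behavior detection models describe different stages in the attack behavior track of an EK, the sample feature vectors used to train the individual initial detection models also differ. The attack detection device can determine the sample feature set of each initial detection model from the sample HTTP packets included in the training samples and from the behavior features included in the feature vector of the corresponding behavior detection model.
For example, again assume that the behavior detection models include the redirection detection model, the attack object screening detection model, the vulnerability exploitation detection model, and the malware download detection model, where the feature vector of the redirection detection model is [X1,X2,X3], that of the attack object screening detection model may be [X1,X2,X6], that of the vulnerability exploitation detection model may be [X1,X2,X4,X5], and that of the malware download detection model may be [X1,X2]. Taking the redirection detection model as an example, the sample feature set of its initial detection model may be: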
Figure PCTCN2020118782-appb-000004
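Here, n is the total number of sample feature vectors in the sample feature set.
After determining the sample feature set of the selected initial detection model, the attack detection device can input the sample feature vectors in that set into the selected initial detection model and train it so that its outputs are the sample labels of the corresponding training samples. In other words, in the embodiments of this application, training each of the initial detection models is a supervised learning process.
The correlation analysis model used in the attack detection described above is also determined in advance from training samples. That is, after the plurality of behavior detection models are obtained, referring to FIG. 7, the attack behavior detection method further includes step 403 and step 404.
Step 403: Determine a sample cross feature set according to the plurality of behavior detection models and the sample feature set corresponding to each behavior detection model.
In the embodiments of this application, after determining the plurality of behavior detection models, the attack detection device can select one training sample from the plurality of training samples and process it as follows, until every training sample has been processed: input, into each behavior detection model, the sample feature vector that corresponds to the selected training sample in that model's sample feature set, and obtain the sample probability value output by each model, yielding a plurality of sample probability values; determine a plurality of sample cross features from the sample probability values, where a sample cross feature is the product of two different sample probability values; and generate one sample cross feature vector from the sample cross features.
After this processing is complete, the attack detection device can determine the sample cross feature set from the sample cross feature vectors, which are in one-to-one correspondence with the training samples; that is, the sample cross feature set includes a plurality of sample cross feature vectors in one-to-one correspondence with the training samples.
Optionally, in the embodiments of this application, the attack detection device can also perform vulnerability file detection and malware detection on each training sample, to obtain the vulnerability file detection result and the malware detection result corresponding to each training sample. The attack detection device can then determine the sample cross feature set according to the plurality of behavior detection models, the sample feature set corresponding to each behavior detection model, and the vulnerability file detection result and malware detection result corresponding to each training sample.
It should be noted that, in the embodiments of this application, the attack detection device can use an IPS to perform vulnerability file detection and malware detection on each training sample.
In the embodiments of this application, the attack detection device can select one training sample from the plurality of training samples and process it as follows, until every training sample has been processed: input, into each behavior detection model, the sample feature vector that corresponds to the selected training sample in that model's sample feature set, and obtain the sample probability value output by each model, yielding a plurality of sample probability values; and determine a plurality of sample cross features from the sample probability values and from the vulnerability file detection result and malware detection result of the selected training sample. In this case, a sample cross feature is the product of two different values among the sample probability values, the vulnerability file detection result, and the malware detection result of the selected training sample. One sample cross feature vector is then generated from the sample cross features.
It should be noted that the way in which the sample cross features are determined from the sample probability values and from the vulnerability file detection result and malware detection result of the selected training sample is as described above for determining the cross features and is not repeated here.
Step 404: Input the sample cross feature vectors into an initial analysis model and train the initial analysis model so that its outputs are the sample labels of the corresponding training samples, thereby obtaining the correlation analysis model.
In the embodiments of this application, after determining the sample cross feature vectors, the attack detection device can input them into the initial analysis model and train it so that its outputs are the sample labels of the corresponding training samples, thereby obtaining the correlation analysis model. In other words, training the correlation analysis model in the embodiments of this application is a supervised learning process.
Optionally, the correlation analysis model is a model determined by any machine learning algorithm; this is not limited in the embodiments of this application. If the correlation analysis model is a logistic regression model, training also yields a weight for each sample cross feature in the sample cross feature vector. The weight characterizes the importance of each cross feature, so that if the host is later determined by the correlation analysis model to be attacked by an EK, the alarm information can carry the reason why the host was determined to be attacked.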
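To illustrate the logistic regression option and the per-feature weights it yields, the following dependency-free Python sketch trains a small logistic regression by batch gradient descent on made-up sample cross feature vectors. Everything here is an assumption for illustration: the three feature names, the toy data, the hyperparameters, and the choice to label attacked streams as 1 so that the output reads as an attack probability (the embodiment's own label convention may differ). A production system would more likely use an established machine learning library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights w and bias b by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(w, b, x):
    """Comprehensive probability value for one cross feature vector."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


names = ["F12", "F13", "F23"]                       # illustrative cross features
X = [[0.81, 0.72, 0.64], [0.70, 0.63, 0.56],        # streams assumed attacked
     [0.02, 0.03, 0.01], [0.05, 0.01, 0.04]]        # streams assumed clean
y = [1, 1, 0, 0]                                    # 1 = attacked (assumption)

w, b = train_logistic(X, y)
ranked = sorted(zip(names, w), key=lambda item: -abs(item[1]))
print(ranked)                    # features ordered by weight magnitude
print(predict(w, b, X[0]) > 0.5, predict(w, b, X[2]) > 0.5)
```

The `ranked` list is the kind of information that could be attached to an alarm to indicate which stage combinations contributed most to the decision.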
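In summary, in the embodiments of this application, because the attack behavior track of an EK includes a plurality of different stages, this solution obtains the HTTP packet stream data of a host within a time period and processes it with a plurality of behavior detection models to determine a plurality of initial probability values; and because the behavior detection models respectively describe these different stages, this solution can fully characterize the attack behavior track of an EK. After the initial probability values are determined, they can be processed jointly to obtain a comprehensive probability value; that is, this solution jointly analyzes the behavior patterns of the individual stages of an EK attack and more accurately determines the probability that the host is attacked by an EK while transmitting the data stream, and therefore detects EK attack behavior more accurately. This solution can thus detect EK attack behavior quickly and accurately without heavily consuming the resources of the host itself. In addition, because the HTTP packet stream data obtained in this solution contains only the regular data specified by the network protocol, the risk of infringing user privacy is very low compared with methods that obtain script code for parsing.
FIG. 8 is a schematic structural diagram of an attack behavior detection apparatus according to an embodiment of this application. The attack behavior detection apparatus 800 may be implemented by software, hardware, or a combination of the two as part or all of an attack detection device, which may be the attack detection device shown in FIG. 1. Referring to FIG. 8, the apparatus includes a first obtaining module 801, a first determining module 802, a second determining module 803, and a third determining module 804.
The first obtaining module 801 is configured to obtain HTTP packet stream data transmitted by a host within a reference time period, where the HTTP packet stream data includes data in one or more HTTP packets, the one or more HTTP packets belong to a first data stream, and the reference time period is a time period that is before the current time and spans the reference duration back from the current time.
The first determining module 802 is configured to determine a plurality of initial probability values from the HTTP packet stream data by using a plurality of behavior detection models, where the behavior detection models respectively describe different stages in the attack behavior track of an EK, and an initial probability value is the probability value output by one of the behavior detection models.
The second determining module 803 is configured to determine a comprehensive probability value from the plurality of initial probability values, where the comprehensive probability value indicates the likelihood that the host is attacked by an EK while transmitting the first data stream.
The third determining module 804 is configured to determine, if the comprehensive probability value is greater than a preset probability threshold, that EK attack behavior exists while the host transmits the first data stream.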
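Optionally, the first determining module 802 is specifically configured to:
select one behavior detection model from the plurality of behavior detection models and perform the following operations for the selected behavior detection model, until the operations have been performed for every behavior detection model:
determine, from the HTTP packet stream data, the feature vector corresponding to the selected behavior detection model; and
input the feature vector into the selected behavior detection model to obtain the initial probability value output by the selected behavior detection model.
Optionally, the plurality of behavior detection models include at least two of the following models: a redirection detection model, an attack object screening detection model, a vulnerability exploitation detection model, and a malware download detection model.
Optionally, the second determining module 803 includes:
a first determining unit, configured to determine a plurality of cross features from the plurality of initial probability values, where a cross feature is the product of two different initial probability values;
a generating unit, configured to generate a cross feature vector from the plurality of cross features; and
a comprehensive analysis unit, configured to input the cross feature vector into a correlation analysis model to obtain the comprehensive probability value output by the correlation analysis model, where the correlation analysis model is used to jointly analyze the different stages in the attack behavior track of an EK.
Optionally, the second determining module 803 further includes:
a second determining unit, configured to perform vulnerability file detection and malware detection on the HTTP packet stream data to obtain a vulnerability file detection result and a malware detection result.
The first determining unit is specifically configured to:
determine the plurality of cross features from the plurality of initial probability values, the vulnerability file detection result, and the malware detection result, where a cross feature is the product of two different values among the initial probability values, the vulnerability file detection result, and the malware detection result.
Optionally, the first determining unit is specifically configured to:
generate a probability matrix from the plurality of initial probability values, the vulnerability file detection result, and the malware detection result, where the probability matrix has X rows and X columns, X is the total number of initial probability values, vulnerability file detection results, and malware detection results, the X rows and the X columns each correspond to the initial probability values, the vulnerability file detection result, and the malware detection result, and each element of the probability matrix is the product of the two values that intersect at that position; and
select a plurality of elements from the probability matrix according to a cross feature selection policy, and use the selected elements as the plurality of cross features.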
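Optionally, the apparatus 800 further includes:
a first filtering unit, configured to filter the HTTP packet stream data according to a filtering rule set.
The first determining module is specifically configured to:
determine the plurality of initial probability values from the HTTP packet stream data remaining after filtering, by using the plurality of behavior detection models.
Optionally, the filtering rule set includes but is not limited to the following rules:
a first filtering rule, where the match item of the first filtering rule is a reference type set containing one or more operating system types, the reference type set includes operating system types whose probability of being attacked by an EK is less than a reference probability threshold, the action of the first filtering rule is filter out, the first filtering rule is used to filter out the data in first target HTTP packets, and a first target HTTP packet is an HTTP packet whose carried operating system type is included in the reference type set; and/or
a second filtering rule, where the match item of the second filtering rule is one or more intranet addresses, the action of the second filtering rule is filter out, the second filtering rule is used to filter out the data in second target HTTP packets, and a second target HTTP packet is an HTTP packet whose carried destination address is an intranet address; and/or
a third filtering rule, where the match item of the third filtering rule is a reference domain name set containing one or more domain names, the reference domain name set includes domain names whose access frequency is greater than a frequency threshold, the action of the third filtering rule is filter out, the third filtering rule is used to filter out the data in third target HTTP packets, and a third target HTTP packet is an HTTP packet whose carried domain name is included in the reference domain name set.
Optionally, the apparatus 800 further includes:
a second obtaining module, configured to obtain a plurality of training samples and a sample label corresponding to each training sample, where a training sample includes the data in one or more sample HTTP packets that belong to a second data stream, the sample label indicates whether the corresponding training sample is a positive or a negative training sample, a positive training sample is HTTP packet stream data that has not been attacked by an EK, and a negative training sample is HTTP packet stream data that has been attacked by an EK; and
a first training module, configured to train a plurality of initial detection models according to the plurality of training samples and the sample label corresponding to each training sample, to obtain the plurality of behavior detection models, where the initial detection models correspond to different stages in the attack behavior track of an EK.
Optionally, the second obtaining module includes:
an obtaining unit, configured to obtain a plurality of pieces of sample HTTP packet stream data, where sample HTTP packet stream data is the data in the HTTP packets of the second data stream that fall within the reference duration before the current time;
a second filtering unit, configured to filter each piece of sample HTTP packet stream data according to the filtering rule set; and
a third determining unit, configured to determine the sample HTTP packet stream data remaining after filtering as the plurality of training samples.
Optionally, the first training module is specifically configured to:
select one initial detection model from the plurality of initial detection models and perform the following operations for the selected initial detection model, until the operations have been performed for every initial detection model:
determine, from the sample HTTP packets included in each of the training samples, the sample feature set corresponding to the selected initial detection model, where the sample feature set includes a plurality of sample feature vectors in one-to-one correspondence with the training samples; and
input the sample feature vectors into the selected initial detection model and train it so that its outputs are the sample labels of the corresponding training samples, thereby obtaining one behavior detection model.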
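Optionally, the apparatus 800 further includes:
a second obtaining module, configured to obtain a plurality of training samples and a sample label corresponding to each training sample, where a training sample includes the data in one or more sample HTTP packets that belong to a second data stream, the sample label indicates whether the corresponding training sample is a positive or a negative training sample, a positive training sample is HTTP packet stream data that has not been attacked by an EK, and a negative training sample is HTTP packet stream data that has been attacked by an EK;
a first training module, configured to train a plurality of initial detection models according to the plurality of training samples and the sample label corresponding to each training sample, to obtain the plurality of behavior detection models, where the initial detection models correspond to different stages in the attack behavior track of an EK;
a third determining module, configured to determine a sample cross feature set according to the plurality of behavior detection models and the sample feature set corresponding to each behavior detection model, where the sample cross feature set includes a plurality of sample cross feature vectors in one-to-one correspondence with the training samples; and
a second training module, configured to input the sample cross feature vectors into an initial analysis model and train the initial analysis model so that its outputs are the sample labels of the corresponding training samples, thereby obtaining the correlation analysis model.
Optionally, the apparatus 800 further includes:
a fourth determining module, configured to perform vulnerability file detection and malware detection on each training sample, to obtain the vulnerability file detection result and the malware detection result corresponding to each training sample.
The third determining module includes:
a fourth determining unit, configured to determine the sample cross feature set according to the plurality of behavior detection models, the sample feature set corresponding to each behavior detection model, and the vulnerability file detection result and malware detection result corresponding to each training sample.
Optionally, the third determining module is specifically configured to:
select one training sample from the plurality of training samples and process the selected training sample as follows, until every training sample has been processed:
input, into each behavior detection model, the sample feature vector that corresponds to the selected training sample in that model's sample feature set, and obtain the sample probability value output by each model, yielding a plurality of sample probability values;
determine a plurality of sample cross features from the sample probability values and from the vulnerability file detection result and malware detection result of the selected training sample, where a sample cross feature is the product of two different values among the sample probability values, the vulnerability file detection result, and the malware detection result of the selected training sample; and
generate one sample cross feature vector from the sample cross features.
In summary, in the embodiments of this application, because the attack behavior track of an EK includes a plurality of different stages, this solution obtains the HTTP packet stream data of a host within a time period and processes it with a plurality of behavior detection models to determine a plurality of initial probability values; and because the behavior detection models respectively describe these different stages, this solution can fully characterize the attack behavior track of an EK. After the initial probability values are determined, they can be processed jointly to obtain a comprehensive probability value; that is, this solution jointly analyzes the behavior patterns of the individual stages of an EK attack and more accurately determines the probability that the host is attacked by an EK while transmitting the data stream, and therefore detects EK attack behavior more accurately. This solution can thus detect EK attack behavior quickly and accurately without heavily consuming the resources of the host itself. In addition, because the HTTP packet stream data obtained in this solution contains only the regular data specified by the network protocol, the risk of infringing user privacy is very low compared with methods that obtain script code for parsing.
It should be noted that when the attack behavior detection apparatus provided in the foregoing embodiment detects attack behavior, the division into the functional modules described above is used only as an example. In practice, the functions may be assigned to different functional modules as required; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or some of the functions described above. In addition, the attack behavior detection apparatus provided in the foregoing embodiment and the attack behavior detection method embodiments belong to the same concept; for the specific implementation process, see the method embodiments, which are not repeated here.
The foregoing embodiments may be implemented entirely or partly by software, hardware, firmware, or any combination thereof. When software is used, the embodiments may be implemented entirely or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are produced entirely or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)). It is worth noting that the computer-readable storage medium mentioned in this application may be a non-volatile storage medium, in other words, a non-transitory storage medium.
The foregoing descriptions are embodiments provided in this application and are not intended to limit this application. Any modification, equivalent replacement, or improvement made by a person skilled in the art based on the description of the embodiments of this application shall fall within the protection scope of this application.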

Claims (21)

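  1. An attack behavior detection method, wherein the method comprises:
    obtaining hypertext transfer protocol (HTTP) packet stream data transmitted by a host within a reference time period, wherein the HTTP packet stream data comprises data in one or more HTTP packets, the one or more HTTP packets belong to a first data stream, and the reference time period is a time period that is before the current time and spans a reference duration back from the current time;
    determining a plurality of initial probability values from the HTTP packet stream data by using a plurality of behavior detection models, wherein the plurality of behavior detection models respectively describe different stages in an attack behavior track of an exploit kit (EK), and an initial probability value is the probability value output by one of the plurality of behavior detection models;
    determining a comprehensive probability value from the plurality of initial probability values, wherein the comprehensive probability value indicates the likelihood that the host is attacked by an EK while transmitting the first data stream; and
    if the comprehensive probability value is greater than a preset probability threshold, determining that EK attack behavior exists while the host transmits the first data stream.
  2. The method according to claim 1, wherein the determining a plurality of initial probability values from the HTTP packet stream data by using a plurality of behavior detection models comprises:
    selecting one behavior detection model from the plurality of behavior detection models and performing the following operations for the selected behavior detection model, until the operations have been performed for every one of the plurality of behavior detection models:
    determining, from the HTTP packet stream data, a feature vector corresponding to the selected behavior detection model; and
    inputting the feature vector into the selected behavior detection model to obtain the initial probability value output by the selected behavior detection model.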
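  3. The method according to claim 1, wherein the determining a comprehensive probability value from the plurality of initial probability values comprises:
    determining a plurality of cross features from the plurality of initial probability values, wherein a cross feature is the product of two different initial probability values among the plurality of initial probability values;
    generating a cross feature vector from the plurality of cross features; and
    inputting the cross feature vector into a correlation analysis model to obtain the comprehensive probability value output by the correlation analysis model, wherein the correlation analysis model is used to jointly analyze the different stages in the attack behavior track of an EK.
  4. The method according to claim 3, wherein before the determining a plurality of cross features from the plurality of initial probability values, the method further comprises:
    performing vulnerability file detection and malware detection on the HTTP packet stream data to obtain a vulnerability file detection result and a malware detection result; and
    the determining a plurality of cross features from the plurality of initial probability values comprises:
    determining the plurality of cross features from the plurality of initial probability values, the vulnerability file detection result, and the malware detection result, wherein a cross feature is the product of two different values among the plurality of initial probability values, the vulnerability file detection result, and the malware detection result.
  5. The method according to claim 4, wherein the determining the plurality of cross features from the plurality of initial probability values, the vulnerability file detection result, and the malware detection result comprises:
    generating a probability matrix from the plurality of initial probability values, the vulnerability file detection result, and the malware detection result, wherein the probability matrix has X rows and X columns, X is the total number of the initial probability values, the vulnerability file detection result, and the malware detection result, the X rows and the X columns each correspond to the plurality of initial probability values, the vulnerability file detection result, and the malware detection result, and each element of the probability matrix is the product of the two values that intersect at that position; and
    selecting a plurality of elements from the probability matrix according to a cross feature selection policy, and using the selected elements as the plurality of cross features.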
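  6. The method according to any one of claims 1-2 and 4-5, wherein before the determining a plurality of initial probability values from the HTTP packet stream data by using a plurality of behavior detection models, the method further comprises:
    obtaining a plurality of training samples and a sample label corresponding to each of the plurality of training samples, wherein a training sample comprises data in one or more sample HTTP packets that belong to a second data stream, the sample label indicates whether the corresponding training sample is a positive training sample or a negative training sample, the positive training sample is HTTP packet stream data that has not been attacked by an EK, and the negative training sample is HTTP packet stream data that has been attacked by an EK; and
    training a plurality of initial detection models according to the plurality of training samples and the sample label corresponding to each of the plurality of training samples, to obtain the plurality of behavior detection models, wherein the plurality of initial detection models respectively correspond to different stages in the attack behavior track of an EK.
  7. The method according to claim 6, wherein the training a plurality of initial detection models according to the plurality of training samples and the sample label corresponding to each of the plurality of training samples, to obtain the plurality of behavior detection models, comprises:
    selecting one initial detection model from the plurality of initial detection models and performing the following operations for the selected initial detection model, until the operations have been performed for every one of the plurality of initial detection models:
    determining, from the sample HTTP packets comprised in each of the plurality of training samples, a sample feature set corresponding to the selected initial detection model, wherein the sample feature set comprises a plurality of sample feature vectors in one-to-one correspondence with the plurality of training samples; and
    inputting the plurality of sample feature vectors into the selected initial detection model and training the selected initial detection model so that its outputs are the sample labels of the corresponding training samples, thereby obtaining one behavior detection model.
  8. The method according to claim 3, wherein before the determining a plurality of initial probability values from the HTTP packet stream data by using a plurality of behavior detection models, the method further comprises:
    obtaining a plurality of training samples and a sample label corresponding to each of the plurality of training samples, wherein a training sample comprises data in one or more sample HTTP packets that belong to a second data stream, the sample label indicates whether the corresponding training sample is a positive training sample or a negative training sample, the positive training sample is HTTP packet stream data that has not been attacked by an EK, and the negative training sample is HTTP packet stream data that has been attacked by an EK;
    training a plurality of initial detection models according to the plurality of training samples and the sample label corresponding to each of the plurality of training samples, to obtain the plurality of behavior detection models, wherein the plurality of initial detection models respectively correspond to different stages in the attack behavior track of an EK;
    determining a sample cross feature set according to the plurality of behavior detection models and the sample feature set corresponding to each of the plurality of behavior detection models, wherein the sample cross feature set comprises a plurality of sample cross feature vectors in one-to-one correspondence with the plurality of training samples; and
    inputting the plurality of sample cross feature vectors into an initial analysis model and training the initial analysis model so that its outputs are the sample labels of the corresponding training samples, thereby obtaining the correlation analysis model.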
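  9. The method according to claim 8, wherein before the determining a sample cross feature set according to the plurality of behavior detection models and the sample feature set corresponding to each of the plurality of behavior detection models, the method further comprises:
    performing vulnerability file detection and malware detection on each of the plurality of training samples, to obtain a vulnerability file detection result and a malware detection result corresponding to each of the plurality of training samples; and
    the determining a sample cross feature set according to the plurality of behavior detection models and the sample feature set corresponding to each of the plurality of behavior detection models comprises:
    determining the sample cross feature set according to the plurality of behavior detection models, the sample feature set corresponding to each of the plurality of behavior detection models, and the vulnerability file detection result and malware detection result corresponding to each of the plurality of training samples.
  10. The method according to claim 9, wherein the determining the sample cross feature set according to the plurality of behavior detection models, the sample feature set corresponding to each of the plurality of behavior detection models, and the vulnerability file detection result and malware detection result corresponding to each of the plurality of training samples comprises:
    selecting one training sample from the plurality of training samples and processing the selected training sample as follows, until every one of the plurality of training samples has been processed:
    inputting, into each of the plurality of behavior detection models, the sample feature vector that corresponds to the selected training sample in that model's sample feature set, and obtaining the sample probability value output by each of the plurality of behavior detection models, thereby obtaining a plurality of sample probability values;
    determining a plurality of sample cross features from the plurality of sample probability values and from the vulnerability file detection result and malware detection result corresponding to the selected training sample, wherein a sample cross feature is the product of two different values among the plurality of sample probability values and the vulnerability file detection result and malware detection result corresponding to the selected training sample; and
    generating one sample cross feature vector from the plurality of sample cross features.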
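  11. An attack behavior detection apparatus, wherein the apparatus comprises:
    a first obtaining module, configured to obtain HTTP packet stream data transmitted by a host within a reference time period, wherein the HTTP packet stream data comprises data in one or more HTTP packets, the one or more HTTP packets belong to a first data stream, and the reference time period is a time period that is before the current time and spans a reference duration back from the current time;
    a first determining module, configured to determine a plurality of initial probability values from the HTTP packet stream data by using a plurality of behavior detection models, wherein the plurality of behavior detection models respectively describe different stages in an attack behavior track of an EK, and an initial probability value is the probability value output by one of the plurality of behavior detection models;
    a second determining module, configured to determine a comprehensive probability value from the plurality of initial probability values, wherein the comprehensive probability value indicates the likelihood that the host is attacked by an EK while transmitting the first data stream; and
    a third determining module, configured to determine, if the comprehensive probability value is greater than a preset probability threshold, that EK attack behavior exists while the host transmits the first data stream.
  12. The apparatus according to claim 11, wherein the first determining module is specifically configured to:
    select one behavior detection model from the plurality of behavior detection models and perform the following operations for the selected behavior detection model, until the operations have been performed for every one of the plurality of behavior detection models:
    determine, from the HTTP packet stream data, a feature vector corresponding to the selected behavior detection model; and
    input the feature vector into the selected behavior detection model to obtain the initial probability value output by the selected behavior detection model.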
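  13. The apparatus according to claim 11, wherein the second determining module comprises:
    a first determining unit, configured to determine a plurality of cross features from the plurality of initial probability values, wherein a cross feature is the product of two different initial probability values among the plurality of initial probability values;
    a generating unit, configured to generate a cross feature vector from the plurality of cross features; and
    a comprehensive analysis unit, configured to input the cross feature vector into a correlation analysis model to obtain the comprehensive probability value output by the correlation analysis model, wherein the correlation analysis model is used to jointly analyze the different stages in the attack behavior track of an EK.
  14. The apparatus according to claim 13, wherein the second determining module further comprises:
    a second determining unit, configured to perform vulnerability file detection and malware detection on the HTTP packet stream data to obtain a vulnerability file detection result and a malware detection result; and
    the first determining unit is specifically configured to:
    determine the plurality of cross features from the plurality of initial probability values, the vulnerability file detection result, and the malware detection result, wherein a cross feature is the product of two different values among the plurality of initial probability values, the vulnerability file detection result, and the malware detection result.
  15. The apparatus according to claim 14, wherein the first determining unit is specifically configured to:
    generate a probability matrix from the plurality of initial probability values, the vulnerability file detection result, and the malware detection result, wherein the probability matrix has X rows and X columns, X is the total number of the initial probability values, the vulnerability file detection result, and the malware detection result, the X rows and the X columns each correspond to the plurality of initial probability values, the vulnerability file detection result, and the malware detection result, and each element of the probability matrix is the product of the two values that intersect at that position; and
    select a plurality of elements from the probability matrix according to a cross feature selection policy, and use the selected elements as the plurality of cross features.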
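  16. The apparatus according to any one of claims 11-12 and 14-15, wherein the apparatus further comprises:
    a second obtaining module, configured to obtain a plurality of training samples and a sample label corresponding to each of the plurality of training samples, wherein a training sample comprises data in one or more sample HTTP packets that belong to a second data stream, the sample label indicates whether the corresponding training sample is a positive training sample or a negative training sample, the positive training sample is HTTP packet stream data that has not been attacked by an EK, and the negative training sample is HTTP packet stream data that has been attacked by an EK; and
    a first training module, configured to train a plurality of initial detection models according to the plurality of training samples and the sample label corresponding to each of the plurality of training samples, to obtain the plurality of behavior detection models, wherein the plurality of initial detection models respectively correspond to different stages in the attack behavior track of an EK.
  17. The apparatus according to claim 16, wherein the first training module is specifically configured to:
    select one initial detection model from the plurality of initial detection models and perform the following operations for the selected initial detection model, until the operations have been performed for every one of the plurality of initial detection models:
    determine, from the sample HTTP packets comprised in each of the plurality of training samples, a sample feature set corresponding to the selected initial detection model, wherein the sample feature set comprises a plurality of sample feature vectors in one-to-one correspondence with the plurality of training samples; and
    input the plurality of sample feature vectors into the selected initial detection model and train the selected initial detection model so that its outputs are the sample labels of the corresponding training samples, thereby obtaining one behavior detection model.
  18. The apparatus according to claim 13, wherein the apparatus further comprises:
    a second obtaining module, configured to obtain a plurality of training samples and a sample label corresponding to each of the plurality of training samples, wherein a training sample comprises data in one or more sample HTTP packets that belong to a second data stream, the sample label indicates whether the corresponding training sample is a positive training sample or a negative training sample, the positive training sample is HTTP packet stream data that has not been attacked by an EK, and the negative training sample is HTTP packet stream data that has been attacked by an EK;
    a first training module, configured to train a plurality of initial detection models according to the plurality of training samples and the sample label corresponding to each of the plurality of training samples, to obtain the plurality of behavior detection models, wherein the plurality of initial detection models respectively correspond to different stages in the attack behavior track of an EK;
    a third determining module, configured to determine a sample cross feature set according to the plurality of behavior detection models and the sample feature set corresponding to each of the plurality of behavior detection models, wherein the sample cross feature set comprises a plurality of sample cross feature vectors in one-to-one correspondence with the plurality of training samples; and
    a second training module, configured to input the plurality of sample cross feature vectors into an initial analysis model and train the initial analysis model so that its outputs are the sample labels of the corresponding training samples, thereby obtaining the correlation analysis model.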
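  19. The apparatus according to claim 18, wherein the apparatus further comprises:
    a fourth determining module, configured to perform vulnerability file detection and malware detection on each of the plurality of training samples, to obtain a vulnerability file detection result and a malware detection result corresponding to each of the plurality of training samples; and
    the third determining module is configured to:
    determine the sample cross feature set according to the plurality of behavior detection models, the sample feature set corresponding to each of the plurality of behavior detection models, and the vulnerability file detection result and malware detection result corresponding to each of the plurality of training samples.
  20. The apparatus according to claim 19, wherein the third determining module is specifically configured to:
    select one training sample from the plurality of training samples and process the selected training sample as follows, until every one of the plurality of training samples has been processed:
    input, into each of the plurality of behavior detection models, the sample feature vector that corresponds to the selected training sample in that model's sample feature set, and obtain the sample probability value output by each of the plurality of behavior detection models, thereby obtaining a plurality of sample probability values;
    determine a plurality of sample cross features from the plurality of sample probability values and from the vulnerability file detection result and malware detection result corresponding to the selected training sample, wherein a sample cross feature is the product of two different values among the plurality of sample probability values and the vulnerability file detection result and malware detection result corresponding to the selected training sample; and
    generate one sample cross feature vector from the plurality of sample cross features.
  21. An attack detection device, wherein the attack detection device comprises a memory and a processor, the memory stores a program for implementing an attack behavior detection method and the data involved, and the processor is configured to execute the program stored in the memory to implement the steps of the method according to any one of claims 1-10.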
PCT/CN2020/118782 2020-02-27 2020-09-29 Attack behavior detection method and apparatus, and attack detection device WO2021169293A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP20922229.8A EP4060958B1 (en) 2020-02-27 2020-09-29 Attack behavior detection method and apparatus, and attack detection device
US17/867,976 US20220368706A1 (en) 2020-02-27 2022-07-19 Attack Behavior Detection Method and Apparatus, and Attack Detection Device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010123839.X 2020-02-27
CN202010123839.XA CN113315742B (zh) 2020-02-27 2020-02-27 Attack behavior detection method and apparatus, and attack detection device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/867,976 Continuation US20220368706A1 (en) 2020-02-27 2022-07-19 Attack Behavior Detection Method and Apparatus, and Attack Detection Device

Publications (1)

Publication Number Publication Date
WO2021169293A1 (zh) 2021-09-02

Family

ID=77370308

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118782 WO2021169293A1 (zh) 2020-02-27 2020-09-29 Attack behavior detection method and apparatus, and attack detection device

Country Status (4)

Country Link
US (1) US20220368706A1 (zh)
EP (1) EP4060958B1 (zh)
CN (1) CN113315742B (zh)
WO (1) WO2021169293A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113704772B (zh) * 2021-08-31 2022-05-17 中数智创科技有限公司 基于用户行为大数据挖掘的安全防护处理方法及系统
CN113872976B (zh) * 2021-09-29 2023-06-02 绿盟科技集团股份有限公司 一种基于http2攻击的防护方法、装置及电子设备
CN114710354B (zh) * 2022-04-11 2023-09-08 中国电信股份有限公司 异常事件检测方法及装置、存储介质及电子设备
US11785028B1 (en) * 2022-07-31 2023-10-10 Uab 360 It Dynamic analysis for detecting harmful content

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10158664B2 (en) * 2014-07-22 2018-12-18 Verisign, Inc. Malicious code detection
US9407646B2 (en) * 2014-07-23 2016-08-02 Cisco Technology, Inc. Applying a mitigation specific attack detector using machine learning
CN105227548B (zh) * 2015-09-14 2018-06-26 中国人民解放军国防科学技术大学 基于办公局域网稳态模型的异常流量筛选方法
CN108418843B (zh) * 2018-06-11 2021-06-18 中国人民解放军战略支援部队信息工程大学 基于攻击图的网络攻击目标识别方法及系统

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160337387A1 (en) * 2015-05-14 2016-11-17 International Business Machines Corporation Detecting web exploit kits by tree-based structural similarity search
CN107770168A (zh) * 2017-10-18 2018-03-06 杭州白客安全技术有限公司 基于攻击链马尔科夫决策过程的低误报率ids/ips
KR102007809B1 (ko) * 2019-03-05 2019-08-06 에스지에이솔루션즈 주식회사 이미지를 이용한 신경망 기반 익스플로잇킷 탐지 시스템
CN110075524A (zh) * 2019-05-10 2019-08-02 腾讯科技(深圳)有限公司 异常行为检测方法和装置

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114338211A (zh) * 2021-12-31 2022-04-12 上海浦东发展银行股份有限公司 一种网络攻击的溯源方法和装置、电子设备及存储介质
CN114338211B (zh) * 2021-12-31 2023-10-20 上海浦东发展银行股份有限公司 一种网络攻击的溯源方法和装置、电子设备及存储介质
CN114726654A (zh) * 2022-05-25 2022-07-08 青岛众信创联电子科技有限公司 一种应对云计算网络攻击的数据分析方法及服务器
CN114726654B (zh) * 2022-05-25 2022-12-06 北京徽享科技有限公司 应对云计算网络攻击的数据分析方法及服务器
CN114697143A (zh) * 2022-06-02 2022-07-01 苏州英博特力信息科技有限公司 基于指纹考勤系统的信息处理方法及指纹考勤服务系统
CN114697143B (zh) * 2022-06-02 2022-08-23 苏州英博特力信息科技有限公司 基于指纹考勤系统的信息处理方法及指纹考勤服务系统

Also Published As

Publication number Publication date
US20220368706A1 (en) 2022-11-17
CN113315742A (zh) 2021-08-27
CN113315742B (zh) 2022-08-09
EP4060958A1 (en) 2022-09-21
EP4060958B1 (en) 2023-11-08
EP4060958A4 (en) 2023-01-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20922229

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020922229

Country of ref document: EP

Effective date: 20220614

NENP Non-entry into the national phase

Ref country code: DE