CN116743399A - Malicious single-stream detection method and device and electronic equipment - Google Patents

Info

Publication number
CN116743399A
Authority
CN
China
Prior art keywords
target
single stream
sample
malicious
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210198145.1A
Other languages
Chinese (zh)
Inventor
赖文杰
刘晨曦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Guancheng Technology Co ltd
Original Assignee
Beijing Guancheng Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416: Event detection, e.g. attack signature detection
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/02: Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a malicious single-stream detection method, a malicious single-stream detection device and electronic equipment. The method comprises: acquiring a target single stream; performing feature processing on the protocol fields of the target single stream to obtain target features of the target single stream, the target features comprising at least one of a target length feature, a target time feature and a target statistical feature; and inputting the target features into a recognition model to determine whether the target single stream is a malicious single stream. Because the method, device and electronic equipment provided by the embodiments of the invention perform feature processing on protocol fields taken from the request header or response header of the traffic packets, in line with the protocol characteristics of an HTTP single stream, the resulting target features are highly discriminative, precise and robust, so malicious single streams are identified more accurately.

Description

Malicious single-stream detection method and device and electronic equipment
Technical Field
The present invention relates to the field of network security technologies, and in particular, to a malicious single stream detection method, a malicious single stream detection device, an electronic device, and a computer readable storage medium.
Background
In recent years, the identification and classification of malicious traffic has become a research hotspot in the field of Internet information security. Trojan viruses generally communicate over HTTP. Most current methods for detecting Trojan viruses that communicate in this way rely on threat intelligence or on known, fixed feature rules. Such methods can identify only HTTP malicious traffic that exhibits the known fixed features and cannot identify HTTP malicious traffic that lacks them. Conventional identification of HTTP malicious traffic is therefore limited: if a Trojan virus changes the structure or pattern of its communication content to some extent, the fixed features no longer match and the HTTP malicious traffic goes undetected.
Disclosure of Invention
In order to solve the existing technical problems, the embodiment of the invention provides a malicious single stream detection method, a malicious single stream detection device, electronic equipment and a computer readable storage medium.
In a first aspect, an embodiment of the present invention provides a malicious single stream detection method, including: acquiring a target single stream, wherein the target single stream is a session to be identified and transmitted by a hypertext transfer protocol; performing feature processing on the protocol field of the target single stream to obtain target features of the target single stream; the target features comprise at least one of target length features, target time features and target statistical features; and inputting the target features into a recognition model, and determining whether the target single stream is a malicious single stream or not based on the output result of the recognition model, wherein the recognition model is capable of recognizing whether the target features are malicious single stream features or not.
Optionally, performing feature processing on the protocol field of the target single stream to obtain target features of the target single stream includes: at least one of determining the target length characteristic, determining the target time characteristic, and determining the target statistical characteristic; the determining the target length feature includes: calculating the number of characters contained in the protocol field of the target single stream to obtain a length value corresponding to the protocol field of the target single stream, and taking the length value as a target length characteristic; the determining the target temporal feature includes: determining a time-related protocol field in the protocol field of the target single stream, subtracting an initial time from a latest change time recorded in the time-related protocol field, calculating a time difference value corresponding to the time-related protocol field, and taking the time difference value as a target time characteristic; the determining the target statistical feature includes: and counting the number of special characters appearing in a protocol field of the target single stream and/or the entropy value of the uniform resource identifier, and taking the number of special characters appearing and/or the entropy value of the uniform resource identifier as target statistical characteristics.
Optionally, before inputting the target features into the recognition model and determining whether the target single stream is a malicious single stream based on the output result of the recognition model, the method further includes: obtaining a plurality of sample single streams, wherein the sample single streams are divided into normal sample single streams and malicious sample single streams; performing feature processing on the protocol fields of the sample single streams to obtain sample features of the sample single streams, wherein the sample features comprise at least one of sample length features, sample time features and sample statistical features; and inputting the sample features into a preset model for training to generate the recognition model.
Optionally, acquiring the plurality of sample single streams comprises: obtaining, from a network, normal sample traffic packets transmitted using the hypertext transfer protocol and malicious sample traffic packets transmitted using the hypertext transfer protocol; or obtaining, from a network, normal sample traffic packets transmitted using the hypertext transfer protocol, and running collected Trojan viruses in a sandbox to obtain the malicious sample traffic packets, transmitted using the hypertext transfer protocol, that are generated when the Trojan viruses establish sessions; and splitting the normal sample traffic packets and the malicious sample traffic packets into streams to obtain at least one normal sample single stream and at least one malicious sample single stream, which are used as the sample single streams, wherein the uniform resource identifiers contained in the sample single streams are different from one another.
Optionally, the sample features further comprise sample base features; the feature processing is performed on the protocol field of the sample single stream to obtain the sample feature of the sample single stream, including: and extracting a basic attribute value corresponding to the protocol field of the sample single stream, and taking the basic attribute value as a sample basic characteristic.
Optionally, the target features further comprise target base features; the feature processing is performed on the protocol field of the target single stream to obtain the target feature of the target single stream, including: and extracting a basic attribute value corresponding to the protocol field of the target single stream, and taking the basic attribute value as a target basic feature.
In a second aspect, an embodiment of the present invention provides a malicious single stream detection apparatus, including: the device comprises an acquisition module, a processing module and a determination module.
The acquisition module is used for acquiring a target single stream, wherein the target single stream is a session to be identified and transmitted in a hypertext transfer protocol.
The processing module is used for carrying out feature processing on the protocol field of the target single stream to obtain the target feature of the target single stream; the target features include at least one of target length features, target time features, target statistics features.
The determining module is used for inputting the target features into a recognition model and determining, based on the output result of the recognition model, whether the target single stream is a malicious single stream, wherein the recognition model is capable of recognizing whether the target features are malicious single-stream features.
Optionally, the processing module includes: at least one of a target length feature unit, a target time feature unit and a target statistics feature unit.
The target length feature unit is used for calculating the number of characters contained in the protocol field of the target single stream, obtaining a length value corresponding to the protocol field of the target single stream, and taking the length value as a target length feature.
The target time feature unit is used for determining a time-related protocol field in the protocol fields of the target single stream, subtracting the initial time from the latest change time recorded in the time-related protocol field, calculating a time difference value corresponding to the time-related protocol field, and taking the time difference value as a target time feature.
The target statistical feature unit is used for counting the number of special characters appearing in the protocol field of the target single stream and/or the entropy value of the uniform resource identifier, and taking the number of special characters appearing and/or the entropy value of the uniform resource identifier as a target statistical feature.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a bus, a transceiver, a memory, a processor, and a computer program stored on the memory and executable on the processor; the transceiver, the memory and the processor are connected by the bus, and the computer program when executed by the processor implements the steps in the malicious single stream detection method as described above.
In a fourth aspect, embodiments of the present invention provide a computer-readable storage medium comprising: a computer program stored on a readable storage medium; the computer program when executed by a processor implements the steps in the malicious single stream detection method as described above.
According to the malicious single-stream detection method, the malicious single-stream detection device, the electronic equipment and the computer-readable storage medium provided by the embodiments of the invention, feature processing is performed on the protocol fields of the acquired target single stream to obtain target features better suited to judging whether the target single stream is malicious. The target features are input into the recognition model, the output result of the recognition model determines whether the target features are malicious single-stream features, and from this it is determined whether the target single stream is a malicious single stream. The method takes the protocol characteristics of an HTTP single stream as its starting point and performs feature processing on protocol fields from the request header or response header of the traffic packets, so the target features used are highly discriminative, precise and robust, and malicious single streams are identified more accurately.
Drawings
In order to more clearly describe the embodiments of the present invention or the technical solutions in the background art, the following description will describe the drawings that are required to be used in the embodiments of the present invention or the background art.
FIG. 1 shows a flowchart of a malicious single stream detection method provided by an embodiment of the present invention;
FIG. 2 shows a detailed flowchart of a malicious single stream detection method provided by an embodiment of the present invention;
FIG. 3 shows a schematic structural diagram of a malicious single-stream detection device provided by an embodiment of the present invention;
FIG. 4 shows a schematic structural diagram of an electronic device provided by an embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described below with reference to the accompanying drawings in the embodiments of the present invention.
FIG. 1 shows a flowchart of a malicious single-stream detection method provided by an embodiment of the present invention. As shown in FIG. 1, the method may include the following steps 101-103.
Step 101: acquiring a target single stream, wherein the target single stream is a session to be identified that is transmitted over the hypertext transfer protocol.
In network traffic transmission, the end that issues a request may be referred to as the requester, and the end that receives the request and responds may be referred to as the responder. A complete network session typically includes multiple request traffic packets from the requester and multiple response traffic packets from the responder. In the embodiment of the invention, a complete network session may be referred to as a single stream; that is, a single stream spans the period from the establishment of a connection (for example, the requester sending a handshake request traffic packet) to the end of the connection (for example, the responder replying with the final response traffic packet). When it is necessary to identify whether a single stream is malicious, the single stream to be identified may be acquired and used as the target single stream, where the target single stream is a session transmitted over the hypertext transfer protocol (HTTP). The target single stream may be acquired as follows: a plurality of traffic packets transmitted using HTTP are acquired directly from the network and taken as target traffic packets; the target traffic packets are split into streams, that is, target traffic packets belonging to the same single stream are grouped together; and each single stream obtained by the splitting is taken as a target single stream. Alternatively, a single stream may be acquired directly from the network and used as the target single stream. The embodiment of the invention is not limited in this respect.
For example, 20 traffic packets transmitted using HTTP may be obtained from the network as target traffic packets; splitting these 20 target traffic packets yields two single streams of 10 target traffic packets each, and both single streams are taken as target single streams.
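As a non-limiting illustration, the splitting of target traffic packets into single streams may be sketched as follows. The `Packet` record, the endpoint-pair grouping key and the addresses are assumptions made for this sketch only; the embodiment does not specify how packets are matched to a session.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical record for one captured HTTP traffic packet."""
    src: str        # "ip:port" of the sender
    dst: str        # "ip:port" of the receiver
    payload: bytes


def flow_key(packet: Packet) -> tuple:
    """Direction-independent key, so requests and responses of one session match.

    Assumption: a session is identified by its pair of endpoints.
    """
    return tuple(sorted((packet.src, packet.dst)))


def split_into_single_streams(packets):
    """Group the packets that belong to the same session into one single stream."""
    streams = defaultdict(list)
    for packet in packets:
        streams[flow_key(packet)].append(packet)
    return list(streams.values())


# 20 packets from two sessions, 10 packets each (5 requests and 5 responses).
server = "198.51.100.7:80"
packets = []
for _ in range(5):
    for client in ("192.0.2.10:50001", "192.0.2.11:50002"):
        packets.append(Packet(client, server, b"request"))
        packets.append(Packet(server, client, b"response"))

single_streams = split_into_single_streams(packets)
```

Sorting the two endpoints makes the key identical for both directions of a session, which is why request and response packets land in the same single stream.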
Step 102: performing feature processing on the protocol field of the target single stream to obtain target features of the target single stream; the target features include at least one of target length features, target time features, target statistics features.
Typically, a request traffic packet transmitted in a single stream has a request header, and the request header includes protocol fields, which are the specific content carried by the request header. For example, the protocol fields of the request header may include an Accept field indicating the content types the requester can accept, a Host field indicating the host and port requested by the requester, and so on. Similarly, a response traffic packet transmitted in a single stream has a response header, which also includes protocol fields carrying its specific content. For example, the protocol fields of the response header may include a Cache-Control field specifying the caching mechanisms to be followed, a Content-Type field indicating the content type of the document, and so on. The target single stream contains one or more protocol fields.
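By way of example only, the protocol fields of a request header or response header may be read out as name/value pairs as sketched below. The function name `parse_header_fields` and the sample request are hypothetical, and the sketch handles only simple single-line fields.

```python
def parse_header_fields(raw_header: str) -> dict:
    """Return {field name: field value} for one HTTP request or response header.

    Field names are lower-cased because HTTP field names are case-insensitive.
    """
    lines = raw_header.split("\r\n")
    fields = {}
    for line in lines[1:]:      # lines[0] is the request line or status line
        if not line:            # a blank line ends the header block
            break
        name, _, value = line.partition(":")
        fields[name.strip().lower()] = value.strip()
    return fields


raw_request = (
    "GET /index.html?id=7 HTTP/1.1\r\n"
    "Host: example.com:8080\r\n"
    "Accept: text/html\r\n"
    "\r\n"
)
request_fields = parse_header_fields(raw_request)
```

Using `partition(":")` splits only at the first colon, so a Host value that itself contains a port number is kept intact.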
In the embodiment of the invention, feature processing may be performed on the protocol fields contained in the target single stream, that is, on the protocol fields of the request header and/or the protocol fields of the response header, to obtain the target features corresponding to the target single stream. The feature processing may consist of extracting features from the protocol fields of the target single stream to obtain feature metadata, and then calculating on and organizing that metadata to obtain the required target features, which represent the characteristics and regularities of the target single stream. One or more of a target length feature, a target time feature and a target statistical feature can be obtained in this way. The target length feature relates to the length values of certain protocol fields of the target single stream and generally reflects the regularity of their field lengths. The target time feature reflects the regularity of the target single stream in the time dimension (for example, the times corresponding to the first and last target traffic packets in the target single stream). The target statistical feature is a set of characteristics determined by multi-dimensional statistics on the protocol fields of the target single stream.
Step 103: inputting the target features into a recognition model, and determining whether the target single stream is a malicious single stream based on the output result of the recognition model, wherein the recognition model is capable of recognizing whether the target features are malicious single-stream features.
After the target features are obtained, they may be input into a recognition model capable of recognizing whether target features are malicious single-stream features. The output result of the recognition model determines whether the input target features are features of a malicious single stream; if they are, the target single stream corresponding to those target features may be considered a malicious single stream.
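Step 103 may be pictured, purely for illustration, as arranging the target features in a fixed order and passing them to a model that exposes a `predict` method. The feature names, the label convention (1 for malicious, 0 for normal) and the `ThresholdModel` stand-in are assumptions of this sketch and are not the recognition model of the embodiment.

```python
from typing import Protocol, Sequence

# Hypothetical feature order; training and detection must use the same order.
FEATURE_ORDER = ("host_length", "time_difference", "uri_entropy")


class RecognitionModel(Protocol):
    """Anything with predict(rows) -> labels, where 1 means malicious."""
    def predict(self, rows: Sequence[Sequence[float]]) -> Sequence[int]: ...


class ThresholdModel:
    """Stand-in for illustration only: flags a high URI entropy as malicious."""
    def predict(self, rows):
        return [1 if row[2] > 4.0 else 0 for row in rows]


def is_malicious_single_stream(features: dict, model: RecognitionModel) -> bool:
    """Arrange the target features in a fixed order and query the model."""
    row = [float(features[name]) for name in FEATURE_ORDER]
    return model.predict([row])[0] == 1


verdict = is_malicious_single_stream(
    {"host_length": 9, "time_difference": 0, "uri_entropy": 4.6},
    ThresholdModel(),
)
```

Keeping the feature order in one constant avoids silently feeding the model columns in a different order from the one it was trained on.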
According to the embodiment of the invention, feature processing is performed on the protocol fields of the acquired target single stream to obtain target features better suited to judging whether the target single stream is malicious. The target features are input into the recognition model, the output result of the recognition model determines whether the target features are malicious single-stream features, and from this it is determined whether the target single stream is a malicious single stream. The method takes the protocol characteristics of an HTTP single stream as its starting point and performs feature processing on protocol fields from the request header or response header of the traffic packets, so the target features used are highly discriminative, precise and robust, and malicious single streams are identified more accurately.
Optionally, performing feature processing on the protocol field of the target single stream to obtain the target feature of the target single stream includes: at least one of determining a target length characteristic, determining a target time characteristic, and determining a target statistics characteristic.
Determining the target length feature includes: calculating the number of characters contained in a protocol field of the target single stream to obtain the length value corresponding to that protocol field, and taking the length value as the target length feature.
In the embodiment of the invention, one or more of determining the target length feature, determining the target time feature and determining the target statistical feature may be used to obtain the corresponding one or more of the target length feature, the target time feature and the target statistical feature of the target single stream, and thereby determine the target features of the target single stream. Certain protocol fields of a normal single stream are generally longer than the same fields of a malicious single stream; that is, normal and malicious single streams differ to some degree in the length of certain protocol fields. Based on this regularity, the length of those protocol fields can be calculated for the target single stream: for example, the number of characters contained in a given protocol field is counted and used as the length value of that field, which is the target length feature of the target single stream. The protocol fields that can be used to calculate a length value may include: the URI (Uniform Resource Identifier) parameter field of the request header, the Host field of the request header, the Accept field of the request header, the Authority field of the request header, the Referer field of the request header, the User-Agent field of the request header, the X-Online-Host field of the request header, the Cookie field of the request header, the Content-Length field of the request header (which describes the transmission length of the HTTP message entity), the Vary field of the response header, the Location field of the response header, and the like.
For example, the embodiment of the invention can count the number of characters in the Host field of the request header corresponding to the target single stream, take that number as the length value, and take the length value as the target length feature corresponding to the target single stream.
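The character counting described above may be sketched as follows. The selection of fields in `LENGTH_FIELDS` is an illustrative subset, and treating an absent field as length 0 is an assumption of this sketch rather than something the embodiment specifies.

```python
# Illustrative subset of the fields listed above, keyed by lower-cased name.
LENGTH_FIELDS = ("host", "accept", "user-agent", "cookie")


def length_features(fields: dict) -> dict:
    """Character count of each selected protocol field.

    Assumption: a field that is absent from the header counts as length 0.
    """
    return {f"{name}_length": len(fields.get(name, "")) for name in LENGTH_FIELDS}


features = length_features({"host": "example.com", "accept": "text/html"})
```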
Determining the target time feature includes: determining the time-related protocol fields among the protocol fields of the target single stream, subtracting the initial time from the latest change time recorded in the time-related protocol fields to obtain the corresponding time difference value, and taking the time difference value as the target time feature.
The target single stream may contain many kinds of protocol fields, and different kinds of target features can be obtained from different kinds of protocol fields. Because the time a normal single stream takes from the start to the end of its session differs to some degree from the time a malicious single stream takes, this regularity can be exploited by performing feature processing on the time-related protocol fields of the target single stream, such as the protocol fields that record when a traffic packet was sent: a difference is calculated from these fields, and the resulting time difference value is taken as the target time feature of the target single stream. The protocol fields that can be used to calculate the time difference value may include: the Date field of the request header, the Date field of the response header, the Last-Modified field of the request header, the Last-Modified field of the response header, and the like.
For example, the embodiment of the invention can calculate the difference between the Date field and the Last-Modified (latest modification time) field of the request header corresponding to the target single stream, that is, the time recorded in the Last-Modified field minus the time recorded in the Date field. The resulting time difference can represent the duration of the target single stream from the start of the session to its end, and is taken as the target time feature corresponding to the target single stream.
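A minimal sketch of this subtraction is given below, assuming the two fields carry HTTP-style date strings and that the difference is expressed in seconds. Returning `None` when either field is absent is a choice made for the sketch only.

```python
from email.utils import parsedate_to_datetime


def time_difference_seconds(fields: dict):
    """Last-Modified minus Date, in seconds; None if either field is missing."""
    date_value = fields.get("date")
    last_modified_value = fields.get("last-modified")
    if date_value is None or last_modified_value is None:
        return None
    initial_time = parsedate_to_datetime(date_value)
    latest_change_time = parsedate_to_datetime(last_modified_value)
    return (latest_change_time - initial_time).total_seconds()


fields = {
    "date": "Tue, 01 Mar 2022 08:00:00 GMT",
    "last-modified": "Tue, 01 Mar 2022 08:00:30 GMT",
}
delta = time_difference_seconds(fields)
```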
Determining the target statistical feature includes: counting the number of special characters appearing in a protocol field of the target single stream and/or calculating the entropy value of the uniform resource identifier, and taking the number of special characters and/or the entropy value of the uniform resource identifier as the target statistical feature.
The number of special characters in certain protocol fields differs between normal and malicious single streams, and the entropy value of certain protocol fields likewise differs between them. This regularity can be exploited by performing feature processing on certain protocol fields of the target single stream, for example counting the occurrences of a special character (such as a comma) or calculating the entropy value of a given protocol field, and taking the resulting data as the target statistical feature of the target single stream. The protocol fields that can be used to count special characters or calculate an entropy value may include: the URI field of the request header, the Host field of the request header, the Accept-Language field of the request header, the Accept-Charset field of the request header, the Accept-Encoding (data encoding schemes that can be decoded) field of the request header, the Content-Encoding (selected encoding) field of the response header, the Via (route taken) field of the response header, and the like.
For example, the embodiment of the invention can calculate the entropy value of the URI field of the request header corresponding to the target single stream, count how many commas (special characters) appear in the Accept-Language field of the request header, count how many request-header protocol fields the target single stream contains, determine whether the Host field of the request header contains a port number, and so on. In the embodiment of the invention, a port number contained in the Host field of the request header may also be regarded as a special character. The entropy values and special-character counts obtained in this way can be used as the target statistical features corresponding to the target single stream.
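The statistics in this example may be sketched as follows. The embodiment does not state which entropy measure is used; the sketch assumes Shannon entropy over the characters of the URI. The port check is deliberately simplified (any colon in the Host value counts) and would need refinement for IPv6 literals.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def statistical_features(uri: str, fields: dict) -> dict:
    """Entropy and special-character statistics for one request header."""
    host = fields.get("host", "")
    return {
        "uri_entropy": shannon_entropy(uri),
        "accept_language_commas": fields.get("accept-language", "").count(","),
        "header_field_count": len(fields),
        "host_has_port": int(":" in host),   # simplification: ignores IPv6 literals
    }


stats = statistical_features(
    "/a?x=1",
    {"host": "example.com:8080", "accept-language": "en-US,en;q=0.9,zh-CN;q=0.8"},
)
```

A URI made of a few repeated characters has low entropy, whereas a URI carrying encoded or random-looking parameters tends toward higher entropy, which is what makes the value usable as a distinguishing feature.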
The embodiment of the invention can perform targeted feature processing on different protocol fields of the target single stream: the required protocol fields are extracted from the request header and/or the response header of the target single stream, and operations such as calculating entropy values and counting characters are performed on them, finally yielding target features better suited to detecting whether the target single stream is malicious. This diversified feature processing lays a foundation for the subsequent recognition by the recognition model and makes the model's predictions more accurate.
Optionally, before inputting the target feature into the recognition model and determining whether the target single stream is a malicious single stream based on the output result of the recognition model, the method further comprises the following steps A1-A3.
Step A1: obtaining a plurality of sample single streams, wherein the sample single streams are divided into normal sample single streams and malicious sample single streams.
In order to train a recognition model capable of recognizing whether target features are malicious single-stream features, a plurality of single streams transmitted using HTTP can be obtained and used as sample single streams. The sample single streams are divided into normal sample single streams and malicious sample single streams; that is, any given sample single stream is either a normal sample single stream, such as an HTTP single stream generated by ordinary web browsing, or a malicious sample single stream, such as an HTTP single stream propagated by Trojan viruses, as determined by the characteristics of that sample single stream. For example, 100 HTTP single streams generated by ordinary web browsing may be acquired and referred to as normal sample single streams, and 100 HTTP single streams propagated by Trojan viruses may be acquired and referred to as malicious sample single streams. In the embodiment of the invention, these 100 normal sample single streams and 100 malicious sample single streams can together be used as the sample single streams.
Step A2: and carrying out feature processing on the protocol field of the sample single stream to obtain sample features of the sample single stream, wherein the sample features comprise at least one of sample length features, sample time features and sample statistical features.
One skilled in the art can understand that the target length feature and the sample length feature are both a length feature, and can be obtained by adopting a similar processing mode; the target time feature and the sample time feature are both time features, and can be obtained by adopting a similar processing mode; similarly, the target statistical feature and the sample statistical feature are both statistical features, and can be obtained by adopting a similar processing mode. The feature processing is performed on the protocol field of the sample single flow according to one or more methods of determining the target length feature, determining the target time feature and determining the target statistics feature, so that one or more of the sample length feature, the sample time feature and the sample statistics feature of the sample single flow are obtained correspondingly, and are used as sample features, which are not described herein.
Step A3: inputting the sample features into a preset model for training to generate the identification model.
In the embodiment of the invention, after the sample features of the sample single flows are obtained, they can be input into a preset model for training to obtain the model parameters of the preset model, and an identification model capable of identifying whether the target feature is a malicious single-stream feature is generated from those model parameters. A random forest method may be used for model training and parameter tuning to obtain the required identification model.
According to the embodiment of the invention, normal sample single flows and malicious sample single flows are obtained as sample single flows, feature processing is performed on the protocol fields of the sample single flows to obtain sample features that characterize them, and the preset model is trained on these sample features to obtain an identification model that meets the requirements. Before the model is trained, this approach produces a variety of sample features that are more specific and better suited to identifying malicious single streams; these distinctive and diverse sample features provide favorable conditions for the subsequent model training and yield an identification model with more accurate identification results.
Optionally, acquiring a plurality of sample single flows may include the following steps B1-B2.
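The training step can be sketched in plain Python as follows. This is not the patent's implementation: it is a deliberately simplified stand-in for a random forest, built from bootstrap-aggregated decision stumps (one-level trees) with random feature selection, and a real system would more likely use a library implementation. The feature columns (URI entropy, special-character count), the sample values and all function names are hypothetical.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature_ids):
    """Find the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best[1:]


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each with a random feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    k = max(1, int(n_features ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feats = rng.sample(range(n_features), k)
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Majority vote over all stumps; 1 means malicious, 0 means normal."""
    votes = Counter(l if row[f] <= t else r for f, t, l, r in forest)
    return votes.most_common(1)[0][0]


# Hypothetical sample features: [uri_entropy, special_char_count]
X = [[2.1, 0], [2.5, 1], [2.8, 1], [3.0, 2],      # normal sample single flows
     [4.6, 9], [4.9, 12], [5.2, 15], [5.5, 11]]   # malicious sample single flows
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = train_forest(X, y)
```

The trained `forest` plays the role of the identification model: `predict(forest, features)` returns the class chosen by the majority of stumps.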
Step B1: obtaining normal sample traffic packets transmitted by using a hypertext transfer protocol and malicious sample traffic packets transmitted by using the hypertext transfer protocol from a network; or, obtaining normal sample traffic packets transmitted by using a hypertext transfer protocol from a network, and running the collected Trojan virus in a sandbox to obtain malicious sample traffic packets generated when the Trojan virus establishes a session and transmitted by using the hypertext transfer protocol.
For the sample single flows used to train the preset model, HTTP traffic packets transmitted during normal web browsing can be acquired from the network and used as normal sample traffic packets, so that a large number of normal sample traffic packets can be obtained. For malicious traffic, the embodiment of the invention can acquire more diverse malicious sample traffic packets from several channels and by different means, such as malicious HTTP traffic packets transmitted by different Trojan viruses, so that the malicious sample single flows finally used for model training are more diverse and exhibit more characteristics. In the embodiment of the invention, malicious HTTP traffic packets in transit can be acquired directly from the network and used as malicious sample traffic packets, in the same way as normal sample traffic packets are acquired; alternatively, different Trojan viruses can each be started in a sandbox, and the malicious HTTP traffic packets generated when each Trojan virus establishes a session with its C2 server (a command and control server, that is, the central server or control center an attacker uses to control a target computer) are captured in the sandbox and used as malicious sample traffic packets.
For example, traffic packets generated by office work at certain companies can be obtained from the network egress points of companies on a whitelist and used as normal sample traffic packets; malicious network communication traffic packets can be collected from the network, and those transmitted using the HTTP protocol can be screened out and used as malicious sample traffic packets. Here the whitelist is a list that stores domain names that are safe and allowed to be accessed.
Step B2: splitting the normal sample traffic packets and the malicious sample traffic packets to obtain at least one normal sample single flow and at least one malicious sample single flow, which are used as the sample single flows, wherein the uniform resource identifiers contained in the sample single flows are different from each other.
In the embodiment of the invention, the normal sample traffic packets can be split to obtain at least one normal sample single flow; similarly, the malicious sample traffic packets are split to obtain at least one malicious sample single flow; both the normal sample single flows and the malicious sample single flows obtained in this way are used as sample single flows. Each sample single flow corresponds to a uniform resource identifier (URI), which is stored in the request traffic packet of that single flow. The URI is an identifier that uniquely identifies a resource, and may be, for example, the domain name address of the target host to be accessed or the port name of the host to be accessed. To avoid duplicated training data (that is, duplicated sample single flows), and thus the extra computation and reduced training efficiency caused by training the model on several sample single flows that access the same target host, the embodiment of the invention can ensure, when generating the sample single flows, that the URI contained in each sample single flow is unique, that is, that the URIs of the sample single flows are different from each other, so that the sample single flows subsequently input into the preset model for training do not duplicate one another.
When obtaining malicious sample single flows, the embodiment of the invention can, besides obtaining them directly from the network, run known Trojan viruses in a sandbox, capture the malicious traffic packets generated when they establish sessions, and split those packets to obtain malicious sample single flows; the URIs corresponding to the sample single flows are also kept different from each other, which reduces training on duplicated sample single flows. This approach not only enriches the diversity of the malicious sample single flows, making the resulting sample features more representative and the generated recognition model more accurate, but also improves the efficiency of model training and reduces unnecessary computation during training.
Optionally, the sample features further comprise sample basic features; performing feature processing on the protocol field of the sample single flow to obtain the sample features of the sample single flow may include the following step C.
Step C: extracting a basic attribute value corresponding to the protocol field of the sample single flow, and taking the basic attribute value as a sample basic feature.
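The requirement that the URIs of the sample single flows differ from each other can be sketched as a simple de-duplication pass. The `SingleFlow` record and its field names are hypothetical; the source does not specify how a single flow is represented, nor which duplicate should be kept (this sketch keeps the first one seen).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SingleFlow:
    """Hypothetical record for one HTTP request/response session."""
    uri: str
    label: str  # "normal" or "malicious"


def deduplicate_by_uri(flows):
    """Keep only the first sample single flow seen for each URI."""
    seen = set()
    unique = []
    for flow in flows:
        if flow.uri in seen:
            continue  # a flow with this URI is already in the training set
        seen.add(flow.uri)
        unique.append(flow)
    return unique


if __name__ == "__main__":
    flows = [
        SingleFlow("/index.html", "normal"),
        SingleFlow("/gate.php?id=1", "malicious"),
        SingleFlow("/index.html", "normal"),
    ]
    for flow in deduplicate_by_uri(flows):
        print(flow.label, flow.uri)
```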
In general, the attributes contained in the protocol fields of a single flow can represent basic features of that single flow, so by comparing the attributes contained in the protocol fields of malicious single flows with those of normal single flows, some basic features specific to the protocol fields of malicious single flows can be found. In the embodiment of the invention, basic attribute values that represent certain basic features of the sample single flow, such as its protocol version number, request method and response status, can be extracted directly from the protocol fields of the sample single flow. Because these basic attribute values describe the basic features of the sample single flow accurately, they can be used directly as the sample basic features, that is, as one kind of sample feature of the sample single flow. The protocol fields from which basic attribute values can be directly extracted may include: the method field of the request header, the URI field of the request header, the Host field of the request header, the version number field of the request header, the Connection field of the request header, the Accept-Language field of the request header, the Encoding field of the request header, the status code field of the response header, the Content-Type field of the response header, the ver field of the response header, the Cache-Control field of the response header, the Connection field of the response header, the Upgrade field of the response header, the Server field of the response header, the Content-Encoding field of the response header, the application/x-msdownload field of the response header, and the like.
For example, the protocol version number of the sample single flow may be obtained from the version number field of its request header, the request method used by the sample single flow, for example POST, may be obtained from the method field of its request header, and the status code of the sample single flow, for example 200, may be obtained from the status code field of its response header; the values obtained are used as the basic attribute values corresponding to the protocol fields of the sample single flow, that is, as the sample basic features of the sample single flow.
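As an illustration of reading basic attribute values straight from the headers, the following sketch parses a raw HTTP/1.x request header and response header. It assumes the headers are available as text with CRLF line endings; the function names, the output keys and the sample host `example.com` are hypothetical, and only a few of the fields listed above are extracted.

```python
def split_header(raw: str):
    """Split a raw HTTP header block into its start line and a dict of fields."""
    lines = raw.split("\r\n")
    fields = {}
    for line in lines[1:]:
        if not line:  # a blank line ends the header block
            break
        name, _, value = line.partition(":")
        fields[name.strip().lower()] = value.strip()
    return lines[0], fields


def basic_features(request_raw: str, response_raw: str) -> dict:
    """Extract basic attribute values without further calculation or statistics."""
    request_line, request_fields = split_header(request_raw)
    status_line, response_fields = split_header(response_raw)
    method, uri, version = request_line.split(" ", 2)
    status_code = int(status_line.split(" ", 2)[1])
    return {
        "method": method,
        "uri": uri,
        "version": version,
        "host": request_fields.get("host"),
        "request_connection": request_fields.get("connection"),
        "status_code": status_code,
        "content_type": response_fields.get("content-type"),
        "server": response_fields.get("server"),
    }


if __name__ == "__main__":
    request = "POST /gate.php HTTP/1.1\r\nHost: example.com\r\nConnection: keep-alive\r\n\r\n"
    response = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nServer: nginx\r\n\r\n"
    print(basic_features(request, response))
```

Categorical values such as the method or the Server string would still need to be encoded numerically before being passed to a model; that encoding is outside the scope of this sketch.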
The sample features according to the embodiments of the present invention may further include a sample basic feature, that is, a basic attribute value that may be directly obtained from a protocol field of a sample single flow without further calculation or statistics, and such directly obtained basic attribute value is used as a sample basic feature, that is, a sample feature corresponding to the sample single flow. According to the method, the sample characteristics can be simply, conveniently and quickly obtained from the attributes of the sample single stream, the sample characteristics which can be used for training a preset model are added, the generated identification model can more comprehensively identify the target characteristics of the target single stream, and the accuracy of the identification model is improved.
Optionally, the target features further comprise target basic features; performing feature processing on the protocol field of the target single stream to obtain the target features of the target single stream may include the following step D.
Step D: extracting a basic attribute value corresponding to the protocol field of the target single stream, and taking the basic attribute value as a target basic feature.
In the embodiment of the invention, when the target feature is acquired aiming at the target single stream, the acquired target feature can also be a target basic feature, namely, certain basic attribute values can be directly extracted from the protocol field of the target single stream. The method described in the step C may be adopted to extract a basic attribute value corresponding to a protocol field of the target single stream, and use the basic attribute value as a target basic feature of the target single stream, that is, a target feature corresponding to the target single stream.
When the embodiment of the invention acquires the target feature aiming at the target single stream, the protocol field of the target single stream can be subjected to feature processing to obtain at least one of the target length feature, the target time feature and the target statistical feature, and the basic attribute value can be directly extracted from the protocol field of the target single stream to serve as the target basic feature. According to the method, based on the recognition model trained by using the sample basic features, the obtained target basic features are input into the recognition model, and whether the target features are malicious single-stream features can be determined more accurately, so that whether the target single-stream is malicious single-stream can be determined more accurately, the judgment basis of the malicious single-stream is increased, and the accuracy of a recognition result is improved.
The malicious single stream detection method flow is described in detail below by way of one embodiment. Referring to fig. 2, the method includes the following steps 201-207.
Step 201: obtaining, from the network, normal HTTP traffic packets generated by normal web browsing as normal sample traffic packets, and obtaining malicious HTTP traffic packets from the network as malicious sample traffic packets.
Step 202: splitting the normal sample traffic packets to obtain normal sample single flows, splitting the malicious sample traffic packets to obtain malicious sample single flows, and taking the set of normal sample single flows and malicious sample single flows as the sample single flows, wherein the URIs corresponding to the sample single flows are different from each other.
Step 203: extracting a protocol field of a sample single flow, carrying out feature processing on the protocol field of the sample single flow to respectively obtain a sample time feature, a sample length feature, a sample statistical feature and a sample basic feature, and inputting the features as sample features into a preset model for training to obtain an identification model capable of identifying whether the input features are malicious single flow features.
Step 204: obtaining a target single stream from the network, and performing feature processing on the protocol field of the target single stream to obtain a target time feature, a target length feature, a target statistical feature and a target basic feature corresponding to the target single stream, these features being taken as the target features.
Step 205: inputting the target features into the recognition model and judging, from the output result of the recognition model, whether the target features are malicious single-stream features; if so, executing step 206; otherwise, executing step 207.
Step 206: it is determined that the target single stream is a malicious single stream.
Step 207: it is determined that the target single stream is not a malicious single stream.
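Steps 204 to 207 can be sketched as follows, assuming the target features have already been computed and the recognition model is available as a callable that returns 1 for a malicious single-stream feature. The feature names in `FEATURE_ORDER`, the numeric values and `toy_model` are hypothetical stand-ins; `toy_model` is a fixed threshold, not a trained model.

```python
# Hypothetical fixed column order; it must match the order used during training.
FEATURE_ORDER = (
    "time_diff_seconds",
    "uri_length",
    "uri_entropy",
    "special_char_count",
    "status_code",
)


def to_vector(features: dict) -> list:
    """Arrange the named target features into the order the model expects."""
    return [float(features[name]) for name in FEATURE_ORDER]


def detect(features: dict, recognition_model) -> str:
    """Steps 205-207: query the model and map its output to a verdict."""
    if recognition_model(to_vector(features)) == 1:
        return "malicious single stream"      # step 206
    return "not a malicious single stream"    # step 207


def toy_model(vector) -> int:
    """Hypothetical stand-in for a trained model: flags high URI entropy."""
    return 1 if vector[2] > 4.0 else 0


if __name__ == "__main__":
    target = {
        "time_diff_seconds": 0.0,
        "uri_length": 64,
        "uri_entropy": 4.8,
        "special_char_count": 9,
        "status_code": 200,
    }
    print(detect(target, toy_model))
```

Keeping the feature order in one place is a design choice of this sketch: it makes it harder for the detection path to feed features to the model in a different order from the one used in step 203.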
The embodiment of the invention provides a malicious single-stream detection device, which is shown in fig. 3, and comprises the following components: an acquisition module 31, a processing module 32 and a determination module 33.
The obtaining module 31 is configured to obtain a target single stream, where the target single stream is a session to be identified that is transmitted using the hypertext transfer protocol.
The processing module 32 is configured to perform feature processing on the protocol field of the target single stream to obtain a target feature of the target single stream; the target features include at least one of target length features, target time features, target statistics features.
The determining module 33 is configured to input the target feature into a recognition model, and determine whether the target single stream is a malicious single stream based on an output result of the recognition model, where the recognition model is capable of recognizing whether the target feature is a malicious single stream feature.
Optionally, the processing module 32 includes: at least one of a target length feature unit, a target time feature unit and a target statistics feature unit.
The target length feature unit is used for calculating the number of characters contained in the protocol field of the target single stream, obtaining a length value corresponding to the protocol field of the target single stream, and taking the length value as a target length feature.
The target time feature unit is used for determining a time-related protocol field in the protocol fields of the target single stream, subtracting the initial time from the latest change time recorded in the time-related protocol field, calculating a time difference value corresponding to the time-related protocol field, and taking the time difference value as a target time feature.
The target statistical feature unit is used for counting the number of special characters appearing in the protocol field of the target single stream and/or the entropy value of the uniform resource identifier, and taking the number of special characters appearing and/or the entropy value of the uniform resource identifier as a target statistical feature.
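The time-difference calculation performed by the target time feature unit can be sketched as below. The sketch assumes both timestamps are HTTP-date strings taken from time-related header fields; which header supplies the initial time and which supplies the latest change time is not stated in this section, so the two arguments are left generic and the sample values are hypothetical.

```python
from email.utils import parsedate_to_datetime


def time_difference_seconds(initial_time: str, latest_change_time: str):
    """Return latest_change_time minus initial_time in seconds, or None."""
    try:
        start = parsedate_to_datetime(initial_time)
        end = parsedate_to_datetime(latest_change_time)
    except (TypeError, ValueError):
        return None  # missing or unparseable timestamp
    return (end - start).total_seconds()


if __name__ == "__main__":
    # Hypothetical HTTP-date values, for illustration only.
    print(time_difference_seconds("Wed, 21 Oct 2015 07:28:00 GMT",
                                  "Wed, 21 Oct 2015 07:30:30 GMT"))
```

Returning `None` for a missing field leaves it to the caller to decide how absent time features are encoded for the model.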
Optionally, the apparatus further comprises a sample acquisition module, a sample processing module and a training module.
The sample acquisition module is used for acquiring a plurality of sample single flows, and the sample single flows are divided into a normal sample single flow and a malicious sample single flow.
The sample processing module is used for carrying out feature processing on the protocol field of the sample single flow to obtain sample features of the sample single flow, wherein the sample features comprise at least one of sample length features, sample time features and sample statistical features.
The training module is used for inputting the sample characteristics into a preset model for training and generating the identification model.
Optionally, the sample acquisition module includes a traffic packet acquisition unit and a splitting unit.
The traffic packet acquisition unit is used for acquiring normal sample traffic packets transmitted by using a hypertext transfer protocol and malicious sample traffic packets transmitted by using the hypertext transfer protocol from a network; or, obtaining normal sample traffic packets transmitted by using a hypertext transfer protocol from a network, and running the collected Trojan virus in a sandbox to obtain malicious sample traffic packets generated when the Trojan virus establishes a session and transmitted by using the hypertext transfer protocol.
The splitting unit is used for splitting the normal sample traffic packets and the malicious sample traffic packets to obtain at least one normal sample single flow and at least one malicious sample single flow, which are used as the sample single flows, wherein the uniform resource identifiers contained in the sample single flows are different from each other.
Optionally, the sample features further comprise sample basic features; the sample processing module includes a sample basic feature unit.
The sample basic feature unit is used for extracting a basic attribute value corresponding to the protocol field of the sample single flow, and taking the basic attribute value as a sample basic feature.
Optionally, the target features further comprise target basic features; the processing module 32 includes a target basic feature unit.
The target basic feature unit is used for extracting a basic attribute value corresponding to the protocol field of the target single stream, and taking the basic attribute value as a target basic feature.
According to the device provided by the embodiment of the invention, the obtained protocol field of the target single stream is subjected to feature processing, so that the target feature which is more suitable for judging whether the target single stream is a malicious single stream can be obtained, the target feature is input into the identification model, and whether the target feature is a malicious single stream feature or not is determined through the output result of the identification model, so that whether the target single stream is a malicious single stream or not can be determined. The device starts from the request head or the response head of the flow packet according to the protocol characteristics of the HTTP single stream, and performs the characteristic processing of the protocol field, so that the used target characteristics have good distinguishing property, higher precision and better robustness, and the malicious single stream is more accurately identified.
In addition, the embodiment of the invention also provides an electronic device comprising a bus, a transceiver, a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the transceiver, the memory and the processor are connected through the bus. When executed by the processor, the computer program implements each process of the above malicious single stream detection method embodiment and achieves the same technical effect; to avoid repetition, details are not repeated here.
In particular, referring to FIG. 4, an embodiment of the invention also provides an electronic device comprising a bus 1110, a processor 1120, a transceiver 1130, a bus interface 1140, a memory 1150, and a user interface 1160.
In an embodiment of the present invention, the electronic device further includes: computer programs stored on the memory 1150 and executable on the processor 1120, which when executed by the processor 1120, implement the various processes of the malicious single stream detection method embodiments described above.
A transceiver 1130 for receiving and transmitting data under the control of the processor 1120.
In an embodiment of the invention, the bus architecture is represented by bus 1110. Bus 1110 may include any number of interconnected buses and bridges, and connects various circuits together, including one or more processors, represented by processor 1120, and memory, represented by memory 1150.
Bus 1110 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port (Accelerated Graphics Port, AGP), and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include: industry standard architecture (Industry Standard Architecture, ISA) bus, micro channel architecture (Micro Channel Architecture, MCA) bus, enhanced ISA (EISA) bus, video electronics standards association (Video Electronics Standards Association, VESA) bus, peripheral component interconnect (Peripheral Component Interconnect, PCI) bus.
Processor 1120 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method embodiments may be implemented by instructions in the form of integrated logic circuits in hardware or software in a processor. The processor includes: general purpose processors, central processing units (Central Processing Unit, CPU), network processors (Network Processor, NP), digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field Programmable Gate Array, FPGA), complex programmable logic devices (Complex Programmable Logic Device, CPLD), programmable logic arrays (Programmable Logic Array, PLA), micro control units (Microcontroller Unit, MCU) or other programmable logic devices, discrete gates, transistor logic devices, discrete hardware components. The methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. For example, the processor may be a single-core processor or a multi-core processor, and the processor may be integrated on a single chip or located on multiple different chips.
The processor 1120 may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software modules may be located in a readable storage medium known in the art, such as random access memory (Random Access Memory, RAM), flash memory (Flash Memory), read-only memory (Read-Only Memory, ROM), programmable ROM (PROM), erasable programmable ROM (EPROM), or registers. The readable storage medium is located in a memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method.
Bus 1110 may also connect together various other circuits such as peripheral devices, voltage regulators, or power management circuits, and bus interface 1140 provides an interface between bus 1110 and transceiver 1130; all of these are well known in the art and are therefore not described further in the embodiments of the present invention.
The transceiver 1130 may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. For example: the transceiver 1130 receives external data from other devices, and the transceiver 1130 is configured to transmit the data processed by the processor 1120 to the other devices. Depending on the nature of the computer system, a user interface 1160 may also be provided, for example: touch screen, physical keyboard, display, mouse, speaker, microphone, trackball, joystick, stylus.
It should be appreciated that in embodiments of the present invention, the memory 1150 may further comprise memory located remotely from the processor 1120, such remotely located memory being connectable to a server through a network. One or more portions of the above-described network may be an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), a wireless local area network (WLAN), a wide area network (WAN), a wireless wide area network (WWAN), a metropolitan area network (MAN), the Internet, a public switched telephone network (PSTN), a plain old telephone service (POTS) network, a cellular telephone network, a wireless fidelity (Wi-Fi) network, or a combination of two or more of the above networks. For example, the cellular telephone network and wireless network may be a global system for mobile communications (GSM) system, a code division multiple access (CDMA) system, a worldwide interoperability for microwave access (WiMAX) system, a general packet radio service (GPRS) system, a wideband code division multiple access (WCDMA) system, a long term evolution (LTE) system, an LTE frequency division duplex (FDD) system, an LTE time division duplex (TDD) system, a long term evolution-advanced (LTE-A) system, a universal mobile telecommunications system (UMTS), an enhanced mobile broadband (Enhanced Mobile Broadband, eMBB) system, a massive machine type communication (massive Machine Type Communication, mMTC) system, an ultra-reliable low-latency communication (Ultra-Reliable Low-Latency Communications, URLLC) system, and the like.
It should be appreciated that the memory 1150 in embodiments of the present invention may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory includes: read-only memory (Read-Only Memory, ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), or flash memory (Flash Memory).
The volatile memory includes: random access memory (Random Access Memory, RAM), which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as: static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The memory 1150 of the electronic device described in embodiments of the present invention includes, but is not limited to, the above and any other suitable types of memory.
In an embodiment of the invention, memory 1150 stores the following elements, including operating system 1151 and application programs 1152: executable modules, data structures, or a subset or an extended set thereof.
Specifically, the operating system 1151 includes various system programs, such as: a framework layer, a core library layer, a driving layer and the like, which are used for realizing various basic services and processing tasks based on hardware. The applications 1152 include various applications such as: a Media Player (Media Player), a Browser (Browser) for implementing various application services. A program for implementing the method of the embodiment of the present invention may be included in the application 1152. The application 1152 includes: applets, objects, components, logic, data structures, and other computer system executable instructions that perform particular tasks or implement particular abstract data types.
In addition, the embodiment of the invention further provides a computer readable storage medium, on which a computer program is stored, where the computer program when executed by a processor implements each process of the above embodiment of the malicious single stream detection method, and the same technical effects can be achieved, so that repetition is avoided, and no further description is given here.
The computer-readable storage medium includes persistent and non-persistent, removable and non-removable media, and is a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium includes: electronic storage, magnetic storage, optical storage, electromagnetic storage, semiconductor storage, and any suitable combination of the foregoing. The computer-readable storage medium includes: phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), non-volatile random access memory (NVRAM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disk read-only memory (CD-ROM), digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, memory sticks, mechanically encoded devices (e.g., punch cards or raised structures in grooves with instructions recorded thereon), or any other non-transmission medium that may be used to store information that may be accessed by a computing device. In accordance with the definition in the present embodiments, the computer-readable storage medium does not include a transitory signal itself, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., a pulse of light passing through a fiber optic cable), or an electrical signal transmitted through a wire.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus, electronic device, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiments of the application.
In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in the form of hardware or in the form of software functional units.
If implemented in the form of software functional units and sold or used as stand-alone products, the integrated units may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the present invention, in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (including a personal computer, a server, a data center, or other network device) to perform all or part of the steps of the method according to the embodiments of the present invention. The storage medium includes the various media exemplified above that can store program code.
In the description of the embodiments of the present invention, those skilled in the art will appreciate that the embodiments may be implemented as a method, an apparatus, an electronic device, or a computer-readable storage medium. Thus, embodiments of the present invention may take the following forms: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software. Furthermore, in some embodiments, the invention may also be implemented as a computer program product embodied in one or more computer-readable storage media having computer program code embodied therein.
The computer-readable storage medium described above may be any combination of one or more computer-readable storage media. The computer-readable storage medium includes: an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of the computer-readable storage medium include: a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, an optical fiber, compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any combination thereof. In embodiments of the present invention, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer program code embodied in the computer-readable storage medium may be transmitted using any appropriate medium, including: wireless, wireline, fiber-optic cable, radio frequency (RF), or any suitable combination thereof.
Computer program code for carrying out operations of embodiments of the present invention may be written as assembly instructions, instruction set architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state-setting data, integrated circuit configuration data, or in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the C language or similar programming languages. The computer program code may execute entirely on the user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer.
The embodiment of the invention describes a method, a device and electronic equipment through flowcharts and/or block diagrams.
It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions. These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer-readable program instructions may also be stored in a computer-readable storage medium and can cause a computer or other programmable data processing apparatus to function in a particular manner. Thus, the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The foregoing is merely a specific implementation of the embodiments of the present invention, but the protection scope of the embodiments is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed by the embodiments of the present invention shall fall within that protection scope. Therefore, the protection scope of the embodiments of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A malicious single stream detection method, comprising:
acquiring a target single stream, wherein the target single stream is a session to be identified that is transmitted using the Hypertext Transfer Protocol (HTTP);
performing feature processing on protocol fields of the target single stream to obtain target features of the target single stream, wherein the target features comprise at least one of a target length feature, a target time feature, and a target statistical feature;
and inputting the target features into a recognition model, and determining whether the target single stream is a malicious single stream based on an output result of the recognition model, wherein the recognition model is capable of identifying whether the target features are features of a malicious single stream.
2. The method of claim 1, wherein the performing feature processing on the protocol fields of the target single stream to obtain the target features of the target single stream comprises at least one of: determining the target length feature, determining the target time feature, and determining the target statistical feature;
the determining the target length feature comprises: counting the number of characters contained in a protocol field of the target single stream to obtain a length value corresponding to that protocol field, and taking the length value as the target length feature;
the determining the target time feature comprises: determining a time-related protocol field among the protocol fields of the target single stream, subtracting an initial time from a latest change time recorded in the time-related protocol field to obtain a time difference value corresponding to the time-related protocol field, and taking the time difference value as the target time feature;
the determining the target statistical feature comprises: determining the number of special characters appearing in a protocol field of the target single stream and/or the entropy value of the uniform resource identifier, and taking the number of special characters and/or the entropy value of the uniform resource identifier as the target statistical feature.
3. The method of claim 1, wherein, before the inputting the target features into the recognition model and determining whether the target single stream is a malicious single stream based on the output result of the recognition model, the method further comprises:
obtaining a plurality of sample single streams, wherein the sample single streams are divided into normal sample single streams and malicious sample single streams;
performing feature processing on protocol fields of the sample single streams to obtain sample features of the sample single streams, wherein the sample features comprise at least one of a sample length feature, a sample time feature, and a sample statistical feature;
and inputting the sample features into a preset model for training to generate the recognition model.
4. The method of claim 3, wherein the obtaining a plurality of sample single streams comprises:
obtaining, from a network, normal sample traffic packets transmitted using the Hypertext Transfer Protocol and malicious sample traffic packets transmitted using the Hypertext Transfer Protocol; or, obtaining, from a network, normal sample traffic packets transmitted using the Hypertext Transfer Protocol, and running a collected Trojan horse virus in a sandbox to obtain malicious sample traffic packets that are generated when the Trojan horse virus establishes a session and that are transmitted using the Hypertext Transfer Protocol;
and splitting the normal sample traffic packets and the malicious sample traffic packets into flows to obtain at least one normal sample single stream and at least one malicious sample single stream, which are used as the sample single streams, wherein the uniform resource identifiers contained in the respective sample single streams are different from one another.
5. The method of claim 3 or 4, wherein the sample features further comprise a sample basic feature; and the performing feature processing on the protocol fields of the sample single streams to obtain the sample features of the sample single streams comprises:
extracting a basic attribute value corresponding to a protocol field of the sample single stream, and taking the basic attribute value as the sample basic feature.
6. The method of claim 5, wherein the target features further comprise a target basic feature; and the performing feature processing on the protocol fields of the target single stream to obtain the target features of the target single stream comprises:
extracting a basic attribute value corresponding to a protocol field of the target single stream, and taking the basic attribute value as the target basic feature.
7. A malicious single stream detection apparatus, comprising: an acquisition module, a processing module, and a determination module;
the acquisition module is configured to acquire a target single stream, wherein the target single stream is a session to be identified that is transmitted using the Hypertext Transfer Protocol;
the processing module is configured to perform feature processing on protocol fields of the target single stream to obtain target features of the target single stream, wherein the target features comprise at least one of a target length feature, a target time feature, and a target statistical feature;
the determination module is configured to input the target features into a recognition model and determine whether the target single stream is a malicious single stream based on an output result of the recognition model, wherein the recognition model is capable of identifying whether the target features are features of a malicious single stream.
8. The apparatus of claim 7, wherein the processing module comprises at least one of: a target length feature unit, a target time feature unit, and a target statistical feature unit;
the target length feature unit is configured to count the number of characters contained in a protocol field of the target single stream to obtain a length value corresponding to that protocol field, and to take the length value as the target length feature;
the target time feature unit is configured to determine a time-related protocol field among the protocol fields of the target single stream, subtract an initial time from a latest change time recorded in the time-related protocol field to obtain a time difference value corresponding to the time-related protocol field, and take the time difference value as the target time feature;
the target statistical feature unit is configured to determine the number of special characters appearing in a protocol field of the target single stream and/or the entropy value of the uniform resource identifier, and to take the number of special characters and/or the entropy value of the uniform resource identifier as the target statistical feature.
9. An electronic device comprising a bus, a transceiver, a memory, a processor and a computer program stored on the memory and executable on the processor, the transceiver, the memory and the processor being connected by the bus, characterized in that the computer program when executed by the processor implements the steps in the malicious single stream detection method according to any one of claims 1 to 6.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps in the malicious single stream detection method according to any one of claims 1 to 6.
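
The feature processing recited in claim 2 can be illustrated with a minimal Python sketch. The claims do not state which characters count as special, which HTTP fields carry the initial and latest change times, or how the entropy value is computed. The `SPECIAL_CHARS` set, the use of HTTP-date strings for the two times, Shannon entropy over the characters of the uniform resource identifier, and every function and key name below are therefore assumptions made for illustration only, not the applicant's implementation.

```python
import math
from collections import Counter
from email.utils import parsedate_to_datetime

# Assumption: the claims do not define "special characters"; this set is illustrative.
SPECIAL_CHARS = frozenset("%&=?;'\"<>(){}[]|\\$`")


def length_feature(field_value):
    """Length feature: number of characters contained in a protocol field."""
    return len(field_value)


def time_feature(initial_time, latest_change_time):
    """Time feature: latest change time minus initial time, in seconds.

    Assumption: both values are HTTP-date strings such as
    'Tue, 01 Mar 2022 08:00:00 GMT'.
    """
    t0 = parsedate_to_datetime(initial_time)
    t1 = parsedate_to_datetime(latest_change_time)
    return (t1 - t0).total_seconds()


def special_char_count(field_value):
    """Statistical feature: number of special characters in a protocol field."""
    return sum(1 for ch in field_value if ch in SPECIAL_CHARS)


def shannon_entropy(text):
    """Statistical feature: Shannon entropy (bits per character) of a string."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def extract_features(uri, headers, initial_time, latest_change_time):
    """Build a flat feature dictionary for one HTTP session (single stream)."""
    features = {"uri_length": float(length_feature(uri))}
    for name, value in headers.items():
        features[name.lower() + "_length"] = float(length_feature(value))
    features["time_diff_seconds"] = time_feature(initial_time, latest_change_time)
    features["special_char_count"] = float(special_char_count(uri))
    features["uri_entropy"] = shannon_entropy(uri)
    return features


if __name__ == "__main__":
    example = extract_features(
        uri="/index.php?id=1&cmd=ls",
        headers={"User-Agent": "curl/7.79.1"},
        initial_time="Tue, 01 Mar 2022 08:00:00 GMT",
        latest_change_time="Tue, 01 Mar 2022 08:05:30 GMT",
    )
    for key, value in example.items():
        print(key, value)
```

A dictionary of this kind could then be converted to a fixed-order numeric vector and passed to whatever recognition model is used; the claims leave the model type open, so none is shown here.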

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210198145.1A CN116743399A (en) 2022-03-01 2022-03-01 Malicious single-stream detection method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210198145.1A CN116743399A (en) 2022-03-01 2022-03-01 Malicious single-stream detection method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN116743399A true CN116743399A (en) 2023-09-12

Family

ID=87903142

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210198145.1A Pending CN116743399A (en) 2022-03-01 2022-03-01 Malicious single-stream detection method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN116743399A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination