CN113923042A - Malicious software abuse DoH detection and identification system and method - Google Patents
Malicious software abuse DoH detection and identification system and method
- Publication number
- CN113923042A (application CN202111245911.7A)
- Authority
- CN
- China
- Prior art keywords
- cluster
- sequence
- doh
- matrix
- final
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/566—Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Signal Processing (AREA)
- Computing Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Virology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a system and method for detecting and identifying malware abuse of DNS over HTTPS (DoH), in the technical field of deep learning and network security. The method comprises the following steps: acquiring pcap traffic packets from a network; extracting time-series features from the pcap traffic packets and then creating packet clusters; generating a cluster sequence based on all the packet clusters; extracting a final feature set from the cluster sequence; inputting the final feature set into a Transformer model for calculation to obtain a prediction label; and identifying DoH traffic abused by malware based on the prediction label type. By mining the more relevant temporal features in the sequence through a multi-head attention mechanism, the method reduces the amount of overall analysis required, which improves the model's accuracy in detecting malware DoH traffic and improves its classification performance.
Description
Technical Field
The invention relates to a system and method for detecting and identifying malware abuse of DoH, and belongs to the technical field of deep learning and network security.
Background
The Domain Name System (DNS) is one of the core basic services of today's Internet. It mainly translates domain names that are easy for humans to remember into IP addresses that are easy for machines to process, and a large number of network services depend on it. DNS is also one of the earliest network protocols found to be vulnerable, and DNS abuse has long been of great interest to network security researchers. To overcome DNS vulnerabilities related to privacy and data manipulation, the Internet Engineering Task Force (IETF) introduced DNS over HTTPS (DoH) in RFC 8484. DoH carries DNS queries over the Hypertext Transfer Protocol (HTTP) secured by Secure Sockets Layer (SSL) or Transport Layer Security (TLS), which has largely succeeded in preventing DNS attacks; DoH also improves user privacy and security by preventing eavesdropping and manipulation of DNS data. Encrypting traffic provides better privacy, but it also reduces the visibility of network traffic to security tools, which can lower the security level of the network.
Malware includes computer viruses, worms, Trojans, bots, and other programs with malicious intent that aim to disrupt the operation of a computer system, steal proprietary information, or gain access control rights. When malware abuses the DNS protocol, communication between the infected host and the command and control server is typically achieved using IP-Flux or Domain-Flux techniques. In recent years, the first known malware families that use encryption to hide DNS activity inside a DoH tunnel have appeared. One example is the malware named Godlua, which issues HTTPS requests to retrieve the DNS text records of a domain name; these records store the URLs of subsequent command and control servers, to which Godlua connects to obtain further instructions. Retrieving second- or third-stage command and control server URLs from DNS text records is not new. The novelty is that a DoH request is used instead of a traditional DNS request, so that the malware hides the frequency of its DNS resolution. The reduced network visibility forces administrators to block the use of DoH in their networks, typically by blocking the specific IP addresses of authoritative DoH resolvers. This solution is imperfect, because any malware that wants to hide its DNS traffic can easily set up its own DoH resolver on a non-standard address and port.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a system and method for detecting and identifying malware abuse of DoH that improve detection accuracy.
To achieve this purpose, the invention adopts the following technical solution:
In a first aspect, the present invention provides a method for detecting and identifying malware abuse of DoH, comprising:
acquiring pcap traffic packets from a network;
extracting time-series features from the pcap traffic packets and then creating packet clusters;
generating a cluster sequence based on all the packet clusters;
extracting a final feature set from the cluster sequence;
inputting the final feature set into a Transformer model for calculation to obtain a prediction label;
and identifying DoH traffic abused by malware based on the prediction label type.
Further, each packet cluster is:
$C = \{\mathit{size}, \mathit{pktCount}, \mathit{direction}, \mathit{duration}, \mathit{interarrivalTime}\}$
where $C$ is the packet cluster, size is the size of the cluster, pktCount is the number of packets in the cluster, direction is the direction of all packets in the cluster, duration is the duration of the cluster, and interarrivalTime is the inter-arrival time.
Further, the final feature set is:
$F_l = \{(C_i, \ldots, C_{i+l}) \mid 1 \le i < n - l\}$
$S = (C_1, \ldots, C_n)$
where $F_l$ is the final feature set, $S$ is the cluster sequence, $C_i$ is the $i$-th cluster in the cluster sequence, $i$ is the cluster index, $C_n$ is the $n$-th cluster in the cluster sequence, $n$ is the number of clusters in the cluster sequence, and $l$ is the length of each sequence in the final feature set.
Further, the Transformer model comprises an encoder and a decoder, wherein the encoder extracts a sequence matrix from the time-series features of the final feature set, and the decoder generates a position vector matrix from the extracted sequence matrix.
Further, inputting the final feature set into the Transformer model for calculation to obtain a prediction label comprises:
inputting the final feature set into the Transformer model to obtain a sequence matrix, expressed as:
$Q = \{E_1, \ldots, E_{i-1}, E_i, E_{i+1}, \ldots, E_l\}$
where $Q$ is the sequence matrix, $E_i$ is the vectorized representation of the $i$-th cluster, and $l$ is the length of each sequence in the final feature set;
obtaining the position information of the clusters based on the sequence matrix $Q$, expressed as:
$PE(pos, 2j) = \sin\left(pos / 10000^{2j/d}\right)$
$PE(pos, 2j+1) = \cos\left(pos / 10000^{2j/d}\right)$
where $PE$ is the calculated position vector, $pos$ is the position index of the cluster in the sequence, $j \in (0, d)$ is the index of each value in the cluster vector, $2j$ denotes an even position, $2j+1$ denotes an odd position, and $d$ is the embedding dimension of the cluster;
and adding the sequence matrix $Q$ and the position vector matrix $PE$ to obtain a final coding matrix.
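As a small worked instance of the position formula, assume an embedding dimension d = 4 and the cluster at position pos = 1. Both values are chosen only for illustration and are not given in the text. The sine and cosine terms then evaluate approximately as follows:

```latex
\begin{aligned}
PE(1, 0) &= \sin\left(1 / 10000^{0/4}\right) = \sin(1) \approx 0.8415 \\
PE(1, 1) &= \cos\left(1 / 10000^{0/4}\right) = \cos(1) \approx 0.5403 \\
PE(1, 2) &= \sin\left(1 / 10000^{2/4}\right) = \sin(0.01) \approx 0.0100 \\
PE(1, 3) &= \cos\left(1 / 10000^{2/4}\right) = \cos(0.01) \approx 0.99995
\end{aligned}
```

The low-index components change quickly with position while the high-index components change slowly, so each position in the sequence receives a distinct vector.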
Further, inputting the final feature set into the Transformer model for calculation to obtain a prediction label further comprises:
performing multiple linear mappings on the final coding matrix to obtain subsequence encodings in different subspaces;
performing self-attention calculation on the subsequence encoding in each subspace to obtain the subsequence encodings $A_i$ weighted by the dependency weights;
concatenating the $A_i$ and applying a linear transformation to obtain a feature matrix, expressed as:
$\alpha = \mathrm{concat}(A_1, \ldots, A_{i-1}, A_i, A_{i+1}, \ldots, A_h) W$
where $\alpha \in \mathbb{R}^{l \times l}$ is the feature matrix, concat is the concatenation function, $h$ is the number of subsequence encodings, and $W \in \mathbb{R}^{hd \times 1}$ is a parameter matrix.
Further, inputting the final feature set into the Transformer model for calculation to obtain a prediction label further comprises:
down-sampling the feature matrix in a global average pooling layer;
inputting the down-sampled feature matrix into a fully connected layer for dimensionality reduction;
inputting the dimension-reduced feature matrix into a Softmax layer for classification detection to obtain the prediction label;
wherein the prediction label is one of: non-DoH, benign DoH, and malicious DoH.
In a second aspect, the present invention provides a system for detecting and identifying malware abuse of DoH, comprising:
an acquisition module, configured to acquire pcap traffic packets from a network;
a cluster creation module, configured to extract time-series features from the pcap traffic packets and then create packet clusters;
a cluster sequence generation module, configured to generate a cluster sequence based on all the packet clusters;
a feature set extraction module, configured to extract a final feature set from the cluster sequence;
a prediction label output module, configured to input the final feature set into a Transformer model for calculation to obtain a prediction label;
a judging module, configured to identify DoH traffic abused by malware based on the prediction label type.
In a third aspect, a device for detecting and identifying malware abuse of DoH comprises a processor and a storage medium;
the storage medium is used for storing instructions;
the processor is configured to operate in accordance with the instructions to perform the steps of any of the methods described above.
In a fourth aspect, a computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps of any of the methods described above.
Compared with the prior art, the invention has the following beneficial effects:
The method captures pcap traffic packets in real time to detect whether DoH traffic in the network environment is malicious DoH abused by malware. Time-series features are extracted from the DoH traffic packets, and a Transformer self-attention mechanism is then applied, which models the global dependencies between input and output relying entirely on attention. The multi-head attention mechanism mines the more relevant temporal features in the sequence and reduces the amount of overall analysis required, which improves the model's accuracy in detecting malware DoH traffic and improves its classification performance.
Drawings
Fig. 1 is a schematic diagram illustrating the process of detecting and identifying malware abuse of DoH according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
Example one:
A method for detecting and identifying malware abuse of DoH, based on time-series features and a self-attention mechanism. FIG. 1 shows the specific flow of the detection method, which comprises the following steps:
Step one: capture pcap traffic packets in the network, extract the time-series features of the data packets, and create a cluster sequence of packets, stored in JSON format, to reduce the dimensionality of the data. Each generated packet cluster has five parameters that characterize the cluster: the size of the cluster (the sum of its data packets in bytes), the number of data packets in the cluster, the direction of all packets in the cluster (incoming or outgoing), the duration of the cluster (the time difference between its first and last packet), and the inter-arrival time (the time difference between the current and previous cluster). The details are as follows:
1) A packet cluster is a sequence of one or more consecutive packets in the same direction (having the same source and destination) within a network flow. The basic idea behind creating packet clusters is to combine these packets so that the application traffic spread across several packets by TLS fragmentation and IP fragmentation can be recovered. A cluster timeout threshold t is also applied, so that two packets separated by a large time difference do not appear in the same cluster.
2) Traffic shape parameters such as packet size, packet direction, and the time difference between packets are used to infer information about the underlying traffic. Each generated packet cluster is extracted and represented as a five-tuple of features:
$C = \{\mathit{size}, \mathit{pktCount}, \mathit{direction}, \mathit{duration}, \mathit{interarrivalTime}\}$
where $C$ is the packet cluster, size is the cluster size (the sum of its packets in bytes), pktCount is the number of packets in the cluster, direction is the direction (incoming or outgoing) of all packets in the cluster, duration is the cluster duration (the time difference between its first and last packet), and interarrivalTime is the inter-arrival time (the time difference between the current and previous cluster).
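The clustering rule described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the packet tuple layout, the `Cluster` and `build_clusters` names, the 0.03 s default timeout, and the choice to measure the inter-arrival time from the end of the previous cluster to the start of the current one are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical packet layout: (timestamp in seconds, size in bytes, direction),
# where direction is +1 for outgoing and -1 for incoming.
Packet = Tuple[float, int, int]


@dataclass
class Cluster:
    size: int                 # sum of packet sizes in bytes
    pkt_count: int            # number of packets in the cluster
    direction: int            # shared direction of all packets
    duration: float           # last packet time minus first packet time
    interarrival_time: float  # gap to the previous cluster (0.0 for the first)
    start: float = 0.0        # bookkeeping only, not one of the five features
    end: float = 0.0          # bookkeeping only, not one of the five features


def build_clusters(packets: List[Packet], timeout: float = 0.03) -> List[Cluster]:
    """Group consecutive same-direction packets whose gap is at most `timeout`."""
    clusters: List[Cluster] = []
    current: Optional[Cluster] = None
    for ts, length, direction in sorted(packets):
        same_cluster = (
            current is not None
            and direction == current.direction
            and ts - current.end <= timeout
        )
        if same_cluster:
            current.size += length
            current.pkt_count += 1
            current.end = ts
            current.duration = ts - current.start
        else:
            # Assumed definition: gap between previous cluster end and this start.
            gap = ts - current.end if current is not None else 0.0
            current = Cluster(length, 1, direction, 0.0, gap, ts, ts)
            clusters.append(current)
    return clusters


packets = [(0.00, 100, 1), (0.01, 200, 1), (0.02, 50, -1), (0.50, 60, -1)]
clusters = build_clusters(packets)
```

In this toy flow the first two outgoing packets merge into one cluster, the direction change starts a second cluster, and the long pause before the last packet starts a third.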
Step two: generate the cluster sequence of packets stored in JSON format. The size of the sequence depends on the network traffic within the flow, and a custom sliding window is used to generate the cluster sequences, so that one flow consists of several cluster sequences. The details are as follows:
Through the cluster generation process, any network flow can be represented as a cluster sequence $S$:
$S = (C_1, \ldots, C_n)$
where $C_n$ is the $n$-th cluster in the cluster sequence and the sequence size $n$ depends on the network traffic within the flow. A sliding window of length $l$ is used to generate the cluster sequences, and cluster sequences shorter than $l$ are padded with empty clusters. With $l$ as the hyper-parameter giving the number of clusters per sequence, the final feature set $F_l$ extracted from the cluster sequence $S$ is expressed as:
$F_l = \{(C_i, \ldots, C_{i+l}) \mid 1 \le i < n - l\}$
where $F_l$ is the final feature set, $S$ is the cluster sequence, $C_i$ is the $i$-th cluster in the cluster sequence, $i$ is the cluster index, $C_n$ is the $n$-th cluster in the cluster sequence, $n$ is the number of clusters in the cluster sequence, and $l$ is the length of each sequence in the final feature set. The value of $l$ must be tuned to find its optimal value and thereby achieve the best detection performance. Finding the best value of $l$ is a trade-off between detection accuracy and response time.
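The sliding-window step can be sketched as below. The sketch assumes that each window holds exactly l clusters, that the window advances one cluster at a time, and that an empty cluster is an all-zero five-tuple; the text does not fix these details, and the index bounds of the formula may be read slightly differently. The names `sliding_windows` and `EMPTY_CLUSTER` are hypothetical.

```python
from typing import List, Sequence, Tuple

ClusterVec = Tuple[float, float, float, float, float]
EMPTY_CLUSTER: ClusterVec = (0.0, 0.0, 0.0, 0.0, 0.0)  # assumed padding value


def sliding_windows(
    seq: Sequence[ClusterVec], l: int, stride: int = 1
) -> List[List[ClusterVec]]:
    """Split a cluster sequence into windows of exactly `l` clusters."""
    if l <= 0 or stride <= 0:
        raise ValueError("l and stride must be positive")
    if len(seq) < l:
        # Short flows yield a single window padded with empty clusters.
        return [list(seq) + [EMPTY_CLUSTER] * (l - len(seq))]
    return [list(seq[i:i + l]) for i in range(0, len(seq) - l + 1, stride)]


# Five toy clusters whose features are all equal to their index.
sequence = [(float(i),) * 5 for i in range(5)]
windows = sliding_windows(sequence, l=3)
padded = sliding_windows(sequence[:2], l=3)
```

A larger l gives the model more context per sample but means more clusters must be observed before a decision can be made, which is the accuracy versus response time trade-off mentioned above.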
Step three: build the Transformer model. The Transformer adopts an encoder-decoder structure, and both substructures model the extracted time-series features mainly through a multi-head attention mechanism. The input to the model needs to pass through both substructures simultaneously. The encoder models the temporal relationships between clusters in the source sequence, and the decoder generates new information from the information vectors extracted by the encoder. Both the encoder and the decoder use a multi-head attention mechanism: a position embedding layer represents the temporal information within the sequence, and a multi-head self-attention layer extracts information about the clusters in the sequence. The details are as follows:
1) The input layer, which is the input to the model, accepts a matrix of shape (l, 5) through the encoder and decoder, where 5 is the number of parameters contained in each cluster. This yields the vectorized representation of the sequence:
$Q = \{E_1, \ldots, E_{i-1}, E_i, E_{i+1}, \ldots, E_l\}$
where $Q$ is the sequence matrix, $E_i$ is the vectorized representation of the $i$-th cluster, and $l$ is the length of each sequence in the final feature set;
2) In both substructures, a position encoding operation is applied to the input matrix. The model here does not use a structure such as a recurrent neural network, so it cannot capture sequence order directly. However, order information is very important and represents the global structure, so the relative or absolute position of the clusters in the sequence must be used. The position information is calculated as:
$PE(pos, 2j) = \sin\left(pos / 10000^{2j/d}\right)$
$PE(pos, 2j+1) = \cos\left(pos / 10000^{2j/d}\right)$
where $PE$ is the calculated position vector, $pos$ is the position index of the cluster in the sequence, $j \in (0, d)$ is the index of each value in the cluster vector $C_i$, even positions $2j$ are encoded with sine, odd positions $2j+1$ are encoded with cosine, and $d$ is the embedding dimension of the cluster.
3) The sequence matrix $Q$ has the same dimensions as the position vector matrix $PE$, and the two matrices are added to obtain the final coding matrix.
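The position encoding and its addition to the sequence matrix can be sketched in NumPy as follows. The window length of 8, the embedding dimension of 16, and the random linear projection `W_embed` from the 5 cluster features to d dimensions are assumptions for illustration; the text does not say how the cluster vectors are embedded.

```python
import numpy as np


def positional_encoding(length: int, d: int) -> np.ndarray:
    """Sinusoidal position matrix of shape (length, d)."""
    pos = np.arange(length)[:, None]          # (length, 1)
    even = np.arange(0, d, 2)[None, :]        # the even indices 2j: 0, 2, 4, ...
    angles = pos / np.power(10000.0, even / d)
    pe = np.zeros((length, d))
    pe[:, 0::2] = np.sin(angles)                  # even positions use sine
    pe[:, 1::2] = np.cos(angles[:, : d // 2])     # odd positions use cosine
    return pe


rng = np.random.default_rng(0)
window = rng.random((8, 5))           # one window: l = 8 clusters, 5 features each
W_embed = rng.normal(size=(5, 16))    # hypothetical embedding to d = 16
Q = window @ W_embed                  # sequence matrix, shape (8, 16)
encoded = Q + positional_encoding(8, 16)   # final coding matrix
```

Because `Q` and the position matrix share the shape (l, d), the addition is element-wise and needs no broadcasting.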
4) In the multi-head attention calculation, each head is one linear mapping. The final coding matrix is linearly mapped multiple times to obtain subsequence encodings in different subspaces. Self-attention is computed on the subsequence encoding in each subspace, giving the subsequence encoding $A_i$ weighted by the dependency weights. The extracted $A_i$ are concatenated and linearly transformed to obtain the feature matrix $\alpha$:
$\alpha = \mathrm{concat}(A_1, \ldots, A_{i-1}, A_i, A_{i+1}, \ldots, A_h) W$
where $\alpha \in \mathbb{R}^{l \times l}$ is the feature matrix, concat is the concatenation function, $h$ is the number of subsequence encodings, and $W \in \mathbb{R}^{hd \times 1}$ is a parameter matrix.
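A minimal NumPy sketch of multi-head self-attention is given below. It follows the standard scaled dot-product formulation, which is an assumption: the text does not state the scaling factor or the per-head projection shapes, and the output projection here maps back to d columns rather than the single column stated for W above. All weights are random placeholders.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(X, Wq, Wk, Wv, Wo):
    """X: (l, d). Wq, Wk, Wv: (h, d, d_k). Wo: (h * d_k, d_out)."""
    heads = []
    for wq, wk, wv in zip(Wq, Wk, Wv):
        q, k, v = X @ wq, X @ wk, X @ wv                    # one linear mapping per head
        weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))   # (l, l) dependency weights
        heads.append(weights @ v)                           # A_i, shape (l, d_k)
    return np.concatenate(heads, axis=-1) @ Wo              # concat(A_1, ..., A_h) W


rng = np.random.default_rng(0)
l, d, h, d_k = 8, 16, 4, 4
X = rng.normal(size=(l, d))                     # stands in for the final coding matrix
Wq, Wk, Wv = (rng.normal(size=(h, d, d_k)) for _ in range(3))
Wo = rng.normal(size=(h * d_k, d))
alpha = multi_head_self_attention(X, Wq, Wk, Wv, Wo)
```

Each row of `weights` sums to 1, so every cluster's new representation is a weighted average of all clusters in the window, with the weights expressing how strongly the clusters depend on one another.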
Step four: the second layer of the detection model is a global average pooling layer. After the feature matrix $\alpha$ is obtained, a window is slid over the feature map (similar to the sliding window of a convolution) and the average value within the window is taken as the result, so that a tensor $\alpha$ of shape $W \times H \times D$ becomes a tensor $g$ of shape $1 \times 1 \times D$. Down-sampling the feature matrix in the global average pooling layer reduces overfitting. Here $\alpha$ is the original feature map and $D$ is the number of sequence files; the number of feature maps equals the number of sequence files, and the average of each feature map is calculated as:
$g_i = \mathrm{avg}(\alpha_i)$
where $g_i$ is the result of averaging the $i$-th feature map.
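Global average pooling reduces to a mean over every axis except the last. The sketch below assumes the feature maps are stacked along the last axis, which is a layout chosen for the example; the function name is hypothetical.

```python
import numpy as np


def global_average_pool(feature_maps: np.ndarray) -> np.ndarray:
    """Average every axis except the last: (..., D) -> (D,)."""
    spatial_axes = tuple(range(feature_maps.ndim - 1))
    return feature_maps.mean(axis=spatial_axes)


# Toy input with W = 2, H = 3 and D = 4 feature maps.
alpha = np.arange(24, dtype=float).reshape(2, 3, 4)
g = global_average_pool(alpha)   # g == [10., 11., 12., 13.]
```

The pooling layer has no trainable parameters, which is one reason it helps limit overfitting compared with flattening the feature matrix into a large dense layer.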
Step five: $g_i$ is input into a fully connected layer for dimensionality reduction and then into a Softmax layer for classification detection, and the prediction label (whether malicious DoH is present) is obtained by classifying the cluster sequence.
The final output dimension of the fully connected (Dense) layer is 3, representing three categories: non-DoH, benign DoH, and malicious DoH. The class with the highest probability is taken as the classification result, so that malicious DoH can be detected.
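The final classification step can be sketched as a dense layer with three outputs followed by Softmax and an argmax. The weights below are hand-picked placeholders rather than trained values, and the order of the labels in `LABELS` is an assumption.

```python
import numpy as np

LABELS = ("non-DoH", "benign DoH", "malicious DoH")  # assumed class order


def classify(g: np.ndarray, W: np.ndarray, b: np.ndarray):
    """Dense layer with 3 outputs followed by Softmax; returns (label, probabilities)."""
    logits = g @ W + b
    exp = np.exp(logits - logits.max())   # subtract max for numerical stability
    probs = exp / exp.sum()
    return LABELS[int(np.argmax(probs))], probs


# Hypothetical weights chosen so that this input favours the third class.
g = np.array([1.0, 2.0])
W = np.array([[0.0, 0.0, 1.0],
              [0.0, 0.5, 1.0]])
b = np.zeros(3)
label, probs = classify(g, W, b)
is_malicious = label == "malicious DoH"
```

Here the logits are 0, 1 and 3, so the third class receives the highest probability and the sample is flagged as malicious DoH.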
Example two:
A system for detecting and identifying malware abuse of DoH, comprising:
an acquisition module, configured to acquire pcap traffic packets from a network;
a cluster creation module, configured to extract time-series features from the pcap traffic packets and then create packet clusters;
a cluster sequence generation module, configured to generate a cluster sequence based on all the packet clusters;
a feature set extraction module, configured to extract a final feature set from the cluster sequence;
a prediction label output module, configured to input the final feature set into a Transformer model for calculation to obtain a prediction label;
a judging module, configured to identify DoH traffic abused by malware based on the prediction label type.
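One way to wire the six modules together is sketched below, with each module passed in as a plain callable. The class name `DoHDetector`, the field names, and the rule that a flow is flagged when any window is labelled malicious are assumptions for illustration; the text does not specify how per-window labels are aggregated. The stub lambdas stand in for real packet capture and a trained model.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class DoHDetector:
    acquire: Callable[[str], List[Any]]                 # acquisition module
    make_clusters: Callable[[List[Any]], List[Any]]     # cluster creation module
    make_sequence: Callable[[List[Any]], List[Any]]     # cluster sequence generation module
    extract_features: Callable[[List[Any]], List[Any]]  # feature set extraction module
    predict: Callable[[Any], str]                       # prediction label output module

    def judge(self, source: str) -> bool:
        """Judging module: True if any window is labelled malicious DoH (assumed rule)."""
        packets = self.acquire(source)
        sequence = self.make_sequence(self.make_clusters(packets))
        labels = [self.predict(window) for window in self.extract_features(sequence)]
        return "malicious DoH" in labels


# Stub modules so the wiring can be exercised without a capture file or a model.
detector = DoHDetector(
    acquire=lambda path: [1, 2, 3, 4],
    make_clusters=lambda pkts: pkts,
    make_sequence=lambda clusters: clusters,
    extract_features=lambda seq: [seq[i:i + 2] for i in range(len(seq) - 1)],
    predict=lambda window: "malicious DoH" if sum(window) > 6 else "benign DoH",
)
flagged = detector.judge("capture.pcap")
```

Keeping the modules as injected callables makes it straightforward to swap the stubbed predictor for a trained Transformer classifier without touching the judging logic.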
Example three:
An embodiment of the invention also provides a device for detecting and identifying malware abuse of DoH, comprising a processor and a storage medium;
the storage medium is used for storing instructions;
the processor is configured to operate in accordance with the instructions to perform the steps of the following method:
acquiring pcap traffic packets from a network;
extracting time-series features from the pcap traffic packets and then creating packet clusters;
generating a cluster sequence based on all the packet clusters;
extracting a final feature set from the cluster sequence;
inputting the final feature set into a Transformer model for calculation to obtain a prediction label;
and identifying DoH traffic abused by malware based on the prediction label type.
Example four:
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the following method steps:
acquiring pcap traffic packets from a network;
extracting time-series features from the pcap traffic packets and then creating packet clusters;
generating a cluster sequence based on all the packet clusters;
extracting a final feature set from the cluster sequence;
inputting the final feature set into a Transformer model for calculation to obtain a prediction label;
and identifying DoH traffic abused by malware based on the prediction label type.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.
Claims (10)
1. A method for detecting and identifying malware abuse of DoH, characterized by comprising the following steps:
acquiring pcap traffic packets from a network;
extracting time-series features from the pcap traffic packets and then creating packet clusters;
generating a cluster sequence based on all the packet clusters;
extracting a final feature set from the cluster sequence;
inputting the final feature set into a Transformer model for calculation to obtain a prediction label;
and identifying DoH traffic abused by malware based on the prediction label type.
2. The method for detecting and identifying malware abuse of DoH according to claim 1, wherein
each packet cluster is:
$C = \{\mathit{size}, \mathit{pktCount}, \mathit{direction}, \mathit{duration}, \mathit{interarrivalTime}\}$
where $C$ is the packet cluster, size is the size of the cluster, pktCount is the number of packets in the cluster, direction is the direction of all packets in the cluster, duration is the duration of the cluster, and interarrivalTime is the inter-arrival time.
3. The method for detecting and identifying malware abuse of DoH according to claim 2, wherein
the final feature set is:
$F_l = \{(C_i, \ldots, C_{i+l}) \mid 1 \le i < n - l\}$
$S = (C_1, \ldots, C_n)$
where $F_l$ is the final feature set, $S$ is the cluster sequence, $C_i$ is the $i$-th cluster in the cluster sequence, $i$ is the cluster index, $C_n$ is the $n$-th cluster in the cluster sequence, $n$ is the number of clusters in the cluster sequence, and $l$ is the length of each sequence in the final feature set.
4. The method for detecting and identifying malware abuse of DoH according to claim 1, wherein the Transformer model comprises an encoder and a decoder, the encoder extracts a sequence matrix from the time-series features of the final feature set, and the decoder generates a position vector matrix from the extracted sequence matrix.
5. The method for detecting and identifying malware abuse of DoH according to claim 1, wherein
inputting the final feature set into the Transformer model for calculation to obtain a prediction label comprises:
inputting the final feature set into the Transformer model to obtain a sequence matrix, expressed as:
$Q = \{E_1, \ldots, E_{i-1}, E_i, E_{i+1}, \ldots, E_l\}$
where $Q$ is the sequence matrix, $E_i$ is the vectorized representation of the $i$-th cluster, and $l$ is the length of each sequence in the final feature set;
obtaining the position information of the clusters based on the sequence matrix $Q$, expressed as:
$PE(pos, 2j) = \sin\left(pos / 10000^{2j/d}\right)$
$PE(pos, 2j+1) = \cos\left(pos / 10000^{2j/d}\right)$
where $PE$ is the calculated position vector, $pos$ is the position index of the cluster in the sequence, $j \in (0, d)$ is the index of each value in the cluster vector, $2j$ denotes an even position, $2j+1$ denotes an odd position, and $d$ is the embedding dimension of the cluster;
and adding the sequence matrix $Q$ and the position vector matrix $PE$ to obtain a final coding matrix.
6. The method for detecting and identifying malware abuse of DoH according to claim 5, wherein
inputting the final feature set into the Transformer model for calculation to obtain a prediction label further comprises:
performing multiple linear mappings on the final coding matrix to obtain subsequence encodings in different subspaces;
performing self-attention calculation on the subsequence encoding in each subspace to obtain the subsequence encodings $A_i$ weighted by the dependency weights;
concatenating the $A_i$ and applying a linear transformation to obtain a feature matrix, expressed as:
$\alpha = \mathrm{concat}(A_1, \ldots, A_{i-1}, A_i, A_{i+1}, \ldots, A_h) W$
where $\alpha \in \mathbb{R}^{l \times l}$ is the feature matrix, concat is the concatenation function, $h$ is the number of subsequence encodings, and $W \in \mathbb{R}^{hd \times 1}$ is a parameter matrix.
7. The method for detecting and identifying malware abuse of DoH according to claim 6, wherein
inputting the final feature set into the Transformer model for calculation to obtain a prediction label further comprises:
down-sampling the feature matrix in a global average pooling layer;
inputting the down-sampled feature matrix into a fully connected layer for dimensionality reduction;
inputting the dimension-reduced feature matrix into a Softmax layer for classification detection to obtain the prediction label;
wherein the prediction label is one of: non-DoH, benign DoH, and malicious DoH.
8. A system for detecting and identifying malware abuse of DoH, comprising:
an acquisition module, configured to obtain pcap traffic packets from a network;
a cluster creation module, configured to extract time-series features from the pcap traffic packets and then create clusters of packets;
a cluster sequence generation module, configured to generate a cluster sequence based on the clusters of all packets;
a feature set extraction module, configured to extract a final feature set from the cluster sequence;
a prediction label output module, configured to input the final feature set into a Transformer model for calculation to obtain a prediction label;
a judging module, configured to determine whether traffic is DoH abused by malware based on the type of the prediction label.
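The module chain of claim 8 can be sketched as a simple function pipeline. Every stage below is a hypothetical stand-in: the real modules parse pcap files, cluster packets by timing, and run a Transformer, none of which is reproduced here. Only the order of the stages and the final judgment on the label follow the claim.

```python
def run_detection(source, acquire, create_clusters, build_sequence,
                  extract_features, predict_label):
    """Run the six modules in order and judge the result from the label."""
    packets = acquire(source)                 # acquisition module
    clusters = create_clusters(packets)       # cluster creation module
    sequence = build_sequence(clusters)       # cluster sequence generation module
    features = extract_features(sequence)     # feature set extraction module
    label = predict_label(features)           # prediction label output module
    # Judging module: only the "malicious DoH" label marks malware abuse.
    return {"label": label, "malware_abuse": label == "malicious DoH"}

# Placeholder stages with made-up behaviour, for illustration only.
result = run_detection(
    "capture.pcap",
    acquire=lambda path: [0.00, 0.01, 0.50],            # stand-in packet timestamps
    create_clusters=lambda pkts: [pkts[:2], pkts[2:]],  # fixed split, not real clustering
    build_sequence=lambda clusters: [len(c) for c in clusters],
    extract_features=lambda seq: seq,
    predict_label=lambda feats: "malicious DoH" if sum(feats) > 2 else "benign DoH",
)
```

Passing the stages in as callables keeps each module replaceable, which mirrors the way the claim lists them as separate units.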
9. A device for detecting and identifying malware abuse of DoH, comprising a processor and a storage medium;
the storage medium is used for storing instructions;
the processor is configured to operate in accordance with the instructions to perform the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111245911.7A CN113923042B (en) | 2021-10-26 | 2021-10-26 | Detection and identification system and method for malicious software abuse (DoH) |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113923042A (en) | 2022-01-11 |
CN113923042B (en) | 2023-09-15 |
Family
ID=79243014
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111245911.7A Active CN113923042B (en) | 2021-10-26 | 2021-10-26 | Detection and identification system and method for malicious software abuse (DoH) |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113923042B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170134404A1 (en) * | 2015-11-06 | 2017-05-11 | Cisco Technology, Inc. | Hierarchical feature extraction for malware classification in network traffic |
US9705904B1 (en) * | 2016-07-21 | 2017-07-11 | Cylance Inc. | Neural attention mechanisms for malware analysis |
CN108667816A (en) * | 2018-04-19 | 2018-10-16 | 重庆邮电大学 | A kind of the detection localization method and system of Network Abnormal |
CN110276439A (en) * | 2019-05-08 | 2019-09-24 | 平安科技(深圳)有限公司 | Time Series Forecasting Methods, device and storage medium based on attention mechanism |
CN111669385A (en) * | 2020-05-29 | 2020-09-15 | 重庆理工大学 | Malicious traffic monitoring system fusing deep neural network and hierarchical attention mechanism |
CN113316163A (en) * | 2021-06-18 | 2021-08-27 | 东南大学 | Long-term network traffic prediction method based on deep learning |
CN113472809A (en) * | 2021-07-19 | 2021-10-01 | 华中科技大学 | Encrypted malicious traffic detection method and system and computer equipment |
Non-Patent Citations (1)
Title |
---|
Chen Wei; Hu Lei; Yang Long: "Fast identification method for encrypted traffic based on payload features", Computer Engineering, no. 12 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114513341A (en) * | 2022-01-21 | 2022-05-17 | 上海斗象信息科技有限公司 | Malicious traffic detection method, device, terminal and computer readable storage medium |
CN114513341B (en) * | 2022-01-21 | 2023-09-12 | 上海斗象信息科技有限公司 | Malicious traffic detection method, malicious traffic detection device, terminal and computer readable storage medium |
CN114900360A (en) * | 2022-05-12 | 2022-08-12 | 国家计算机网络与信息安全管理中心山西分中心 | Method for detecting DoH flow in HTTPS flow |
CN114900360B (en) * | 2022-05-12 | 2023-09-22 | 国家计算机网络与信息安全管理中心山西分中心 | Method for detecting DoH flow in HTTPS flow |
Also Published As
Publication number | Publication date |
---|---|
CN113923042B (en) | 2023-09-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Abdullahi et al. | Fractal coding-based robust and alignment-free fingerprint image hashing | |
CN113923042B (en) | Detection and identification system and method for malicious software abuse (DoH) | |
Guo et al. | Blind image watermarking method based on linear canonical wavelet transform and QR decomposition | |
Sun et al. | Secure and robust image hashing via compressive sensing | |
Sharma et al. | An enhanced Huffman-PSO based image optimization algorithm for image steganography | |
Hosny et al. | Robust image hashing using exact Gaussian–Hermite moments | |
Hamghalam et al. | Geometric modelling of the wavelet coefficients for image watermarking using optimum detector | |
Ma et al. | Secure multimodal biometric authentication with wavelet quantization based fingerprint watermarking | |
Rani et al. | A robust watermarking scheme exploiting balanced neural tree for rightful ownership protection | |
Pérez et al. | Universal steganography detector based on an artificial immune system for JPEG images | |
Xue et al. | A multi-layer steganographic method based on audio time domain segmented and network steganography | |
Farhat et al. | Towards blind detection of low‐rate spatial embedding in image steganalysis | |
Chen et al. | Using adversarial examples to bypass deep learning based url detection system | |
Xu et al. | A deep learning framework supporting model ownership protection and traitor tracing | |
CN115622793A (en) | Attack type identification method and device, electronic equipment and storage medium | |
CN111159588B (en) | Malicious URL detection method based on URL imaging technology | |
CN114362988A (en) | Network traffic identification method and device | |
CN115134095A (en) | Botnet control terminal detection method and device, storage medium and electronic equipment | |
CN112613055A (en) | Image processing system and method based on distributed cloud server and digital-image conversion | |
Dutta et al. | A secure algorithm for biometric-based digital image watermarking in DCT domain | |
Kamal et al. | Review of Different Steganographic techniques on Medical images regarding their efficiency | |
CN114124563B (en) | Abnormal flow detection method and device, electronic equipment and storage medium | |
Michaylov | Exploring the Use of Steganography and Steganalysis in Forensic Investigations for Analysing Digital Evidence | |
Fan et al. | A fingerprint-based audio authentication scheme using frequency domain statistical characteristic | |
Fan et al. | Audio and video matching zero-watermarking algorithm based on NSCT |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||