CN114567487A

CN114567487A - DNS hidden tunnel detection method with multi-feature fusion

Info

Publication number: CN114567487A
Application number: CN202210198998.5A
Authority: CN
Inventors: 林飞; 李鼎; 易永波; 古元; 毛华阳; 华仲峰
Original assignee: Beijing Act Technology Development Co ltd
Current assignee: Beijing Act Technology Development Co ltd
Priority date: 2022-03-03
Filing date: 2022-03-03
Publication date: 2022-05-31
Anticipated expiration: 2042-03-03
Also published as: CN114567487B

Abstract

A multi-feature-fusion DNS hidden tunnel detection method, relating to the field of information technology, comprises the following steps: 1) a black sample collector acquires DNS hidden tunnel traffic packets through a self-built DNS hidden tunnel; 2) a black sample standardization module preprocesses the DNS hidden tunnel traffic packet data and extracts its features; 3) a white sample standardization module acquires normal DNS request samples; 4) a neural network model module is constructed; 5) a fast pre-screening module is constructed from the white samples. The fast pre-screening module makes a simple, efficient distinction between normal request domain names and tunnel request domain names, quickly eliminating the normal request domain names that make up most traffic in practice. For deep learning detection, the method combines general rule features with deep domain name text features, improving detection accuracy and reducing detection difficulty.

Description

Multi-feature fusion DNS hidden tunnel detection method

Technical Field

The invention relates to the technical field of information.

Background

With the continuous development of the internet, DNS has become an indispensable service, and general firewalls therefore rarely inspect or filter DNS traffic. Attackers can exploit this by using DNS as a hidden channel for remote control, file transfer and other operations, which poses a serious threat to network security. Detecting and identifying DNS hidden tunnels effectively reduces user losses and helps keep the network environment healthy and secure.

Existing patents already address DNS hidden tunnel detection. For example, patent [CN111786993A] manually designs DNS request features, such as the request record type, the length of a single domain name label and various character ratios, and then sets multiple thresholds to decide whether a DNS tunnel exists. That method designs rich features, but its decision depends entirely on preset rules, so it involves too much manual intervention and is prone to misjudgment. Patent [CN110149418A] detects tunnels with deep learning; a deep neural network can often learn hidden features that cannot be designed by hand, and its detection performance is good. However, that method has three shortcomings. First, it does not use features such as the request record type, which are closely related to the detection result. Second, it does not consider the high computational complexity of the deep learning model. Finally, it does not use data augmentation to enlarge the training data, although domain name data augmentation can reduce the cost of manually labeling data.

The present method extracts the domain name and the related request information of each request, uses the fused features as input, and uses a deep learning model as the detection model to judge whether the DNS traffic contains a hidden tunnel.

Using known techniques

N-Gram is an algorithm based on a statistical language model. The basic idea is to perform a sliding window operation with the size of N on the content in the text according to bytes, and form a byte fragment sequence with the length of N. Each byte segment is called as a gram, the occurrence frequency of all the grams is counted, and filtering is performed according to a preset threshold value to form a key gram list, namely a vector feature space of the text, wherein each gram in the list is a feature vector dimension. The model is based on the assumption that the nth word occurs only in relation to the first N-1 words, but not in relation to any other word, and that the probability of a complete sentence is the product of the probabilities of occurrence of the words. These probabilities can be obtained by counting the number of times that N words occur simultaneously directly from the corpus. Binary Bi-grams and ternary Tri-grams are commonly used.
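The sliding-window idea described above can be illustrated with a minimal Python sketch. The function names `char_ngrams` and `key_gram_list` and the `min_count` parameter are hypothetical, and the sketch slides over characters rather than raw bytes, which is equivalent only for ASCII text such as domain names.

```python
from collections import Counter


def char_ngrams(text, n):
    """Slide a window of size n over text and return every fragment of length n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def key_gram_list(texts, n, min_count):
    """Count all grams across texts and keep those seen at least min_count times.

    The returned list plays the role of the vector feature space:
    each retained gram is one feature dimension.
    """
    counts = Counter(gram for text in texts for gram in char_ngrams(text, n))
    return sorted(gram for gram, count in counts.items() if count >= min_count)


# Bi-grams of a short label, and a key gram list over a toy corpus.
print(char_ngrams("mail", 2))
print(key_gram_list(["abab", "ab"], 2, min_count=2))
```

A text shorter than N simply yields no grams, so very short labels contribute nothing to the counts.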

A DNS tunnel is a type of covert channel, referred to herein as a DNS hidden tunnel, that establishes communication by encapsulating other protocols inside the DNS protocol. Because DNS is an essential network service, most firewalls and intrusion detection devices rarely filter DNS traffic, which makes DNS usable as a covert channel for operations such as remote control and file transfer. A growing body of research shows that DNS hidden tunnels also often play an important role in botnet and APT attacks.

Since DNS hidden tunneling was first proposed, it has been implemented by many tools. NSTX and OzymanDNS are among the earliest; iodine and dnscat2 are currently the most active; Denise, dns2tcp and Heyoka are also available. The core principles of these tools are similar, but they differ in encoding, implementation details and target application scenarios.

PCAP is a packet capture library that many programs use to capture data packets; Wireshark, for example, captures packets with the PCAP library. Packets captured by PCAP are not stored as the raw network byte stream but are reassembled into a new data format.

tcpdump captures and filters the packets on a network interface from the command line, and much of its power lies in its flexible filter expressions. Run without any options, tcpdump captures on the first network interface by default and stops capturing only when the tcpdump process is terminated.

Dropout means that, during deep learning training, each neural network unit is temporarily removed from the network with a certain probability. Note that the removal is temporary: with stochastic gradient descent, each mini-batch effectively trains a different network because the units are dropped at random.
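As a rough illustration of the random removal described above, the following Python sketch applies dropout to a list of activations. The function name `dropout` and its parameters are hypothetical, and the rescaling of surviving units by 1 / (1 - drop_prob) ("inverted dropout") is a common convention assumed here, not something the text specifies.

```python
import random


def dropout(activations, drop_prob, rng, training=True):
    """Zero each activation with probability drop_prob during training.

    Surviving activations are divided by the keep probability so the
    expected sum is unchanged (assumed convention, not from the text).
    At inference time the activations pass through untouched.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


# Each call draws a fresh random mask, so every mini-batch sees a different network.
rng = random.Random(0)
layer_output = [1.0] * 8
print(dropout(layer_output, 0.5, rng))
print(dropout(layer_output, 0.5, rng))
```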

Disclosure of Invention

In view of the defects of the prior art, the method for detecting the DNS hidden tunnel with the multi-feature fusion provided by the invention comprises the following steps:

1) obtaining DNS hidden tunnel flow packets through self-built DNS hidden tunnels by black sample collector

The black sample collector builds a DNS hidden tunnel using two servers and a DNS hidden tunnel implementation tool. One server acts as the DNS server and runs the server end of the tool; the other server accesses the DNS server and runs the access end of the tool. The DNS server is configured to resolve a specific domain name, and that domain name is set up only in the test environment between the two servers, so the setup neither affects nor is affected by the external network. Data of any content and any size is edited to serve as the transmission sample data. A tcpdump tool is deployed on the DNS server to collect the DNS traffic, which is stored as a PCAP packet and used as the DNS hidden tunnel traffic packet;

2) preprocessing DNS hidden tunnel flow packet data by a black sample standardization module, and extracting DNS hidden tunnel flow packet data characteristics

Key fields are extracted from the PCAP traffic packet with the Wireshark tool; they mainly comprise the source IP, source port, destination IP, destination port, requested domain name and request type. The domain name suffix is removed, and the remaining sub-domain name is split on the '.' character into several character strings, i.e. several sub-domain name fragments. To augment the data, characters in the sub-domain name are randomly replaced according to an expansion rule, which yields multiple groups of sub-domain name fragments. The expansion rule is: a character is replaced only by a character of the same type; the replacement positions and the number of replaced characters are chosen at random; at least 1 character is replaced, and at most half the length of the character string; the sub-domain name after replacement has the same length as the original sub-domain name. The domain name length is extracted. The number of domain name labels is extracted, i.e. the number of fragments obtained by splitting the domain name on '.'. The DNS request record type is extracted. One group consisting of the sub-domain name fragments, the domain name length, the number of domain name labels and the DNS request record type forms a DNS hidden tunnel traffic packet data sample, called a black sample;

3) obtaining normal DNS request sample by white sample standardization module

DNS traffic from daily work is collected and stored as a PCAP packet, and key fields are extracted from the PCAP traffic packet with the Wireshark tool; they mainly comprise the source IP, source port, destination IP, destination port, requested domain name and request type. The domain name suffix is removed, and the remaining sub-domain name is split on the '.' character into several character strings, i.e. several sub-domain name fragments. The domain name length, the number of domain name labels and the DNS request record type are extracted. One group consisting of the sub-domain name fragments, the domain name length, the number of domain name labels and the DNS request record type forms a normal DNS traffic packet data sample, called a white sample;

4) constructing the neural network model module

The domain name characters and features are numbered to build a word list for input to the neural network model. For the domain name length feature, a domain name fragment length of less than 10 is coded as 1, 10-20 as 2, 20-30 as 3, 30-50 as 4, and more than 50 as 5. For the domain name label number feature, fewer than 3 labels is coded as 6, 3 to 5 labels as 7, and 5 or more labels as 8. For the DNS record type feature, a TXT record is coded as 9 and any other record type as 10. For the domain name string feature, characters a through z correspond to codes 11 through 36, characters A through Z to codes 37 through 62, and characters 0-9 to codes 63-72. 70% of all samples are randomly taken as the training set, and the remaining samples are randomly divided in equal parts into a validation set and a test set; all samples comprise both DNS hidden tunnel traffic packet data samples and normal DNS traffic packet data samples. A padding operation is performed before a sample is input: the maximum length of a domain name fragment code is set to 64, input longer than 64 is truncated, and input shorter than 64 is padded with 0 at the tail. The word vector layer converts each numeric code into vector form. In the CNN convolutional neural network layer, the text features of the domain name are learned through one-dimensional convolutions with different kernel sizes; the results of the three convolutional layers are concatenated and max-pooled, and a Dropout layer is added to reduce model complexity and prevent overfitting. Finally, a fully connected classification layer is used with two classes: the DNS hidden tunnel class, called the black sample class, and the normal DNS request class, called the white sample class;
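The numbering, padding and dataset split described above can be sketched in Python as below. Several details are assumptions rather than statements from the text: the overlapping bucket boundaries are resolved as half-open ranges (for example, exactly 5 labels is coded as 8), the lowercase letters are taken to be codes 11-36 and the uppercase letters 37-62, the three feature codes are placed in front of the character codes, and characters outside the word list are skipped. All names are hypothetical.

```python
import random
import string

MAX_LEN = 64   # maximum code length from the text
PAD = 0        # padding value from the text

# Character codes: a-z -> 11..36, A-Z -> 37..62, 0-9 -> 63..72.
CHAR_CODES = {}
for base, alphabet in ((11, string.ascii_lowercase),
                       (37, string.ascii_uppercase),
                       (63, string.digits)):
    for offset, ch in enumerate(alphabet):
        CHAR_CODES[ch] = base + offset


def encode_length(n):
    """Length buckets 1..5 (boundary handling is an assumption)."""
    if n < 10:
        return 1
    if n < 20:
        return 2
    if n < 30:
        return 3
    if n <= 50:
        return 4
    return 5


def encode_label_count(n):
    """Label-count buckets 6..8 (exactly 5 labels assumed to map to 8)."""
    if n < 3:
        return 6
    if n < 5:
        return 7
    return 8


def encode_record_type(record_type):
    """TXT records map to 9, every other record type to 10."""
    return 9 if record_type.upper() == "TXT" else 10


def encode_sample(fragments, domain_length, label_count, record_type):
    """Encode one sample, then truncate or zero-pad the tail to MAX_LEN."""
    codes = [encode_length(domain_length),
             encode_label_count(label_count),
             encode_record_type(record_type)]
    for fragment in fragments:
        codes.extend(CHAR_CODES[ch] for ch in fragment if ch in CHAR_CODES)
    codes = codes[:MAX_LEN]
    return codes + [PAD] * (MAX_LEN - len(codes))


def split_dataset(samples, rng):
    """Random 70% training set; the rest split equally into validation and test."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = len(shuffled) * 7 // 10
    rest = shuffled[n_train:]
    half = len(rest) // 2
    return shuffled[:n_train], rest[:half], rest[half:]


encoded = encode_sample(["aZ9"], domain_length=12, label_count=2, record_type="TXT")
print(encoded[:8])
train, val, test = split_dataset(range(100), random.Random(0))
print(len(train), len(val), len(test))
```

The fixed-length integer sequences produced this way are the kind of input an embedding layer followed by one-dimensional convolutions would consume; the network itself is not sketched here.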

5) construction of a Rapid prescreening Module Using white samples

Before its class is judged, newly acquired DNS traffic is treated as the white sample class by default; its class is changed to the black sample class only when the neural network model module judges it to be black. The fast pre-screening module quickly judges the newly received white samples and eliminates those with a low probability of being black samples, which speeds up classification of newly acquired DNS traffic and reduces the computation required of the neural network model module;

the steps of constructing the rapid pre-screening module comprise:

First, the white samples acquired by the white sample standardization module are taken as training samples; each training sample is recorded as a sub-domain name sequence S, and when its length is m the occurrence probability of the whole sequence is expressed as follows:

Second, according to the Markov assumption, the occurrence of a word depends only on the preceding n-1 words; n takes the value 3, and the conditional probability in the formula is simplified as follows:

Third, using the Bayesian formula, each term is calculated as follows:

where count(…) denotes the number of times the given words occur consecutively together in the sample set; to avoid a zero denominator, smoothing is applied, and the existence probability of each sample is then calculated as follows:

where V is the number of words in the word list; for the whole sample set, the probabilities of all ternary combinations can be calculated and stored as a model for use in subsequent prediction;

Fourth, the existence probability p of each sample sequence in the training samples is calculated. Because sequences differ in length, they contain different numbers of ternary combinations, and the probabilities obtained from the product formula differ widely; a conversion is therefore applied: with t denoting the number of ternary combinations in the sequence, the existence probability of each sample sequence is expressed as:

Fifth, the segmentation threshold is determined: the median of the existence probabilities of all training samples is taken as the threshold; samples whose probability is above the threshold are directly marked as white samples, and white samples whose probability is below the threshold enter the neural network model module for class judgment.
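The formulas for these steps are not reproduced in this text, so the following Python sketch of the pre-screening model rests on explicit assumptions: each "word" is a single character of the sub-domain name; the smoothed trigram probability is taken to be the usual add-one form, (count(w1 w2 w3) + 1) / (count(w1 w2) + V); and the length conversion is taken to be the geometric mean, p = P(S) ** (1 / t). The class `TrigramPrescreen` and its methods are hypothetical names.

```python
import math
from collections import Counter
from statistics import median


class TrigramPrescreen:
    """Trigram model over white samples with a median probability threshold."""

    def __init__(self):
        self.tri = Counter()    # counts of three consecutive characters
        self.bi = Counter()     # counts of the two-character contexts
        self.vocab = set()      # word list; V = len(self.vocab)
        self.threshold = None

    def fit(self, sequences):
        """Count trigrams in the white samples, then set the median threshold."""
        for s in sequences:
            self.vocab.update(s)
            for i in range(len(s) - 2):
                self.tri[s[i:i + 3]] += 1
                self.bi[s[i:i + 2]] += 1
        # Only sequences with at least one trigram contribute to the median.
        self.threshold = median(self.score(s) for s in sequences if len(s) >= 3)
        return self

    def score(self, s):
        """Length-normalized existence probability; call fit() first.

        Sequences shorter than 3 have no trigram and score 0.0, so they are
        always passed on to the neural network (assumed conservative choice).
        """
        t = len(s) - 2
        if t <= 0:
            return 0.0
        v = len(self.vocab)
        log_p = 0.0
        for i in range(t):
            smoothed = (self.tri[s[i:i + 3]] + 1) / (self.bi[s[i:i + 2]] + v)
            log_p += math.log(smoothed)
        return math.exp(log_p / t)   # geometric mean over the t trigrams

    def is_clearly_white(self, s):
        """True if the sample can be marked white without the neural network."""
        return self.score(s) > self.threshold


white = ["www", "mail", "login", "static", "images", "download", "update", "search"]
model = TrigramPrescreen().fit(white)
print(model.threshold)
print(model.score("mail"), model.score("q7zx9kv2"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, which matters for the long sub-domain names typical of tunnel traffic.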

Advantageous effects

The fast pre-screening module distinguishes normal request domain names from tunnel request domain names simply and efficiently, so the normal request domain names that make up most traffic in practice are eliminated quickly and do not reach the subsequent deep learning model, which is complex and slow. For deep learning detection, the method combines general rule features with deep domain name text features. Unlike earlier DNS detection methods, features such as domain name complexity and information entropy need not be designed manually: the deep network model learns the information in the domain name text automatically and performs better than manual design. In addition, network request features that are not part of the domain name text, such as the request record type, are also encoded as model input. For deep model training, the method uses domain name data augmentation, which effectively enlarges the training data, improves model performance, and reduces the labor and material cost of collecting data manually. Finally, the method uses a black-and-white list mechanism so that earlier detection results are reused, improving detection efficiency and effectiveness.

Drawings

FIG. 1 is a flow chart of the present invention.

Detailed Description

Example one

Referring to fig. 1, the implementation steps of the method for detecting a DNS hidden tunnel with multi-feature fusion provided by the present invention include:

s01 obtaining DNS hidden tunnel flow packet by black sample collector through self-built DNS hidden tunnel

The black sample collector builds a DNS hidden tunnel using two servers and a DNS hidden tunnel implementation tool. One server acts as the DNS server and runs the server end of the tool; the other server accesses the DNS server and runs the access end of the tool. The DNS server is configured to resolve a specific domain name, and that domain name is set up only in the test environment between the two servers, so the setup neither affects nor is affected by the external network. Data of any content and any size is edited to serve as the transmission sample data. A tcpdump tool is deployed on the DNS server to collect the DNS traffic, which is stored as a PCAP packet and used as the DNS hidden tunnel traffic packet;

s02 preprocessing DNS hidden tunnel flow packet data by a black sample standardization module and extracting DNS hidden tunnel flow packet data characteristics

Key fields are extracted from the PCAP traffic packet with the Wireshark tool; they mainly comprise the source IP, source port, destination IP, destination port, requested domain name and request type. The domain name suffix is removed, and the remaining sub-domain name is split on the '.' character into several character strings, i.e. several sub-domain name fragments. To augment the data, characters in the sub-domain name are randomly replaced according to an expansion rule, which yields multiple groups of sub-domain name fragments. The expansion rule is: a character is replaced only by a character of the same type; the replacement positions and the number of replaced characters are chosen at random; at least 1 character is replaced, and at most half the length of the character string; the sub-domain name after replacement has the same length as the original sub-domain name. The domain name length is extracted. The number of domain name labels is extracted, i.e. the number of fragments obtained by splitting the domain name on '.'. The DNS request record type is extracted. One group consisting of the sub-domain name fragments, the domain name length, the number of domain name labels and the DNS request record type forms a DNS hidden tunnel traffic packet data sample, called a black sample;

s03 obtaining normal DNS request sample by white sample standardization module

DNS traffic from daily work is collected and stored as a PCAP packet, and key fields are extracted from the PCAP traffic packet with the Wireshark tool; they mainly comprise the source IP, source port, destination IP, destination port, requested domain name and request type. The domain name suffix is removed, and the remaining sub-domain name is split on the '.' character into several character strings, i.e. several sub-domain name fragments. The domain name length, the number of domain name labels and the DNS request record type are extracted. One group consisting of the sub-domain name fragments, the domain name length, the number of domain name labels and the DNS request record type forms a normal DNS traffic packet data sample, called a white sample;

s04 constructing the neural network model module

The domain name characters and features are numbered to build a word list for input to the neural network model. For the domain name length feature, a domain name fragment length of less than 10 is coded as 1, 10-20 as 2, 20-30 as 3, 30-50 as 4, and more than 50 as 5. For the domain name label number feature, fewer than 3 labels is coded as 6, 3 to 5 labels as 7, and 5 or more labels as 8. For the DNS record type feature, a TXT record is coded as 9 and any other record type as 10. For the domain name string feature, characters a through z correspond to codes 11 through 36, characters A through Z to codes 37 through 62, and characters 0-9 to codes 63-72. 70% of all samples are randomly taken as the training set, and the remaining samples are randomly divided in equal parts into a validation set and a test set; all samples comprise both DNS hidden tunnel traffic packet data samples and normal DNS traffic packet data samples. A padding operation is performed before a sample is input: the maximum length of a domain name fragment code is set to 64, input longer than 64 is truncated, and input shorter than 64 is padded with 0 at the tail. The word vector layer converts each numeric code into vector form. In the CNN convolutional neural network layer, the text features of the domain name are learned through one-dimensional convolutions with different kernel sizes; the results of the three convolutional layers are concatenated and max-pooled, and a Dropout layer is added to reduce model complexity and prevent overfitting. Finally, a fully connected classification layer is used with two classes: the DNS hidden tunnel class, called the black sample class, and the normal DNS request class, called the white sample class;

s05 construction of Rapid prescreening Module Using white samples

the steps of constructing the rapid pre-screening module comprise:

wherein count (…) represents the number of times that the words in the sample set co-occur consecutively; in order to avoid the condition that the denominator is zero, after the smoothing processing, the existence probability calculation formula of each sample is obtained and is expressed as follows:

；

Example two

Newly acquired DNS network traffic classification

1) The newly acquired DNS network traffic is input into the white sample standardization module to obtain white samples;

2) the white samples are input into the fast pre-screening module, which filters out the white samples that have a low probability of being black samples;

3) the remaining white samples, which have a high probability of being black samples, are input into the neural network model module, which produces the final classification of the input white samples.
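The three-step flow of this example can be sketched in Python as a two-stage classifier. The function `classify_traffic` and the callables `prescreen_score` and `is_tunnel` are hypothetical stand-ins for the fast pre-screening module and the neural network model module; the toy scoring rules in the usage lines are illustrative only.

```python
def classify_traffic(samples, prescreen_score, threshold, is_tunnel):
    """Classify standardized samples as 'white' or 'black'.

    prescreen_score: callable returning the existence probability of a sample.
    threshold: probability threshold of the pre-screening module.
    is_tunnel: callable wrapping the neural network; True means black sample.
    """
    results = []
    for sample in samples:
        if prescreen_score(sample) > threshold:
            # Low probability of being black: keep the default white class
            # without invoking the neural network.
            results.append((sample, "white"))
        elif is_tunnel(sample):
            results.append((sample, "black"))
        else:
            results.append((sample, "white"))
    return results


# Toy stand-ins: "www" looks familiar to the pre-screen; names starting
# with "x" are what the stand-in network flags as tunnel traffic.
toy_score = lambda s: 0.9 if s == "www" else 0.1
toy_network = lambda s: s.startswith("x")
print(classify_traffic(["www", "x9f3", "cdn"], toy_score, 0.5, toy_network))
```

Because the pre-screen runs first, the expensive model is evaluated only for samples that fall at or below the threshold, which is the source of the speed-up the method claims.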

Claims

1. A DNS hidden tunnel detection method with multi-feature fusion is characterized by comprising the following implementation steps:

3) obtaining normal DNS request sample by white sample standardization module

4) constructing the neural network model module

5) construction of a Rapid prescreening Module Using white samples

the steps of constructing the rapid pre-screening module comprise:

；