CN115086080B - DNS hidden tunnel detection method based on flow characteristics - Google Patents

DNS hidden tunnel detection method based on flow characteristics

Info

Publication number
CN115086080B
Authority
CN
China
Prior art keywords
data
class
type
value
identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210928837.7A
Other languages
Chinese (zh)
Other versions
CN115086080A (en)
Inventor
左源
唐麒隆
李磊
向君耀
吴志远
谢虎
李琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sino Telecom Technology Co inc
Original Assignee
Sino Telecom Technology Co inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to the field of information security, and in particular to a multi-factor DNS hidden tunnel detection method based on flow characteristics. The method comprises: once DNS traffic data is acquired, parsing the DNS traffic data to form a first detection data stream and a second detection data packet; forming first class data and second class data from the first detection data stream, and forming third class data, fourth class data, fifth class data and sixth class data from the second detection data packet; and forming and outputting a detection result according to the first class data, the second class data, the third class data, the fourth class data, the fifth class data and the sixth class data.

Description

DNS hidden tunnel detection method based on flow characteristics
Technical Field
The invention relates to the field of information security, in particular to a multi-factor DNS hidden tunnel detection method based on flow characteristics.
Background
DNS (Domain Name System) is a distributed database on the Internet that maps domain names and IP addresses to each other, so that users can access the Internet conveniently without having to remember the numeric IP strings that machines read directly. The process of obtaining the IP address corresponding to a hostname is called domain name resolution (or hostname resolution). The DNS protocol runs on top of the UDP protocol, using port number 53. Among the RFC documents, RFC 2181 clarifies the DNS specification, RFC 2136 describes dynamic updates of DNS, and RFC 2308 describes negative caching of DNS queries. Since DNS is based on the UDP protocol, in addition to domain name resolution it can also carry data encapsulated in the DNS protocol, with the data transmission (communication) completed in DNS request and response packets; this is DNS covert communication. DNS is an indispensable service on the Internet, and firewalls and other security devices can hardly filter it completely, so port 53 (the default DNS service port) is an attractive choice for an attacker building a tunnel.
DNS tunnels can be broadly divided into two types, direct connection and relay, depending on the implementation.
Direct connection: the client establishes a connection directly with a designated target DNS server, and the data to be transmitted is encoded and encapsulated in the DNS protocol for communication. This mode is fast, but its concealment is weak and its characteristics are obvious. It is also quite limited in practice: to reduce the risk of network attack as much as possible, many enterprise networks are configured to allow DNS traffic only to and from designated trusted DNS servers.
Relay tunnel: a relay DNS tunnel implemented through DNS iterative queries is highly covert and can be deployed successfully in most scenarios. However, since each data packet must pass through multiple nodes before reaching the destination DNS server, the transmission speed and transmission capacity are much lower than those of a direct connection.
Machine learning techniques have been applied successfully, with good results, in fields such as speech recognition and image recognition. In the network security industry, however, machine learning has so far succeeded in only a few cases, mainly because large volumes of abnormal samples cannot be obtained. DNS tunnel detection suffers from the same scarcity of abnormal sample data, so trained models do not fit the actual production environment, the false alarm rate of DNS tunnel detection is high, and the detection results are not very accurate.
Disclosure of Invention
In view of the defects of the prior art, a DNS hidden tunnel detection method based on flow characteristics is provided, comprising the following steps:
Parsing the DNS traffic data, once the DNS traffic data is acquired, to form a first detection data stream and a second detection data packet;
Forming first class data and second class data according to the first detection data stream, and forming third class data, fourth class data, fifth class data and sixth class data according to the second detection data packet;
And forming and outputting a detection result according to the first class data, the second class data, the third class data, the fourth class data, the fifth class data and the sixth class data.
Preferably, in the above DNS hidden tunnel detection method based on flow characteristics, parsing the DNS traffic data, once acquired, to form a first detection data stream and a second detection data packet specifically includes:
Acquiring a first characteristic value, a second characteristic value and a third characteristic value of the current DNS traffic data from the acquired DNS traffic data;
forming identification data according to the first characteristic value, the second characteristic value and the third characteristic value;
And reading the data stream matching the identification data to form the first detection data stream and the second detection data packet.
Preferably, in the above DNS hidden tunnel detection method based on flow characteristics, the first detection data stream at least comprises the first class data, which is formed as:
S1=(R/r)*A;
wherein S1 is the first class data; R is the reference value of the standard RTT; A is the RTT influence factor; and r is the RTT value actually measured in the first detection data stream.
Preferably, in the above DNS hidden tunnel detection method based on flow characteristics, the first detection data stream at least comprises the second class data, which is formed by:
acquiring a basic threshold of the second class data; the basic threshold of the second class data is:
wherein S2 is the basic threshold of the second class data; n is the actual number of request-response packets; and N1 and N2 are respectively the first reference value and the second reference value of the second class data;
And forming the second class data from the number of data packets in the first detection data stream that match the second class data, in combination with the basic threshold of the second class data.
Preferably, in the above DNS hidden tunnel detection method based on flow characteristics, the second detection data packet at least comprises the third class data, which is formed by:
Acquiring a basic threshold of the third class data; the basic threshold of the third class data is:
wherein S3 is the basic threshold of the third class data; m is the number of abnormal request types; and M1 and M2 are respectively the first reference value and the second reference value of the third class data;
and forming the third class data from the number of data packets in the first detection data stream that match the third class data, in combination with the basic threshold of the third class data.
Preferably, in the above DNS hidden tunnel detection method based on flow characteristics, the second detection data packet at least comprises the fourth class data, which is formed by:
Acquiring a basic threshold of the fourth class data; the basic threshold of the fourth class data is:
S4=(n1*D1+n2*D2+n3*D3)/(n1+n2+n3)
Wherein: S4 is the basic threshold of the fourth class data; D1 is the first identification value of the fourth class data, and the identification value of the current data packet is set to D1 when its domain name length is less than the first reference value of the domain name; D2 is the second identification value of the fourth class data, and the identification value of the current data packet is set to D2 when its domain name length is not less than the first reference value and less than the second reference value of the domain name; D3 is the third identification value of the fourth class data, and the identification value of the current data packet is set to D3 when its domain name length is not less than the second reference value of the domain name; n1 is the total number of data packets whose identification value is D1; n2 is the total number of data packets whose identification value is D2; and n3 is the total number of data packets whose identification value is D3;
And forming the fourth class data from the number of data packets in the first detection data stream that match the fourth class data, in combination with the basic threshold of the fourth class data.
Preferably, in the above DNS hidden tunnel detection method based on flow characteristics, the second detection data packet at least comprises the fifth class data, which is formed by:
Acquiring a basic threshold of the fifth class data; the basic threshold of the fifth class data is:
S5=(f1*E1+f2*E2+f3*E3)/(f1+f2+f3)
Wherein: S5 is the basic threshold of the fifth class data; E1 is the first identification value of the fifth class data, and the identification value of the current data packet is set to E1 when its number of subdomain names is less than the first reference value of the number of subdomain names; E2 is the second identification value of the fifth class data, and the identification value of the current data packet is set to E2 when its number of subdomain names is not less than the first reference value and less than the second reference value of the number of subdomain names; E3 is the third identification value of the fifth class data, and the identification value of the current data packet is set to E3 when its number of subdomain names is not less than the second reference value of the number of subdomain names; f1 is the total number of data packets whose identification value is E1; f2 is the total number of data packets whose identification value is E2; and f3 is the total number of data packets whose identification value is E3;
and forming the fifth class data from the number of data packets in the first detection data stream that match the fifth class data, in combination with the basic threshold of the fifth class data.
Preferably, in the above DNS hidden tunnel detection method based on flow characteristics, the second detection data packet at least comprises the sixth class data, which is formed by:
Acquiring a basic threshold of the sixth class data; the basic threshold of the sixth class data is:
S6=(j1*F1+j2*F2+j3*F3)/(j1+j2+j3)
Wherein: S6 is the basic threshold of the sixth class data; F1 is the first identification value of the sixth class data, and the identification value of the current data packet is set to F1 when its subdomain name length is less than the first reference value of the subdomain name length; F2 is the second identification value of the sixth class data, and the identification value of the current data packet is set to F2 when its subdomain name length is not less than the first reference value and less than the second reference value of the subdomain name length; F3 is the third identification value of the sixth class data, and the identification value of the current data packet is set to F3 when its subdomain name length is not less than the second reference value of the subdomain name length; j1 is the total number of data packets whose identification value is F1; j2 is the total number of data packets whose identification value is F2; and j3 is the total number of data packets whose identification value is F3;
And forming the sixth class data from the number of data packets in the first detection data stream that match the sixth class data, in combination with the basic threshold of the sixth class data.
Preferably, in the above DNS hidden tunnel detection method based on flow characteristics, forming and outputting a detection result according to the first class data, the second class data, the third class data, the fourth class data, the fifth class data and the sixth class data specifically comprises:
Calculating a suspicious factor from the first class data, the second class data, the third class data, the fourth class data, the fifth class data and the sixth class data in combination with the weight coefficient matched to each of them;
Judging whether a hidden tunnel exists according to the suspicious factor.
In another aspect, the present application further provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the DNS hidden tunnel detection method based on flow characteristics as described in any one of the above when executing the computer program.
Compared with the prior art, the invention has the beneficial effects that:
1) The real-time detection is carried out according to the flow characteristics, so that the detection speed is high;
2) The detection accuracy is high, because suspicious factors of six dimensions are compared with a baseline model.
Drawings
Fig. 1 is a schematic flow diagram of a DNS hidden tunnel detection method based on a flow characteristic according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of a DNS hidden tunnel detection method based on a flow characteristic according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention provides a DNS hidden tunnel detection method based on flow characteristics, comprising the following steps.
As shown in Fig. 1, step S110, once DNS traffic data is acquired, the DNS traffic data is parsed to form a first detection data stream and a second detection data packet. Specifically, this comprises:
As shown in Fig. 2, step S1101, a first characteristic value, a second characteristic value and a third characteristic value of the current DNS traffic data are acquired from the acquired DNS traffic data; illustratively, the first characteristic value may be layer-2 network information in the communication protocol, the second characteristic value may be layer-3 network information in the communication protocol, and the third characteristic value may be layer-4 network information in the communication protocol;
Step S1102, identification data is formed according to the first characteristic value, the second characteristic value and the third characteristic value; specifically, five-tuple information is formed according to the first characteristic value, the second characteristic value and the third characteristic value;
Step S1103, the data stream matching the identification data is read to form the first detection data stream and the second detection data packet. First detection data streams are formed according to the five-tuple information, each first detection data stream having its own five-tuple information: once DNS traffic data is acquired, the five-tuple information it contains is read, and the data in the DNS traffic data is split according to that five-tuple information to form the first detection data streams. The second detection data packet is used for further parsing of the DNS protocol.
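The flow-splitting part of step S1103 can be sketched as follows. This is only a minimal illustration, assuming packets have already been parsed into records; the `Packet` type, its field names, and the functions `flow_key` and `split_into_flows` are hypothetical and do not come from the patent. Mapping both directions of a conversation to one key is likewise an assumption, made so that a request and its response land in the same stream.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical parsed packet record (field names are assumptions)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str   # e.g. "UDP"
    payload: bytes  # raw DNS message, parsed later per packet


def flow_key(pkt: Packet) -> tuple:
    """Build a direction-independent five-tuple key.

    Sorting the two endpoints is an assumption made here so that a
    request and its response fall into the same flow.
    """
    first, second = sorted([(pkt.src_ip, pkt.src_port),
                            (pkt.dst_ip, pkt.dst_port)])
    return (first[0], first[1], second[0], second[1], pkt.protocol)


def split_into_flows(packets) -> dict:
    """Group packets into flows keyed by their five-tuple."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)
```

Each value in the returned dictionary plays the role of one first detection data stream, while the individual `Packet` records correspond to the packets handed on for DNS parsing.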
Step S120, first class data and second class data are formed according to the first detection data stream, and third class data, fourth class data, fifth class data and sixth class data are formed according to the second detection data packet.
Specifically:
Step S1201, the first detection data stream at least includes the first class data, which is formed as:
S1=(R/r)*A;
Wherein S1 is the first class data; R is the reference value of the standard RTT; A is the RTT influence factor; and r is the RTT value actually measured in the first detection data stream. The value of A may be 0.3. Further, the first class data may be a request-response delay anomaly factor measured over the lifetime of the stream.
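The formula for S1 translates directly into code. The sketch below is illustrative only: the function and parameter names are hypothetical, the two RTT values are assumed to share one unit, and the guard against a non-positive measurement is an added assumption rather than part of the described method.

```python
def rtt_anomaly_factor(measured_rtt: float, reference_rtt: float,
                       influence: float = 0.3) -> float:
    """Compute S1 = (R / r) * A.

    reference_rtt is R, measured_rtt is r, and influence is A (0.3 is
    the illustrative value given in the text). A smaller measured RTT
    yields a larger factor.
    """
    if measured_rtt <= 0:
        # Assumption: a non-positive RTT is treated as invalid input.
        raise ValueError("measured RTT must be positive")
    return (reference_rtt / measured_rtt) * influence
```

With a reference RTT of 20 and a measured RTT of 10 in the same unit, the ratio is 2 and the factor is 2 * 0.3 = 0.6.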
Step S1202, the first detection data stream at least includes the second class data, which may be a request-response packet count anomaly factor and is formed as follows.
Step S12021, a basic threshold of the second class data is acquired; the basic threshold of the second class data is:
wherein S2 is the basic threshold of the second class data; n is the actual number of request-response packets; and N1 and N2 are respectively the first reference value and the second reference value of the second class data. Illustratively, N1 may take a value of 4, N2 may take a value of 8, B1 may take a value of 0.2, B2 may take a value of 0.3, and B3 may take a value of 0.5.
Step S12022, forming the second type data according to the number of data packets matched with the second type data in the first detection data stream and combining the basic threshold of the second type data.
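The expression for the basic threshold S2 is not reproduced in the text, so the following sketch rests on an explicit assumption: that S2 is a three-step function of n with breakpoints N1 and N2 and outputs B1, B2 and B3, by analogy with the identification values used for the fourth to sixth class data. The function name and the choice of strict versus non-strict comparisons are hypothetical.

```python
def step_threshold(count: int, low_ref: int = 4, high_ref: int = 8,
                   values=(0.2, 0.3, 0.5)) -> float:
    """Assumed three-step mapping for a count-based basic threshold.

    Returns B1 if count < N1, B2 if N1 <= count < N2, otherwise B3.
    The defaults are the illustrative values from the text
    (N1=4, N2=8, B1=0.2, B2=0.3, B3=0.5); the step shape itself is
    an assumption, since the original expression is not available.
    """
    b1, b2, b3 = values
    if count < low_ref:
        return b1
    if count < high_ref:
        return b2
    return b3
```

Under the same assumption, the third class threshold S3 would follow the same shape with m, M1, M2 and C1 to C3 in place of n, N1, N2 and B1 to B3.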
Step S1203, the second detection data packet at least includes the third class data, which may be a request type anomaly factor and is formed as follows.
Step S12031, a basic threshold of the third class data is acquired; the basic threshold of the third class data is:
wherein S3 is the basic threshold of the third class data; m is the number of abnormal request types; and M1 and M2 are respectively the first reference value and the second reference value of the third class data. Illustratively, M1 may take a value of 4, M2 may take a value of 8, C1 may take a value of 0.2, C2 may take a value of 0.3, and C3 may take a value of 0.5.
Step S12032, forming the third class data according to the number of data packets matched with the third class data in the first detected data stream and combining the basic threshold of the third class data.
Step S1204, the second detection data packet at least includes the fourth class data, which may be a request domain name length anomaly factor and is formed as follows.
Step S12041, a basic threshold of the fourth class data is acquired; the basic threshold of the fourth class data is:
S4=(n1*D1+n2*D2+n3*D3)/(n1+n2+n3)
wherein: S4 is the basic threshold of the fourth class data; D1 is the first identification value of the fourth class data, and the identification value of the current data packet is set to D1 when its domain name length is less than the first reference value of the domain name; D2 is the second identification value of the fourth class data, and the identification value of the current data packet is set to D2 when its domain name length is not less than the first reference value and less than the second reference value of the domain name; D3 is the third identification value of the fourth class data, and the identification value of the current data packet is set to D3 when its domain name length is not less than the second reference value of the domain name; n1 is the total number of data packets whose identification value is D1; n2 is the total number of data packets whose identification value is D2; and n3 is the total number of data packets whose identification value is D3. Illustratively, D1 has a value of 0.2, D2 has a value of 0.3, D3 has a value of 0.5, the first reference value of the domain name may be 48, and the second reference value of the domain name may be 64.
Step S12042, according to the number of data packets matched with the fourth type of data in the first detection data stream, combining the basic threshold of the fourth type of data to form the fourth type of data.
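The weighted average behind S4 can be sketched as below, using the illustrative values D1=0.2, D2=0.3, D3=0.5 and reference lengths 48 and 64 as defaults. The function name is hypothetical, and returning 0.0 for an empty stream is an assumption, since the text does not cover that case.

```python
from collections import Counter


def domain_length_factor(domain_lengths, first_ref: int = 48,
                         second_ref: int = 64,
                         values=(0.2, 0.3, 0.5)) -> float:
    """Compute S4 = (n1*D1 + n2*D2 + n3*D3) / (n1 + n2 + n3).

    Each packet's domain name length is placed in one of three
    intervals; n1, n2 and n3 count the packets per interval.
    """
    d1, d2, d3 = values
    counts = Counter()
    for length in domain_lengths:
        if length < first_ref:
            counts["n1"] += 1
        elif length < second_ref:
            counts["n2"] += 1
        else:
            counts["n3"] += 1
    total = counts["n1"] + counts["n2"] + counts["n3"]
    if total == 0:
        return 0.0  # assumption: an empty stream contributes nothing
    return (counts["n1"] * d1 + counts["n2"] * d2 + counts["n3"] * d3) / total
```

For domain name lengths 10, 50, 70 and 80, the counts are n1=1, n2=1 and n3=2, giving (0.2 + 0.3 + 1.0) / 4 = 0.375.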
Step S1205, the second detection data packet at least includes the fifth class data, which may be a request subdomain name count anomaly factor and is formed as follows.
Step S12051, a basic threshold of the fifth class data is acquired; the basic threshold of the fifth class data is:
S5=(f1*E1+f2*E2+f3*E3)/(f1+f2+f3)
Wherein: S5 is the basic threshold of the fifth class data; E1 is the first identification value of the fifth class data, and the identification value of the current data packet is set to E1 when its number of subdomain names is less than the first reference value of the number of subdomain names; E2 is the second identification value of the fifth class data, and the identification value of the current data packet is set to E2 when its number of subdomain names is not less than the first reference value and less than the second reference value of the number of subdomain names; E3 is the third identification value of the fifth class data, and the identification value of the current data packet is set to E3 when its number of subdomain names is not less than the second reference value of the number of subdomain names; f1 is the total number of data packets whose identification value is E1; f2 is the total number of data packets whose identification value is E2; and f3 is the total number of data packets whose identification value is E3. Illustratively, the value of E1 is 0.2, the value of E2 is 0.3, the value of E3 is 0.5, the first reference value of the number of subdomain names may be 48, and the second reference value of the number of subdomain names may be 64.
Step S12052, according to the number of data packets matched with the fifth type of data in the first detection data stream, forming the fifth type of data in combination with the basic threshold of the fifth type of data.
Step S1206, the second detection data packet at least includes the sixth class data, which may be a request subdomain name length anomaly factor and is formed as follows.
Step S12061, a basic threshold of the sixth class data is acquired; the basic threshold of the sixth class data is:
S6=(j1*F1+j2*F2+j3*F3)/(j1+j2+j3)
Wherein: S6 is the basic threshold of the sixth class data; F1 is the first identification value of the sixth class data, and the identification value of the current data packet is set to F1 when its subdomain name length is less than the first reference value of the subdomain name length; F2 is the second identification value of the sixth class data, and the identification value of the current data packet is set to F2 when its subdomain name length is not less than the first reference value and less than the second reference value of the subdomain name length; F3 is the third identification value of the sixth class data, and the identification value of the current data packet is set to F3 when its subdomain name length is not less than the second reference value of the subdomain name length; j1 is the total number of data packets whose identification value is F1; j2 is the total number of data packets whose identification value is F2; and j3 is the total number of data packets whose identification value is F3. Illustratively, the first reference value of the subdomain name length is 12 and the second reference value of the subdomain name length is 24; the value of F1 is 0.2, the value of F2 is 0.3, and the value of F3 is 0.5.
Step S12062, the sixth class data is formed from the number of data packets in the first detection data stream that match the sixth class data, in combination with the basic threshold of the sixth class data.
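One way to obtain the subdomain measurements is sketched below. Several points are assumptions not stated in the text: the last two labels of a query name are treated as the registered domain (real suffixes such as `co.uk` would need a public suffix list), the longest subdomain label is used as the packet's subdomain name length, and an empty stream yields 0.0. All function names are hypothetical. The sketch assigns each packet its identification value and averages those values, which is arithmetically the same as (j1*F1+j2*F2+j3*F3)/(j1+j2+j3).

```python
from statistics import fmean


def subdomain_labels(qname: str, registered_labels: int = 2) -> list:
    """Return the labels to the left of the registered domain.

    Treating the last two labels as the registered domain is an
    assumption made for this sketch.
    """
    labels = [label for label in qname.rstrip(".").split(".") if label]
    if len(labels) <= registered_labels:
        return []
    return labels[:-registered_labels]


def identification_value(measure: int, first_ref: int, second_ref: int,
                         values=(0.2, 0.3, 0.5)) -> float:
    """Map one packet's measure to its identification value (F1, F2 or F3)."""
    if measure < first_ref:
        return values[0]
    if measure < second_ref:
        return values[1]
    return values[2]


def subdomain_length_factor(qnames, first_ref: int = 12,
                            second_ref: int = 24) -> float:
    """Compute S6 as the mean of per-packet identification values."""
    per_packet = []
    for qname in qnames:
        # Assumption: the longest subdomain label represents the packet.
        longest = max((len(label) for label in subdomain_labels(qname)),
                      default=0)
        per_packet.append(identification_value(longest, first_ref, second_ref))
    return fmean(per_packet) if per_packet else 0.0
```

The fifth class threshold S5 could be computed the same way by passing `len(subdomain_labels(qname))` as the measure together with the reference values for the number of subdomain names.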
Step S130, a detection result is formed and output according to the first class data, the second class data, the third class data, the fourth class data, the fifth class data and the sixth class data. Specifically, this comprises:
Step S1301, a suspicious factor is calculated from the first class data, the second class data, the third class data, the fourth class data, the fifth class data and the sixth class data in combination with the weight coefficient matched to each of them, specifically:
S=(S1*x1+S2*x2+S3*x3+S4*x4+S5*x5+S6*x6)/(x1+x2+x3+x4+x5+x6);
Wherein S is the suspicious factor; x1 is the weight of the first class data, x2 is the weight of the second class data, x3 is the weight of the third class data, x4 is the weight of the fourth class data, x5 is the weight of the fifth class data, and x6 is the weight of the sixth class data.
Step S1302, whether a hidden tunnel exists is judged according to the suspicious factor. When the suspicious factor is greater than the suspicious factor threshold, a hidden tunnel is judged to exist in the current data stream; when the suspicious factor is not greater than the suspicious factor threshold, no hidden tunnel is judged to exist in the current data stream. The suspicious factor threshold is user-defined and is not specifically limited here.
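Steps S1301 and S1302 can be sketched as a weighted average followed by a threshold comparison. The function names are hypothetical, and the weights and threshold used in the usage note are placeholders chosen for illustration, since the text leaves both user-defined.

```python
def suspicious_factor(scores, weights) -> float:
    """Compute S = sum(Si * xi) / sum(xi) over the six dimensions."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        # Assumption: weights must sum to a positive value.
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


def has_hidden_tunnel(scores, weights, threshold: float) -> bool:
    """Report a tunnel only when S is strictly greater than the threshold."""
    return suspicious_factor(scores, weights) > threshold
```

For example, with equal weights and scores of 0.6, 0.5, 0.2, 0.375, 0.2 and 0.35, the suspicious factor is 2.225 / 6, roughly 0.371, so a placeholder threshold of 0.3 would flag the stream and a threshold of 0.4 would not.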
One specific embodiment comprises the following modules: traffic acquisition A1, deep parsing A2, stream processing A30, packet processing A31, average RTT anomaly judgment A300, request-response packet count anomaly judgment A301, request type anomaly judgment A310, request domain name length anomaly judgment A311, request subdomain name count anomaly judgment A312, request subdomain name length anomaly judgment A313, request subdomain name character anomaly judgment A314, baseline model A4, static configuration A40, dynamic learning A41 and comprehensive judgment A5.
The interaction relations among the modules are as follows:
Traffic acquisition interacts with deep parsing, load-balancing the input traffic across the receiving and processing threads;
Deep parsing interacts with stream processing and packet processing: tunnel protocols in the network are automatically identified and restored, and the layer-2, layer-3 and layer-4 information of each data packet is parsed and sent to the stream processing module and the packet processing module;
Stream processing interacts with average RTT anomaly judgment and request-response packet count anomaly judgment A301: the numbers of uplink and downlink data packets of the stream are counted, and the average RTT is calculated;
Packet processing interacts with request type anomaly judgment, request domain name length anomaly judgment, request subdomain name count anomaly judgment, request subdomain name length anomaly judgment and request subdomain name character anomaly judgment: information such as the request type, the request domain name, the number of request subdomain names, the request subdomain name length and whether the domain name contains illegal characters is parsed and extracted;
Average RTT anomaly judgment interacts with the baseline model and provides the suspicious factor for the average RTT anomaly;
Request-response packet count anomaly judgment interacts with the baseline model and provides the suspicious factor for the request-response packet count anomaly;
Request type anomaly judgment interacts with the baseline model and provides the suspicious factor for the request type anomaly;
Request domain name length anomaly judgment interacts with the baseline model and provides the suspicious factor for the request domain name length anomaly;
Request subdomain name count anomaly judgment interacts with the baseline model and provides the suspicious factor for the request subdomain name count anomaly;
Request subdomain name length anomaly judgment interacts with the baseline model and provides the suspicious factor for the request subdomain name length anomaly;
the static configuration interacts with the baseline model to provide configuration parameters of the baseline model; the dynamic learning interacts with the baseline model, and dynamically learns each parameter of the baseline model according to normal DNS flow; the baseline model stores corresponding threshold values or reference values, respectively.
The baseline model interacts with the comprehensive research and judgment, suspicious factors of each dimension are compared with the baseline model, and the result is sent to the comprehensive research and judgment; the development basis is as follows:
and the comprehensive research judgment judges whether the stream is a hidden tunnel or not according to the comparison result.
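The per-flow bookkeeping attributed to the stream processing module (uplink and downlink packet counts and average RTT) can be sketched as below. This is only an illustration under stated assumptions, not the patented implementation: the class `FlowStats`, the helper `flow_for`, the use of the DNS transaction ID to pair a response with its query, and the assumption that the five-tuple key has already been normalized to the client-to-server direction are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical flow key: (src_ip, src_port, dst_ip, dst_port, protocol),
# assumed to be normalized to the client -> server direction by the caller.
FlowKey = Tuple[str, int, str, int, str]


@dataclass
class FlowStats:
    """Counters kept for one flow while it is alive."""
    up_packets: int = 0      # client -> server packets (DNS queries)
    down_packets: int = 0    # server -> client packets (DNS responses)
    rtts: List[float] = field(default_factory=list)
    pending: Dict[int, float] = field(default_factory=dict)  # txid -> query timestamp

    def on_query(self, txid: int, ts: float) -> None:
        self.up_packets += 1
        self.pending[txid] = ts

    def on_response(self, txid: int, ts: float) -> None:
        self.down_packets += 1
        sent = self.pending.pop(txid, None)
        if sent is not None:  # responses without a matching query add no RTT sample
            self.rtts.append(ts - sent)

    def average_rtt(self) -> Optional[float]:
        """Mean request-response delay, or None if no pair was observed."""
        return sum(self.rtts) / len(self.rtts) if self.rtts else None


flows: Dict[FlowKey, FlowStats] = {}


def flow_for(key: FlowKey) -> FlowStats:
    """Return the statistics object for a flow, creating it on first use."""
    return flows.setdefault(key, FlowStats())
```

When a flow ages out, `up_packets`, `down_packets` and `average_rtt()` would be the inputs handed to the two flow-level anomaly evaluations.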
The DNS hidden tunnel detection method based on traffic characteristics can be understood as comprising the following steps:
S1: the traffic acquisition module receives the input traffic and load-balances it across the deep parsing threads; the deep parsing module parses each data packet to obtain the layer-2, layer-3 and layer-4 network information; the stream processing module builds a flow according to the five-tuple information, and the packet processing module further parses the DNS protocol;
S2: the stream processing module counts the uplink and downlink packets over the lifetime of the flow and records the request-response delay. The further the uplink and downlink packet counts each exceed 2, the higher the suspicious factor; the smaller the average RTT, the higher the suspicious factor. The packet processing module extracts the DNS request domain name and request type and counts unusual types such as TXT, CNAME, NULL and MX; the more such requests, the higher the suspicious factor. It performs interval statistics on the length of the request domain name; the longer the name, the higher the suspicious factor. It also parses and counts the subdomains: for example, when the number of subdomains exceeds 4, the suspicious factor rises with the number; when the length of a single subdomain exceeds 24 bytes, the suspicious factor rises with the length; and when a subdomain contains characters other than digits and letters, the suspicious factor rises with the number of such characters. When the flow ages out, the suspicious factor of each dimension is output from these statistics. The suspicious threshold of each dimension can be flexibly configured to suit different traffic scenarios;
S3: after the suspicious factors of the six dimensions in S2 are obtained, the weighting factor of the hidden tunnel is calculated according to the baseline model of the system; when the weighting factor is greater than a threshold value, the flow can be judged to be DNS hidden tunnel traffic.
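The packet-level checks in step S2 can be illustrated with a short sketch. It is not taken from the patent: the names `QueryFeatures`, `extract_features` and `flagged_dimensions` are hypothetical, the thresholds 4 and 24 are the example values quoted in S2, and treating the last two labels as the registered domain (`registered_labels=2`) is a simplifying assumption that does not hold for suffixes such as `co.uk`.

```python
import string
from dataclasses import dataclass
from typing import List

UNUSUAL_QTYPES = {"TXT", "CNAME", "NULL", "MX"}  # unusual types named in S2
MAX_SUBDOMAIN_COUNT = 4                           # example threshold from S2
MAX_LABEL_BYTES = 24                              # example threshold from S2
ALNUM = set(string.ascii_letters + string.digits)


@dataclass
class QueryFeatures:
    unusual_type: bool
    name_length: int       # length of the full query name, trailing dot removed
    subdomain_count: int   # labels to the left of the assumed registered domain
    longest_label: int     # longest subdomain label, in bytes
    non_alnum_chars: int   # subdomain characters that are neither letters nor digits


def extract_features(qname: str, qtype: str, registered_labels: int = 2) -> QueryFeatures:
    name = qname.rstrip(".")
    labels = [label for label in name.split(".") if label]
    # Everything left of the assumed registered domain counts as a subdomain label.
    subdomains = labels[:max(len(labels) - registered_labels, 0)]
    return QueryFeatures(
        unusual_type=qtype.upper() in UNUSUAL_QTYPES,
        name_length=len(name),
        subdomain_count=len(subdomains),
        longest_label=max((len(s.encode("utf-8")) for s in subdomains), default=0),
        non_alnum_chars=sum(ch not in ALNUM for s in subdomains for ch in s),
    )


def flagged_dimensions(f: QueryFeatures) -> List[str]:
    """Names of the per-packet dimensions that exceed the example thresholds."""
    flags = []
    if f.unusual_type:
        flags.append("request_type")
    if f.subdomain_count > MAX_SUBDOMAIN_COUNT:
        flags.append("subdomain_count")
    if f.longest_label > MAX_LABEL_BYTES:
        flags.append("subdomain_length")
    if f.non_alnum_chars > 0:
        flags.append("subdomain_characters")
    return flags
```

For an ordinary lookup such as `extract_features("www.example.com", "A")`, no dimension is flagged; a TXT query carrying a long encoded label across many subdomain levels trips several dimensions at once. Because S2 says the thresholds are configurable, the module-level constants here would be configuration values in practice.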
Example two
An embodiment of the present application provides an electronic device 400, as shown in fig. 3, which includes: one or more processors 420; and a storage device 410 for storing one or more programs that, when executed by the one or more processors 420, cause the one or more processors 420 to implement:
upon acquiring DNS traffic data, parsing the DNS traffic data to form a first detection data stream and a second detection data packet;
forming first class data and second class data according to the first detection data stream, and forming third class data, fourth class data, fifth class data and sixth class data according to the second detection data packet;
forming and outputting a detection result according to the first class data, the second class data, the third class data, the fourth class data, the fifth class data and the sixth class data.
As shown in fig. 3, the electronic device 400 includes a processor 420, a storage device 410, an input device 430, and an output device 440; the number of processors 420 in the electronic device may be one or more, one processor 420 being taken as an example in fig. 3; the processor 420, the storage device 410, the input device 430, and the output device 440 in the electronic device may be connected by a bus or other means, which is illustrated in fig. 3 as being connected by a bus 450.
The storage device 410 is a computer readable storage medium, and may be used to store a software program, a computer executable program, and a module unit, such as program instructions corresponding to a control method based on a relevant operating environment in an embodiment of the present application.
The storage device 410 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and at least one application program required for the functions; the data storage area may store data created according to the use of the terminal, etc. In addition, the storage device 410 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the storage device 410 may further include memory located remotely from the processor 420, which may be connected via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 430 may be used to receive input numeric, character information, or voice information, and to generate key signal inputs related to user settings and function control of the electronic device. The output device 440 may include a display screen, speakers, etc.
Example three
In some embodiments, the methods described above may be implemented as a computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for performing aspects of the present disclosure. Specifically:
upon acquiring DNS traffic data, parsing the DNS traffic data to form a first detection data stream and a second detection data packet;
forming first class data and second class data according to the first detection data stream, and forming third class data, fourth class data, fifth class data and sixth class data according to the second detection data packet;
forming and outputting a detection result according to the first class data, the second class data, the third class data, the fourth class data, the fifth class data and the sixth class data.
The computer readable storage medium described above can be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. Computer readable storage media, as used herein, are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., optical pulses through fiber optic cables), or electrical signals transmitted through wires.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to the respective computing/processing devices, or to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
The computer program instructions for performing the operations of the present disclosure can be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages and conventional procedural programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is customized with state information of the computer readable program instructions, and this electronic circuitry can execute the computer readable program instructions to implement aspects of the present disclosure.
These computer readable program instructions may be provided to a processing unit of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, other devices of a programmable data processing apparatus, or other apparatus to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of devices, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams, and combinations of blocks in the block diagrams, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing description of the embodiments of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the technical improvement of the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (3)

1. A DNS hidden tunnel detection method based on traffic characteristics, characterized by comprising the following steps:
upon acquiring DNS traffic data, parsing the DNS traffic data to form a first detection data stream and a second detection data packet;
forming first class data and second class data according to the first detection data stream, and forming third class data, fourth class data, fifth class data and sixth class data according to the second detection data packet; the first class data is a request-response delay abnormality factor for the uplink and downlink packets over the duration of the stream, the second class data is a request-response packet number abnormality factor, the third class data is a request type abnormality factor, the fourth class data is a request domain name length abnormality factor, the fifth class data is a request subdomain name number abnormality factor, and the sixth class data is a request subdomain name length abnormality factor;
forming a detection result according to the first class data, the second class data, the third class data, the fourth class data, the fifth class data and the sixth class data, and outputting the detection result;
wherein parsing the DNS traffic data to form the first detection data stream and the second detection data packet upon acquiring the DNS traffic data specifically comprises: acquiring a first characteristic value, a second characteristic value and a third characteristic value of the current DNS traffic data from the DNS traffic data; forming identification data according to the first characteristic value, the second characteristic value and the third characteristic value; and reading the data stream matched with the identification data to form the first detection data stream and the second detection data packet;
the first characteristic value is the layer-2 network information in the communication protocol; the second characteristic value is the layer-3 network information in the communication protocol; and the third characteristic value is the layer-4 network information in the communication protocol;
the forming mode of the first type of data comprises the following steps:
S1=(R/r)*A;
Wherein: S1 is the first type data; R is the reference value of the standard RTT; A is the RTT influence factor; r is the RTT value actually detected in the first detection data stream;
the forming mode of the second type of data comprises the following steps:
acquiring a basic threshold of the second class data; the basic threshold for the second class of data is:
Wherein: S2 is the basic threshold of the second class data; n is the actual number of request-response packets; N1 and N2 are respectively the first parameter value and the second reference value of the second class data;
forming the second type data by combining the basic threshold value of the second type data according to the number of data packets matched with the second type data in the first detection data stream;
the forming mode of the third type of data comprises the following steps:
Acquiring a basic threshold of the third class of data; the base threshold for the third class of data is:
Wherein: S3 is the basic threshold of the third class data; m is the number of abnormal request types; M1 and M2 are respectively the first parameter value and the second reference value of the third class data;
forming third class data by combining a basic threshold of the third class data according to the number of data packets matched with the third class data in the first detection data stream;
the forming mode of the fourth type of data comprises the following steps:
Acquiring a basic threshold of fourth-class data; the base threshold for the fourth class of data is:
S4=(n1*D1+n2*D2+n3*D3)/(n1+n2+n3)
Wherein: S4 is the basic threshold of the fourth class data; D1 is the first identification value of the fourth class data, and the identification value of the current data packet is set to D1 when the domain name length of the current data packet is less than the first reference value of the domain name; D2 is the second identification value of the fourth class data, and the identification value of the current data packet is set to D2 when the domain name length of the current data packet is not less than the first reference value of the domain name and less than the second reference value of the domain name; D3 is the third identification value of the fourth class data, and the identification value of the current data packet is set to D3 when the domain name length of the current data packet is not less than the third reference value of the domain name; n1 is the total number of data packets whose identification value is D1; n2 is the total number of data packets whose identification value is D2; n3 is the total number of data packets whose identification value is D3;
forming fourth-class data by combining a basic threshold of the fourth-class data according to the number of data packets matched with the fourth-class data in the first detection data stream;
The forming mode of the fifth type of data comprises the following steps:
Acquiring a basic threshold of the fifth type of data; the base threshold for the fifth class of data is:
S5=(f1*E1+f2*E2+f3*E3)/(f1+f2+f3)
Wherein: S5 is the basic threshold of the fifth class data; E1 is the first identification value of the fifth class data, and the identification value of the current data packet is set to E1 when the number of subdomain names of the current data packet is less than the first reference value of the subdomain name; E2 is the second identification value of the fifth class data, and the identification value of the current data packet is set to E2 when the number of subdomain names of the current data packet is not less than the first reference value of the subdomain name and less than the second reference value of the subdomain name; E3 is the third identification value of the fifth class data, and the identification value of the current data packet is set to E3 when the number of subdomain names of the current data packet is not less than the third reference value of the subdomain name; f1 is the total number of data packets whose identification value is E1; f2 is the total number of data packets whose identification value is E2; f3 is the total number of data packets whose identification value is E3;
forming fifth-class data by combining a basic threshold of the fifth-class data according to the number of data packets matched with the fifth-class data in the first detection data stream;
The forming mode of the sixth type of data comprises the following steps:
Acquiring a basic threshold of the sixth class data; the basic threshold of the sixth class data is:
S6=(j1*F1+j2*F2+j3*F3)/(j1+j2+j3)
Wherein: S6 is the basic threshold of the sixth class data; F1 is the first identification value of the sixth class data, and the identification value of the current data packet is set to F1 when the subdomain name length of the current data packet is less than the first reference value of the subdomain name; F2 is the second identification value of the sixth class data, and the identification value of the current data packet is set to F2 when the subdomain name length of the current data packet is not less than the first reference value of the subdomain name and less than the second reference value of the subdomain name; F3 is the third identification value of the sixth class data, and the identification value of the current data packet is set to F3 when the subdomain name length of the current data packet is not less than the third reference value of the subdomain name; j1 is the total number of data packets whose identification value is F1; j2 is the total number of data packets whose identification value is F2; j3 is the total number of data packets whose identification value is F3;
And forming the sixth type of data by combining the basic threshold of the sixth type of data according to the number of the data packets matched with the sixth type of data in the first detection data stream.
2. The DNS hidden tunnel detection method based on traffic characteristics according to claim 1, characterized in that forming and outputting the detection result according to the first class data, the second class data, the third class data, the fourth class data, the fifth class data and the sixth class data specifically comprises:
calculating a suspicious factor from the first class data, the second class data, the third class data, the fourth class data, the fifth class data and the sixth class data in combination with the weight coefficient matched with each of them;
judging whether a hidden tunnel exists according to the suspicious factor.
3. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the DNS hidden tunnel detection method based on traffic characteristics according to any one of claims 1-2.
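The arithmetic in the claims can be made concrete with a small sketch. It covers only the formulas that survive in the text: S1=(R/r)*A, the three-interval weighted averages S4, S5 and S6, and a weighted combination for claim 2. Several points are assumptions rather than statements of the claimed method: the function names are hypothetical, the upper interval is taken to start at the second reference value (the claim text refers to a "third reference value" there), and the combination is modeled as a plain weighted sum compared against a threshold, which the claims do not spell out. All numeric values in the usage note are illustrative.

```python
from typing import Sequence, Tuple


def rtt_factor(R: float, r: float, A: float) -> float:
    """S1 = (R / r) * A: the smaller the measured RTT r, the larger the factor."""
    if r <= 0:
        raise ValueError("measured RTT must be positive")
    return (R / r) * A


def identification_value(x: float, first_ref: float, second_ref: float,
                         ids: Tuple[float, float, float]) -> float:
    """Map one packet's measurement onto its identification value (e.g. D1, D2 or D3)."""
    low, mid, high = ids
    if x < first_ref:
        return low
    if x < second_ref:
        return mid
    return high  # assumption: the top interval starts at second_ref


def bucketed_threshold(values: Sequence[float], first_ref: float, second_ref: float,
                       ids: Tuple[float, float, float]) -> float:
    """(n1*D1 + n2*D2 + n3*D3) / (n1 + n2 + n3).

    Equivalent to the mean of the per-packet identification values.
    """
    if not values:
        return 0.0
    total = sum(identification_value(v, first_ref, second_ref, ids) for v in values)
    return total / len(values)


def suspicious_factor(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Assumed combination rule: weighted sum of the per-dimension factors."""
    if len(factors) != len(weights):
        raise ValueError("one weight per factor is required")
    return sum(w * f for w, f in zip(weights, factors))


def is_hidden_tunnel(factors: Sequence[float], weights: Sequence[float],
                     threshold: float) -> bool:
    return suspicious_factor(factors, weights) > threshold
```

As a worked case, four packets with domain name lengths 10, 40, 70 and 70, reference values 32 and 64, and identification values 0.0, 0.5 and 1.0 give (1*0.0 + 1*0.5 + 2*1.0) / 4 = 0.625.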
CN202210928837.7A 2022-08-03 2022-08-03 DNS hidden tunnel detection method based on flow characteristics Active CN115086080B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210928837.7A CN115086080B (en) 2022-08-03 2022-08-03 DNS hidden tunnel detection method based on flow characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210928837.7A CN115086080B (en) 2022-08-03 2022-08-03 DNS hidden tunnel detection method based on flow characteristics

Publications (2)

Publication Number Publication Date
CN115086080A (en) 2022-09-20
CN115086080B (en) 2024-05-07

Family

ID=83242509

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210928837.7A Active CN115086080B (en) 2022-08-03 2022-08-03 DNS hidden tunnel detection method based on flow characteristics

Country Status (1)

Country Link
CN (1) CN115086080B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3313044A1 (en) * 2016-10-24 2018-04-25 Verisign, Inc. Real-time cloud based detection and mitigation of dns data exfiltration and dns tunneling
CN109639744A (en) * 2019-02-27 2019-04-16 深信服科技股份有限公司 A kind of detection method and relevant device in the tunnel DNS
CN110602100A (en) * 2019-09-16 2019-12-20 上海斗象信息科技有限公司 DNS tunnel flow detection method
CN111953673A (en) * 2020-08-10 2020-11-17 深圳市联软科技股份有限公司 DNS hidden tunnel detection method and system
CN112822223A (en) * 2021-04-19 2021-05-18 北京智源人工智能研究院 DNS hidden tunnel event automatic detection method and device and electronic equipment
CN113114524A (en) * 2021-03-04 2021-07-13 北京六方云信息技术有限公司 Spark streaming based DNS tunnel detection method and device and electronic equipment
CN113347210A (en) * 2021-08-03 2021-09-03 北京观成科技有限公司 DNS tunnel detection method and device and electronic equipment
CN114567487A (en) * 2022-03-03 2022-05-31 北京亚鸿世纪科技发展有限公司 DNS hidden tunnel detection method with multi-feature fusion

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9003518B2 (en) * 2010-09-01 2015-04-07 Raytheon Bbn Technologies Corp. Systems and methods for detecting covert DNS tunnels
US10212123B2 (en) * 2015-11-24 2019-02-19 International Business Machines Corporation Trustworthiness-verifying DNS server for name resolution
US10412107B2 (en) * 2017-03-22 2019-09-10 Microsoft Technology Licensing, Llc Detecting domain name system (DNS) tunneling based on DNS logs and network data

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
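T. Pauly (Apple Inc.); P. Wouters (Red Hat). Split DNS Configuration for IKEv2, draft-ietf-ipsecme-split-dns-16. IETF, 2018, entire document. *
A covert tunnel based on the HTTP protocol and its detection method; 赵琦; 蒋朝惠; 周雪梅; 宋紫华; 计算机与现代化 (Computer and Modernization); 20190614 (06); entire document *
DNS-based covert channel traffic detection; 章思宇; 邹福泰; 王鲁华; 陈铭; 通信学报 (Journal on Communications); 20130525 (No. 05); entire document *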

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant