CN113904843A - Method and device for analyzing abnormal DNS (Domain name Server) behaviors of terminal - Google Patents

Method and device for analyzing abnormal DNS (Domain Name System) behaviors of a terminal

Info

Publication number
CN113904843A
Authority
CN
China
Prior art keywords
dns
query
terminal
terminals
abnormal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111170497.8A
Other languages
Chinese (zh)
Other versions
CN113904843B (en
Inventor
陈少涵
周赵军
吴雪阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Skyguard Network Security Technology Co ltd
Chengdu Sky Guard Network Security Technology Co ltd
Original Assignee
Beijing Skyguard Network Security Technology Co ltd
Chengdu Sky Guard Network Security Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Skyguard Network Security Technology Co ltd, Chengdu Sky Guard Network Security Technology Co ltd filed Critical Beijing Skyguard Network Security Technology Co ltd
Priority to CN202111170497.8A priority Critical patent/CN113904843B/en
Publication of CN113904843A publication Critical patent/CN113904843A/en
Application granted granted Critical
Publication of CN113904843B publication Critical patent/CN113904843B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/12Applying verification of the received information
    • H04L63/126Applying verification of the received information the source of the received data
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416Event detection, e.g. attack signature detection

Abstract

The invention discloses a method and a device for analyzing abnormal DNS (Domain Name System) behaviors of a terminal, and relates to the technical field of computers. One embodiment of the method comprises: acquiring a historical weblog, wherein the historical weblog comprises network connection information and DNS query information of a plurality of terminals; aggregating the network connection information and the DNS query information of the plurality of terminals according to a plurality of preset feature vectors, and determining feature data of the feature vectors corresponding to the plurality of terminals; and determining abnormal DNS behaviors of the plurality of terminals according to the feature data. This embodiment can accurately determine the abnormal DNS behavior of a terminal, thereby reducing the false positive rate and the false negative rate of abnormal DNS behavior detection, ensuring system security, and preventing malicious attacks on the network.

Description

Method and device for analyzing abnormal DNS (Domain Name System) behaviors of a terminal
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for analyzing abnormal DNS (domain name system) behaviors of a terminal.
Background
The DNS protocol is a protocol for converting an alphabetic domain name into an IP address, and the domain name system is an infrastructure of the internet, and when a user accesses a network resource, a DNS server is used to provide a domain name resolution service for converting a domain name and an IP address.
However, network devices and boundary protection devices rarely filter, analyze, or block DNS traffic. Network attacks therefore often abuse the DNS service for purposes such as stealing sensitive information, transferring files, returning control instructions, and establishing reverse shells.
Existing methods for detecting abnormal DNS behavior include Zeek rule matching, passive DNS, machine learning, and the like. All of them identify malicious domain names algorithmically so that a gateway or firewall can block access to those domain names. If a domain name is falsely reported as malicious, a large number of alarms may be raised, affecting the whole network so that it cannot work normally; if a malicious domain name is missed, the abnormal DNS behavior cannot be accurately determined, and the security of the whole network is threatened.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for analyzing an abnormal DNS behavior of a terminal, which can accurately determine the abnormal DNS behavior of the terminal according to DNS query information and network connection information, thereby reducing a false alarm rate and a false negative rate of the abnormal DNS behavior, ensuring system security, and preventing malicious attacks on a network.
Furthermore, the abnormal DNS behavior of the terminal is prompted according to the abnormal DNS behavior determined by analysis, and therefore the network security is improved.
In order to achieve the above object, according to an aspect of the embodiments of the present invention, there is provided a method for analyzing an abnormal DNS behavior of a terminal, including:
acquiring a historical weblog; wherein the historical weblog includes network connection information and DNS query information of a plurality of terminals;
aggregating the network connection information and the DNS query information of the plurality of terminals according to a plurality of preset characteristic vectors respectively, and determining characteristic data of the characteristic vectors corresponding to the plurality of terminals;
and determining abnormal DNS behaviors of the plurality of terminals according to the characteristic data.
Optionally, the determining, according to the feature data, the abnormal DNS behavior of the plurality of terminals includes:
comparing the feature data of the feature vector of each terminal with the feature data of the feature vectors corresponding to a plurality of terminals;
and determining whether the DNS behavior of each terminal is abnormal according to the comparison result.
Optionally, the determining, according to the comparison result, whether the DNS behavior of each terminal is an abnormal DNS behavior includes:
determining the deviation degree of the feature data of the feature vector of each terminal from the feature data of the feature vectors of the group according to the feature data of the plurality of terminals;
and inputting the deviation degree of each characteristic data of the terminal into an abnormal DNS behavior detection model, and determining whether the DNS behavior of the terminal is abnormal DNS behavior.
Optionally, the network connection information comprises one or more of: terminal ID, target IP, target port, protocol type, service type, connection duration, number of bytes sent and number of bytes responded;
the DNS query information includes one or more of: terminal ID, query type, query domain name length, query domain name entropy, response domain name length, RD, RA, TC.
Optionally, the feature vector comprises: the total number of DNS queries, and/or the first feature query total, the second feature query total, the third feature query total, the fourth feature query total and the fifth feature query total determined according to the DNS query information, and/or the ratio of different TLD queries, and/or the total number of connections whose connection duration exceeds a preset connection duration threshold.
Optionally, before aggregating the network connection information and the DNS query information of the plurality of terminals according to a plurality of preset feature vectors, the method further includes:
preprocessing a plurality of query domain names of the DNS query information, and determining TLDs of the plurality of query domain names;
the aggregating the network connection information and the DNS query information of the plurality of terminals according to a plurality of preset feature vectors, and determining feature data of the feature vectors corresponding to the plurality of terminals includes:
and aggregating the TLDs of the query domain name according to the preset different TLD query ratios, and determining the characteristic data of the different TLD query ratios in the characteristic vector corresponding to the terminal.
Optionally, the method further comprises:
and displaying an abnormal DNS behavior analysis report of the terminal under the condition that the terminal has abnormal DNS behavior.
According to still another aspect of the embodiments of the present invention, there is provided an apparatus for analyzing an abnormal DNS behavior of a terminal, including:
the acquisition module is used for acquiring a historical weblog; wherein the historical weblog includes network connection information and DNS query information of a plurality of terminals;
the aggregation module is used for respectively aggregating the network connection information and the DNS query information of the plurality of terminals according to a plurality of preset feature vectors and determining feature data of the feature vectors corresponding to the plurality of terminals;
and the judging module is used for determining the abnormal DNS behaviors of the terminals according to the characteristic data.
According to another aspect of the embodiments of the present invention, there is provided an electronic device for analyzing abnormal DNS behavior of a terminal, including:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the method for analyzing the abnormal DNS behavior of the terminal provided by the present invention.
According to still another aspect of the embodiments of the present invention, there is provided a computer-readable medium, on which a computer program is stored, which, when executed by a processor, implements the method for analyzing abnormal DNS behavior of a terminal according to the present invention.
One embodiment of the above invention has the following advantages or benefits. The network connection information and the DNS query information in the historical weblog of the terminals are aggregated according to the preset feature vectors, the feature data of the feature vectors corresponding to the DNS behavior of each terminal is determined, the feature data is input into the abnormal DNS behavior detection model, and the abnormal DNS behavior of the terminal is determined according to the output of the model. This overcomes the technical problems of conventional abnormal DNS behavior detection: a false report of a malicious domain name causes a large number of alarms that affect the whole network so that it cannot work normally, and a missed malicious domain name means that the abnormal DNS behavior cannot be accurately determined and the security of the whole network is threatened. The abnormal DNS behavior can thus be accurately determined according to the DNS query information, thereby reducing the false positive rate and the false negative rate of abnormal DNS behavior detection, ensuring system security, and preventing malicious attacks on the network.
Further effects of the above non-conventional optional implementations are described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
fig. 1 is a schematic diagram of a main flow of an analysis method of abnormal DNS behavior of a terminal according to an embodiment of the present invention;
FIG. 2 is a schematic illustration of query types according to an embodiment of the invention;
fig. 3 is a schematic diagram of a main flow of a feature data determination method of a feature vector according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a main flow of a determination method of abnormal DNS behavior according to an embodiment of the present invention;
fig. 5 is a schematic diagram of main modules of an analysis apparatus for terminal abnormal DNS behavior according to an embodiment of the present invention;
FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
fig. 7 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
A record: the A (Address) record is an IP address record that specifies the IP address corresponding to a host name (or domain name); colloquially, it is the IP of the server. With an A record, a user can point the website under a domain name to the user's own web server.
DNSSEC: DNS Security Extensions, the security protocol of DNS, used to ensure the correctness and integrity of DNS data.
DGA: a DGA (domain generation algorithm) generates pseudo-random domain names. A DGA domain name is a domain name generated by such an algorithm, and the algorithm is typically hard-coded in malware.
Fig. 1 is a schematic diagram of a main flow of an analysis method for abnormal DNS behavior of a terminal according to an embodiment of the present invention, and as shown in fig. 1, the analysis method for abnormal DNS behavior of a terminal according to the present invention includes the following steps:
The DNS protocol is one of the indispensable network communication protocols: to access internet and intranet resources, a DNS server is required to provide a domain name resolution service that converts domain names into IP addresses.
Step S101, acquiring a historical weblog; wherein the historical network log includes network connection information and DNS query information of a plurality of terminals.
In the embodiment of the present invention, the historical weblog may be the weblog of the terminals within 1 hour (1 h) before the current analysis time, and may be obtained from a network device connected to a plurality of terminals. The historical weblog of the network device includes a network connection log, which contains the network connection information of each terminal connected to the network device, and a DNS query log, which contains the DNS query information of each terminal connected to the network device.
In the embodiment of the invention, the network connection information comprises a terminal ID, a target IP, a target port, a protocol type, a service type, a connection duration, a number of bytes sent and a number of bytes responded.
Since only DNS behavior is detected, the target IP is the IP address of the target domain name server, the value of the target port field is 53, the value of the protocol type field is UDP, and the value of the service type field is DNS.
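The filtering condition above can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the `ConnectionRecord` class, its field names, and the sample records are assumptions introduced here to mirror the listed network connection fields.

```python
from dataclasses import dataclass


@dataclass
class ConnectionRecord:
    """Hypothetical record mirroring the network connection fields listed above."""
    terminal_id: str
    target_ip: str
    target_port: int
    protocol: str
    service: str
    duration_s: float
    bytes_sent: int
    bytes_received: int


def is_dns_connection(rec: ConnectionRecord) -> bool:
    # Keep only connections to port 53 over UDP whose service type is DNS.
    return (
        rec.target_port == 53
        and rec.protocol.lower() == "udp"
        and rec.service.lower() == "dns"
    )


# Sample data invented for illustration only.
records = [
    ConnectionRecord("t1", "10.0.0.53", 53, "udp", "dns", 0.02, 60, 120),
    ConnectionRecord("t1", "10.0.0.80", 443, "tcp", "ssl", 1.5, 900, 4000),
]
dns_records = [r for r in records if is_dns_connection(r)]
```

Only the first sample record passes the filter; the second is dropped because its port, protocol, and service type do not match.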
In the embodiment of the invention, the DNS query information comprises a terminal ID, a query type, a query domain name length, a query domain name entropy, a response domain name length, RD, RA and TC. Wherein:
Terminal ID:
Each terminal corresponds to one terminal ID, which uniquely identifies that terminal; even if the IP address of a terminal changes, the terminal can still be uniquely identified by its terminal ID. For the same terminal, the terminal ID in the network connection information is the same as the terminal ID in the DNS query information.
Query type:
FIG. 2 shows the query type IDs and query type names under different categories. Query types include a first query type, a second query type and other query types. The first query type comprises a CNAME record with a query type ID of 5, an MX record with a query type ID of 15, a TXT record with a query type ID of 16 and a DNSKEY record with a query type ID of 48; the second query type comprises an NS record with a query type ID of 2, an SOA record with a query type ID of 6, a PTR record with a query type ID of 12, an SRV record with a query type ID of 33 and a DS record with a query type ID of 43. Because DNS query types are numerous, only the first query type and the second query type are illustrated in FIG. 2; the other query types are all existing query types other than the first query type and the second query type.
Query domain name:
The query domain name is the full domain name of the DNS query.
Query domain name length:
The query domain name length is the length of the packet of the DNS query.
Query domain name entropy:
The query domain name entropy is the degree of randomness (disorder) of the character string of the full domain name of the DNS query.
Response domain name length:
The response domain name length is the length of the response packet of the DNS query.
RD, RA and TC:
These fields indicate the requirements of the user and the characteristics of the server in a DNS query. RD (Recursion Desired) is the recursion desired bit of the DNS query, which indicates that the terminal asks the name server to process the query recursively; the RD field is set in the query request, and its value is returned in the query response. RA (Recursion Available) is the recursion available bit of the DNS query response, which indicates whether the name server supports recursive queries; for example, an RA field of 1 indicates that the name server supports recursive queries. TC (Truncation) is the truncation bit of the DNS query response, which indicates whether the response message is truncated; for example, a TC field of 1 indicates that the response exceeded 512 bytes and was truncated, and the response is divided into a plurality of packets.
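As an illustration of how the RD, RA and TC fields can be read, the sketch below decodes them from the 16-bit flags word of a DNS message header, using the bit positions defined in RFC 1035. The function name `decode_dns_flags` and the sample flag values are assumptions made for this example; the embodiment does not prescribe how the log fields are extracted.

```python
def decode_dns_flags(flags: int) -> dict:
    """Extract TC, RD and RA from the 16-bit DNS header flags word (RFC 1035)."""
    return {
        "TC": bool(flags & 0x0200),  # truncation bit
        "RD": bool(flags & 0x0100),  # recursion desired bit
        "RA": bool(flags & 0x0080),  # recursion available bit
    }


# 0x0100: a standard query with only RD set.
standard_query = decode_dns_flags(0x0100)
# 0x8180: a response (QR=1) with RD and RA set.
recursive_response = decode_dns_flags(0x8180)
# 0x8380: a response with TC, RD and RA all set.
truncated_response = decode_dns_flags(0x8380)
```

A record for which all three values are true, such as `truncated_response`, is the kind of record counted by the fifth feature query total when its query domain name length also exceeds the corresponding threshold.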
In the embodiment of the invention, before the historical weblogs are acquired, the network connection logs and the DNS query logs of each terminal connected with the network equipment, which are acquired from the network equipment, are stored in the big data platform, so that the historical weblogs can be acquired from the big data platform.
Step S102, the network connection information and the DNS query information of the plurality of terminals are respectively aggregated according to a plurality of preset characteristic vectors, and characteristic data of the characteristic vectors corresponding to the plurality of terminals are determined.
In the embodiment of the present invention, as shown in fig. 3, the method for determining feature data of a feature vector of the present invention includes the following steps:
step S301, preprocessing the query domain names of the DNS query information to determine TLDs of the query domain names.
In the embodiment of the present invention, TLD (Top-Level Domain) literally means a top-level domain name; although it is called TLD here, it actually refers to a second-level domain name, for example, in the form of xxxx.
Step S302, according to different TLD query ratios in a plurality of preset feature vectors, aggregating the TLDs of the query domain names, and determining feature data of the different TLD query ratios in the feature vectors corresponding to the DNS behavior of the terminal.
In the embodiment of the present invention, according to the TLD for querying the domain name determined in step S301, feature data of different TLD query ratios of each terminal is determined; wherein the feature data represents values of a feature vector.
In the embodiment of the invention, an overly long query domain name usually suggests that the domain name may belong to a DNS tunnel, and too many queries within a short time suggest DGA domain names. In practice, however, neither sign is conclusive, because normal services may also transfer data through DNS tunnels.
A DGA generates a large number of domain names, of which only one or a few are actually registered by the attacker. DGA attacks among malicious DNS behaviors can therefore be detected effectively by counting the number of different TLDs that a user accesses through DNS. The specific algorithm is as follows:
For example, if a user issues two DNS queries within a period of time, for "a.xxx.com" and "b.xxx.com", the TLDs of the two queries are the same (i.e., "xxx.com"), so the number of different TLDs is 1 and the ratio of different TLD queries is 50%; if the two DNS queries are for aaa.com and bbb.com, which have different TLDs, the number of different TLDs is 2 and the ratio of different TLD queries is 100%.
In normal access, the TLDs accessed by a user are usually concentrated; in malicious access, for example a DGA attack, the user accesses a large number of different TLDs.
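The ratio described in the example above can be sketched as follows. This is a simplified illustration under stated assumptions: the helper names `registered_domain` and `distinct_tld_ratio` are hypothetical, and the "TLD" is approximated as the last two labels of the query name, which matches the xxx.com example but would mishandle multi-label suffixes such as co.uk (handling those would need a public suffix list).

```python
def registered_domain(name: str) -> str:
    """Approximate the document's 'TLD' as the last two labels of the name."""
    labels = name.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def distinct_tld_ratio(query_names: list) -> float:
    """Number of distinct TLDs divided by the number of DNS queries."""
    if not query_names:
        return 0.0
    distinct = {registered_domain(n) for n in query_names}
    return len(distinct) / len(query_names)


same_tld = distinct_tld_ratio(["a.xxx.com", "b.xxx.com"])  # 1 distinct TLD / 2 queries
different_tld = distinct_tld_ratio(["aaa.com", "bbb.com"])  # 2 distinct TLDs / 2 queries
```

With the two examples from the text, `same_tld` evaluates to 0.5 and `different_tld` to 1.0, matching the 50% and 100% ratios given above.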
Step S303, according to other preset feature vectors, aggregating the network connection information and the DNS query information, and determining feature data of other feature vectors corresponding to the DNS behavior of the terminal.
In an embodiment of the present invention, in addition to the feature vector "different TLD query ratio", the feature vectors further include: the total number of DNS queries; the first feature query total (the total number of queries of the first query type); the second feature query total (the total number of queries that are not of the second query type and whose entropy value is greater than a preset entropy threshold); the third feature query total (the total number of queries that are not of the second query type and whose query domain name length is greater than a first preset query domain name length threshold); the fourth feature query total (the total number of queries that are not of the second query type and whose response domain name length is greater than a preset response domain name length threshold); the fifth feature query total (the total number of queries whose RD, RA and TC are all set and whose query domain name length is greater than a second preset query domain name length threshold); and the total number of connections whose connection duration exceeds a preset connection duration threshold. Specifically:
total number of DNS queries:
representing the total number of queries of DNS queries of each terminal within a specified time period; wherein the specified time period may be 1 h.
First feature query total:
the first feature query total number is also referred to as a common query total number, and represents a total number of records of common queries of DNS queries of respective terminals within a specified time period. Common queries include query types with query type IDs [5, 15, 16, 48], that is: the first characteristic query total number represents a total number of records of the first query type of the DNS query of each terminal within a specified time period.
Second feature query total:
The second feature query total is also called the total number of infrequent queries with excessive entropy, and represents the total number of records of infrequent queries whose entropy value is greater than the preset entropy threshold among the DNS queries of each terminal within the specified time period. Infrequent queries are the counterpart of common queries and denote queries whose query type ID does not belong to [2, 6, 12, 33, 43]; that is, the second feature query total represents the total number of records that are not of the second query type and whose entropy value is greater than the preset entropy threshold among the DNS queries of each terminal within the specified time period.
The DNS protocol transmits data in plaintext, whereas a DNS tunnel Trojan usually encrypts its communication data to conceal the communication content. Encrypted DNS data is therefore usually used as an indicator of suspicious DNS tunnel Trojan communication, and whether the DNS data is encrypted is determined by judging whether the entropy value of an infrequent query is greater than the preset entropy threshold. Infrequent queries whose entropy value is greater than the threshold are counted, and the count is taken as the total number of infrequent queries with excessive entropy.
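The text does not state how the entropy value is computed. One common choice, shown here purely as an assumption, is the Shannon entropy of the characters in the query domain name; the function name `shannon_entropy` and the sample strings are illustrative only.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy, in bits per character, of the characters in text."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A readable label versus a random-looking label of similar length.
readable = shannon_entropy("google")
random_looking = shannon_entropy("x7f3kq9z")
```

A random-looking label such as the second sample scores higher than a readable one, which is the property the entropy threshold relies on. The threshold value itself is a preset parameter that the text leaves unspecified.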
Third feature query total:
the third feature query total number is also referred to as an infrequent query total number in which the query domain name length is too large, and represents a total number of records of infrequent queries in which the query domain name length in the DNS query of each terminal in a specified time period exceeds a first preset query domain name length threshold.
Compared with normal DNS sessions, DNS tunnel Trojan sessions contain a larger proportion of long query domain names, and the longer the query domain name, the more likely it is that a DNS tunnel Trojan is present. The query domain name length is therefore usually used as an indicator of suspicious DNS tunnel Trojan communication, and whether a DNS tunnel Trojan exists is determined by judging whether the query domain name length of an infrequent query is greater than the first preset query domain name length threshold. Infrequent queries whose query domain name length is greater than the threshold are counted, and the count is taken as the total number of infrequent queries with an excessive query domain name length.
Fourth feature query total:
the fourth feature query total number is also referred to as a rare query total number in which the response domain name length is too large, and represents a total number of records of rare queries in which the size of the DNS response domain name length exceeds a preset response domain name length threshold in DNS queries of each terminal in a specified time period.
Compared with normal DNS sessions, DNS tunnel Trojan sessions contain a larger proportion of long response domain names, and the longer the DNS response domain name, the more likely it is that a DNS tunnel Trojan is present. The DNS response domain name length is therefore usually used as an indicator of suspicious DNS tunnel Trojan communication, and whether a DNS tunnel Trojan exists is determined by judging whether the response domain name length of an infrequent query is greater than the preset response domain name length threshold. Infrequent queries whose response domain name length is greater than the threshold are counted, and the count is taken as the total number of infrequent queries with an excessive response domain name length.
Fifth feature query total:
the fifth feature query total number is also referred to as a specific query domain name length total number, and represents a total number of records in which RD, RA, and TC of the DNS query of each terminal in a specified time period are "TRUE" and the size of the query domain name length exceeds a second preset query domain name length threshold value.
Total number of connections whose connection duration exceeds the threshold:
the total number of connections whose connection duration exceeds the threshold is also referred to as the total number of connections with a long session duration, and represents the number of network connection records, among the connections of each terminal within the specified time period whose target port is 53, protocol type is UDP and service type is DNS, whose connection duration exceeds a preset connection duration threshold.
In the embodiment of the invention, the method for determining the feature data of the feature vector can determine the value of the feature vector of the DNS query of the terminal, and the feature vector is determined according to the actual abnormal DNS behavior, so that the abnormal DNS behavior is more accurately judged on the basis of the feature data.
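The per-terminal aggregation of the DNS query features defined above can be sketched as follows. This is a minimal illustration under several assumptions: the `DnsQuery` class, its field names, the feature keys, and all threshold values are hypothetical, since the text only says the thresholds are preset. The connection-duration feature and the TLD ratio are omitted for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass

FIRST_TYPES = {5, 15, 16, 48}       # CNAME, MX, TXT, DNSKEY
SECOND_TYPES = {2, 6, 12, 33, 43}   # NS, SOA, PTR, SRV, DS

# Hypothetical thresholds; the text does not give concrete values.
ENTROPY_MAX = 3.5
QUERY_LEN_MAX_1 = 100
RESPONSE_LEN_MAX = 300
QUERY_LEN_MAX_2 = 60


@dataclass
class DnsQuery:
    terminal_id: str
    qtype: int
    entropy: float
    query_len: int
    response_len: int
    rd: bool
    ra: bool
    tc: bool


def aggregate(queries):
    """Count the query-based features per terminal."""
    keys = ("total", "first", "second", "third", "fourth", "fifth")
    features = defaultdict(lambda: dict.fromkeys(keys, 0))
    for q in queries:
        f = features[q.terminal_id]
        not_second = q.qtype not in SECOND_TYPES  # "infrequent" in the text
        f["total"] += 1
        f["first"] += int(q.qtype in FIRST_TYPES)
        f["second"] += int(not_second and q.entropy > ENTROPY_MAX)
        f["third"] += int(not_second and q.query_len > QUERY_LEN_MAX_1)
        f["fourth"] += int(not_second and q.response_len > RESPONSE_LEN_MAX)
        f["fifth"] += int(q.rd and q.ra and q.tc and q.query_len > QUERY_LEN_MAX_2)
    return dict(features)


# Sample data invented for illustration only.
sample = [
    DnsQuery("t1", 1, 2.0, 30, 80, True, True, False),
    DnsQuery("t1", 16, 4.2, 180, 400, True, True, True),
    DnsQuery("t2", 12, 4.0, 120, 350, True, True, False),
]
result = aggregate(sample)
```

In this sample, the TXT query of terminal t1 increments all five feature totals, whereas the PTR query of terminal t2 only increments the total number of DNS queries, because PTR belongs to the second query type.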
Step S103, according to the characteristic data, determining abnormal DNS behaviors of the plurality of terminals.
In the embodiment of the present invention, as shown in fig. 4, the method for determining an abnormal DNS behavior of the present invention includes the following steps:
step S401, feature data of a plurality of terminals is acquired.
Step S402, determining the deviation degree of the characteristic data of the characteristic vector of each terminal from the characteristic data of the characteristic vector of the group according to the characteristic data of the plurality of terminals.
In the embodiment of the invention, aiming at each feature vector, the feature data of a single terminal and a plurality of terminals are compared, and the deviation degree of the feature data of the feature vector corresponding to each terminal from the group is calculated.
Step S403, inputting the deviation degree of each feature data of the terminal into an abnormal DNS behavior detection model, and determining whether the DNS behavior of the terminal is an abnormal DNS behavior.
In the embodiment of the invention, the deviation degree of the feature data of each feature vector of each terminal is input into the abnormal DNS behavior detection model, and the DNS behavior of the terminal is judged to be abnormal according to the output of the model. The abnormal DNS behavior detection model may detect a degree to which feature data of a certain feature vector of a certain terminal deviates from feature data of the feature vector corresponding to a plurality of terminals.
In the embodiment of the invention, this method for determining abnormal DNS behavior allows abnormal DNS behavior to be judged accurately, which reduces the false alarm rate and the missed detection rate for abnormal DNS behavior, ensures system security, and protects the network from malicious attacks.
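Steps S401 to S403 can be sketched as follows. The text does not state how the deviation degree is computed or what the detection model is, so this sketch makes two loudly labeled assumptions: the deviation degree is taken to be a z-score against the group mean and population standard deviation, and the "model" is a placeholder threshold rule. All function and parameter names are hypothetical.

```python
from statistics import fmean, pstdev


def deviation_degrees(features):
    """features: {terminal_id: {feature_name: value}}.
    Returns the same shape, with each value replaced by its z-score
    relative to the group (assumed deviation measure)."""
    names = sorted({n for f in features.values() for n in f})
    stats = {}
    for n in names:
        values = [f.get(n, 0.0) for f in features.values()]
        stats[n] = (fmean(values), pstdev(values))
    result = {}
    for tid, f in features.items():
        result[tid] = {}
        for n in names:
            mean, std = stats[n]
            # A feature with no spread across the group carries no deviation.
            result[tid][n] = 0.0 if std == 0 else (f.get(n, 0.0) - mean) / std
    return result


def is_abnormal(deviations, threshold=2.0):
    """Placeholder for the detection model (assumption): flag a terminal
    when any feature lies far above the group."""
    return any(d > threshold for d in deviations.values())
```

Because every terminal is scored against the same group statistics, a terminal is flagged only when it behaves differently from its peers, which is the comparison the steps above describe. A trained model could replace `is_abnormal` without changing `deviation_degrees`.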
In the embodiment of the invention, under the condition that the terminal has abnormal DNS behavior, the unified management platform can send threat warning to the terminal with the abnormal DNS behavior and display a detailed abnormal analysis report.
In the embodiment of the invention, a historical weblog is acquired, wherein the historical weblog includes network connection information and DNS query information of a plurality of terminals; the network connection information and the DNS query information of the plurality of terminals are aggregated according to a plurality of preset feature vectors, respectively, to determine the feature data of the feature vectors corresponding to the plurality of terminals; and abnormal DNS behaviors of the plurality of terminals are determined according to the feature data. In this way, the abnormal DNS behavior of a terminal can be accurately determined from the network connection information and the DNS query information, which reduces the false alarm rate and the missed detection rate for abnormal DNS behavior, ensures system security, and prevents malicious attacks on the network.
Fig. 5 is a schematic diagram of main modules of an apparatus for analyzing abnormal DNS behavior of a terminal according to an embodiment of the present invention, and as shown in fig. 5, the apparatus for analyzing abnormal DNS behavior of a terminal according to the present invention includes the following modules:
An obtaining module 501, configured to obtain a historical weblog; wherein the historical weblog includes network connection information and DNS query information of a plurality of terminals.
In the embodiment of the present invention, the historical weblog may be the weblog of the terminals within the 1 hour (1 h) before the current analysis time, and may be obtained from a network device connected to the plurality of terminals. The obtaining module 501 is configured to obtain the historical weblogs of the plurality of terminals, where the historical weblogs include the network connection information and DNS query information of each terminal.
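A minimal sketch of selecting the analysis window described above. It assumes each log record is a dict carrying a `timestamp` key, which is a hypothetical layout since the log format is not specified here; the one-hour default mirrors the example in the text.

```python
from datetime import datetime, timedelta


def select_history(records, analysis_time, window=timedelta(hours=1)):
    """Keep records whose timestamp lies in
    [analysis_time - window, analysis_time)."""
    start = analysis_time - window
    return [r for r in records if start <= r["timestamp"] < analysis_time]
```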
In the embodiment of the invention, the DNS query information comprises a terminal ID, a query type, a query domain name length, a query domain name entropy, a response domain name length, RD, RA and TC.
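The query domain name entropy field listed above can be illustrated with a short sketch. The text does not say how the entropy is computed; character-level Shannon entropy is assumed here, and the function name is hypothetical.

```python
import math
from collections import Counter


def domain_entropy(domain):
    """Shannon entropy, in bits per character, of a domain name string."""
    if not domain:
        return 0.0
    counts = Counter(domain)
    n = len(domain)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

Under this assumption, names made of a few repeated characters score low, while random-looking labels such as those produced by DGA or tunneling tools score higher.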
In the embodiment of the invention, the network connection information comprises a terminal ID, a target IP, a target port, a protocol type, a service type, a connection duration, a number of bytes sent and a number of bytes responded.
An aggregating module 502, configured to separately aggregate the network connection information and the DNS query information of the multiple terminals according to a plurality of preset feature vectors, and determine feature data of the feature vectors corresponding to the multiple terminals.
In an embodiment of the present invention, the feature vectors comprise: the query ratios of different TLDs; the total number of DNS queries; the following totals determined from the DNS query information: a first feature query total (i.e., the first query total, of queries of the first query type), a second feature query total (i.e., the second query total, of queries not of the second query type with entropy greater than a preset entropy threshold), a third feature query total (i.e., the third query total, of queries not of the second query type with query domain name length greater than a first preset query domain name length threshold), a fourth feature query total (i.e., the fourth query total, of queries not of the second query type with response domain name length greater than a preset response domain name length threshold), and a fifth feature query total (i.e., the fifth query total of RD, RA, and TC with query domain name length greater than a second preset query domain name length threshold); and the total number of connections whose connection duration exceeds a preset connection duration threshold.
In this embodiment of the present invention, the aggregation module 502 is configured to separately aggregate the network connection information and the DNS query information of the multiple terminals according to a plurality of preset feature vectors, and determine feature data of the feature vectors corresponding to the multiple terminals.
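The aggregation performed by module 502 can be sketched for a subset of the DNS features as follows. This is an illustration under stated assumptions: each query record is a dict with the hypothetical keys `terminal_id`, `query_type`, `domain`, `domain_length`, and `domain_entropy`; `excluded_type` stands in for the "second query type", which this passage does not define; and the TLD is naively taken to be the last label of the domain name.

```python
from collections import Counter, defaultdict


def aggregate_dns_features(queries, entropy_threshold, length_threshold,
                           excluded_type):
    """Aggregate per-terminal DNS features from query records.
    Returns {terminal_id: feature dict}."""
    totals = Counter()
    high_entropy = Counter()
    long_name = Counter()
    tlds = defaultdict(Counter)
    for q in queries:
        tid = q["terminal_id"]
        totals[tid] += 1
        # Naive TLD extraction (assumption): last dot-separated label.
        tld = q["domain"].rstrip(".").rsplit(".", 1)[-1].lower()
        tlds[tid][tld] += 1
        if q["query_type"] != excluded_type:
            if q["domain_entropy"] > entropy_threshold:
                high_entropy[tid] += 1
            if q["domain_length"] > length_threshold:
                long_name[tid] += 1
    return {
        tid: {
            "dns_total": totals[tid],
            "high_entropy_total": high_entropy[tid],
            "long_name_total": long_name[tid],
            "tld_ratio": {t: c / totals[tid] for t, c in tlds[tid].items()},
        }
        for tid in totals
    }
```

The remaining feature totals would follow the same counting pattern with their own predicates and thresholds.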
A determining module 503, configured to determine, according to the feature data, abnormal DNS behaviors of the multiple terminals.
In this embodiment of the present invention, after determining the deviation degree of the feature data of the feature vector of each terminal from the feature data of the feature vector of the group, the determining module 503 is configured to input the deviation degree of the feature data of each feature vector of each terminal into the abnormal DNS behavior detection model, and determine whether the DNS behavior of the terminal is the abnormal DNS behavior according to an output of the model.
In the embodiment of the invention, through the obtaining module, the aggregating module, the determining module, and other modules, the abnormal DNS behavior of the terminal can be accurately determined according to the DNS query information and the network connection information, which reduces the false alarm rate and the missed detection rate for abnormal DNS behavior, ensures system security, and prevents malicious attacks on the network.
Fig. 6 shows an exemplary system architecture 600 of an analysis method of terminal abnormal DNS behavior or an analysis apparatus of terminal abnormal DNS behavior to which an embodiment of the present invention can be applied.
As shown in fig. 6, the system architecture 600 may include terminal devices 601, 602, 603, a network 604, a storage server 605, and an analysis server 606. The network 604 provides the medium for communication links between the terminal devices 601, 602, 603 and the storage server 605, and between the storage server 605 and the analysis server 606. The network 604 may include various types of connections, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 601, 602, 603 to interact with the storage server 605 via the network 604, and the storage server 605 may interact with the analysis server 606 via the network 604.
Various communication client applications, such as a security application, a shopping application, a web browser application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like, may be installed on the terminal devices 601, 602, and 603. The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like.
The storage server 605 may be a server that provides various services, such as a backend management server that supports security-related websites browsed by users on the terminal devices 601, 602, and 603. The backend management server can store received data such as weblogs.
The analysis server 606 may be a server that provides various services, such as a backend management server that provides support for data such as weblogs. The backend management server may analyze and otherwise process data such as the weblogs acquired from the storage server 605, and feed back a processing result (for example, abnormal DNS behavior) to the terminal device.
It should be noted that the method for analyzing the abnormal DNS behavior of the terminal provided by the embodiment of the present invention is generally executed by the analysis server 606, and accordingly, the analysis device for the abnormal DNS behavior of the terminal is generally disposed in the analysis server 606.
It should be understood that the numbers of terminal devices, networks, storage servers, and analysis servers in fig. 6 are merely illustrative. There may be any number of terminal devices, networks, and servers, as required by the implementation.
Referring now to FIG. 7, shown is a block diagram of a computer system 700 suitable for use with a terminal device implementing an embodiment of the present invention. The terminal device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU)701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the system 700 are also stored. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is installed into the storage section 708 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 701.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes an acquisition module, an aggregation module, and a determination module. The names of these modules do not in some cases constitute a limitation to the modules themselves, and for example, the determination module may also be described as a "module that determines abnormal DNS behavior of a plurality of terminals according to characteristic data".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform: acquiring a historical weblog, wherein the historical weblog includes network connection information and DNS query information of a plurality of terminals; aggregating the network connection information and the DNS query information of the plurality of terminals according to a plurality of preset feature vectors, respectively, and determining feature data of the feature vectors corresponding to the plurality of terminals; and determining abnormal DNS behaviors of the plurality of terminals according to the feature data.
Because the existing DNS service relies on a protocol with poor security that is very easy to spoof, and because everyday network devices and perimeter protection devices do not filter, analyze, or block DNS traffic, attackers often target the DNS service directly or use it as a springboard for other attacks. For example, after an attacker obtains privileges on a server, or the server is infected by malware, worms, Trojans, or the like, the attacker can establish a DNS tunnel to steal sensitive information, transfer files, return control instructions, open a reverse shell, and so on. Common techniques for malicious use of DNS mainly include DGA domain names, DNS tunneling, and Fast-Flux. Common network penetration suites such as Metasploit and Cobalt Strike, or open-source software such as iodine, OzymanDNS, dns2tcp, and dnscat2, can quickly and easily construct covert DNS tunnels. Conventional security products include traffic analysis software (e.g., Bro), passive DNS techniques, machine learning, and the like.
Bro (now known as Zeek) is passive, open-source network traffic analysis software that can be used as security monitoring software to inspect the traffic on a link in depth and thereby discover traces of malicious activity. A Bro deployment can log network activity at a high level, including detailed records of each connection and application-layer information, for example HTTP sessions (including the request URI, main header fields, MIME type, and server response), DNS requests and responses, SSL certificates, and the main contents of SMTP sessions; the output format is selectable. Analysis tasks such as DNS tunnel detection can also be written in the scripting language that Bro provides. Common Bro matching rules include a DNS query frequency far above normal, too many distinct subdomains, TXT-type queries, overly long domain names, and abnormal domain name encoding.
Passive DNS is a method for reverse retrieval or querying of DNS data: it reconstructs the DNS data available in the global domain name system into a central database for retrieval and querying. Passive DNS is used for malicious domain name identification, traffic analysis, blacklist analysis, DNS server misconfiguration detection, and the like.
Machine learning refers to learning a DNS tunnel model from historical data in order to detect malicious DNS; commonly used algorithms include Bayesian algorithms.
DNS tunnel detection that monitors abnormally long domain names requested by a terminal, time windows, and the like has a high detection error on the one hand, and on the other hand an attacker can easily bypass such rule-based detection by modifying characteristics such as domain name length and request frequency. The database maintenance cost of passive DNS is extremely high. Machine learning also suffers from a high false alarm rate because abnormal samples are scarce. Bro, passive DNS, and machine learning are all real-time detection methods and must precisely match the malicious domain names accessed by users within a very short time. However, as technologies such as Fast-Flux and DGA are adopted more and more widely by hackers, network attacks become more covert, malicious activity becomes harder to trace, and security risks become more persistent. It is therefore very important to identify malicious domain names quickly and accurately.
Existing security products target domain names or domain name servers and use an algorithm to judge whether they are malicious. Once a domain name is falsely reported as malicious, a large number of alarms are generated, and terminals in the enterprise intranet that access the domain name or domain name server are affected; once a malicious domain name is missed, the malicious DNS behavior persists in the enterprise intranet.
According to the technical scheme of the embodiment of the invention, terminals with abnormal DNS behavior are identified and protected by analyzing historical weblog data, which prevents further activity by malware on the terminal and prevents advanced persistent threat (APT) attacks from lurking in the enterprise intranet over the long term, and also improves the accuracy of the judgment.
According to the technical scheme of the embodiment of the invention, the abnormal DNS behavior of the terminal can be accurately determined according to the DNS query information and the network connection information, which reduces the false alarm rate and the missed detection rate for abnormal DNS behavior, ensures system security, and protects the network from malicious attacks.
According to the technical scheme of the embodiment of the invention, abnormal DNS behavior can be determined by performing feature extraction and information aggregation on the network connection information and the DNS query information of each terminal in the enterprise intranet. Because the judgment is based on the weblogs of the plurality of terminals within the specified time period, it can be determined whether the DNS behaviors of the terminals are relatively consistent, and whether the DNS behavior of a given terminal is abnormal can then be judged against this common behavioral baseline. This improves the accuracy of the judgment, so that the abnormal DNS behavior of a terminal can be accurately determined.
Further, if abnormal DNS behavior occurs on a terminal, the unified management platform sends a threat warning to that terminal and displays a detailed anomaly analysis report, which improves network security and ensures the normal operation of the network.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for analyzing abnormal DNS behavior of a terminal is characterized by comprising the following steps:
acquiring a historical weblog; wherein the historical weblog includes network connection information and DNS query information of a plurality of terminals;
aggregating the network connection information and the DNS query information of the plurality of terminals according to a plurality of preset characteristic vectors respectively, and determining characteristic data of the characteristic vectors corresponding to the plurality of terminals;
and determining abnormal DNS behaviors of the plurality of terminals according to the characteristic data.
2. The method of claim 1, wherein the determining the abnormal DNS behavior of the plurality of terminals according to the characteristic data comprises:
comparing the feature data of the feature vector of each terminal with the feature data of the feature vectors corresponding to a plurality of terminals;
and determining whether the DNS behavior of each terminal is abnormal according to the comparison result.
3. The method according to claim 2, wherein the determining whether the DNS behavior of each of the terminals is an abnormal DNS behavior according to the comparison result comprises:
determining the deviation degree of the feature data of the feature vector of each terminal from the feature data of the feature vectors of the group according to the feature data of the plurality of terminals;
and inputting the deviation degree of each characteristic data of the terminal into an abnormal DNS behavior detection model, and determining whether the DNS behavior of the terminal is abnormal DNS behavior.
4. The method of claim 1, wherein the network connection information comprises one or more of: terminal ID, target IP, target port, protocol type, service type, connection duration, number of bytes sent and number of bytes responded;
the DNS query information includes one or more of: terminal ID, query type, query domain name length, query domain name entropy, response domain name length, RD, RA, TC.
5. The method of claim 1, wherein the feature vector comprises: the total number of DNS queries, and/or the total number of first characteristic queries, the total number of second characteristic queries, the total number of third characteristic queries, the total number of fourth characteristic queries and the total number of fifth characteristic queries determined according to the DNS query information, and/or the query ratios of different TLDs, and/or the total number of connections whose connection duration exceeds a preset connection duration threshold.
6. The method according to claim 5, before aggregating the network connection information and the DNS query information of the plurality of terminals according to a plurality of preset feature vectors, respectively, further comprising:
preprocessing a plurality of query domain names of the DNS query information, and determining TLDs of the plurality of query domain names;
the aggregating the network connection information and the DNS query information of the plurality of terminals according to a plurality of preset feature vectors, and determining feature data of the feature vectors corresponding to the plurality of terminals includes:
and aggregating the TLDs of the query domain name according to the preset different TLD query ratios, and determining the characteristic data of the different TLD query ratios in the characteristic vector corresponding to the terminal.
7. The method of claim 1, further comprising:
and displaying an abnormal DNS behavior analysis report of the terminal under the condition that the terminal has abnormal DNS behavior.
8. An apparatus for analyzing abnormal DNS behavior of a terminal, comprising:
the acquisition module is used for acquiring a historical weblog; wherein the historical weblog includes network connection information and DNS query information of a plurality of terminals;
the aggregation module is used for respectively aggregating the network connection information and the DNS query information of the plurality of terminals according to a plurality of preset feature vectors and determining feature data of the feature vectors corresponding to the plurality of terminals;
and the judging module is used for determining the abnormal DNS behaviors of the terminals according to the characteristic data.
9. An electronic device for analyzing abnormal DNS behavior of a terminal, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202111170497.8A 2021-10-08 2021-10-08 Analysis method and device for abnormal DNS behaviors of terminal Active CN113904843B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111170497.8A CN113904843B (en) 2021-10-08 2021-10-08 Analysis method and device for abnormal DNS behaviors of terminal

Publications (2)

Publication Number Publication Date
CN113904843A true CN113904843A (en) 2022-01-07
CN113904843B CN113904843B (en) 2023-11-14

Family

ID=79190385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111170497.8A Active CN113904843B (en) 2021-10-08 2021-10-08 Analysis method and device for abnormal DNS behaviors of terminal

Country Status (1)

Country Link
CN (1) CN113904843B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114760268A (en) * 2022-04-20 2022-07-15 中国电信股份有限公司 Management method of encrypted domain name system and local DNS (domain name system) equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160065611A1 (en) * 2011-07-06 2016-03-03 Nominum, Inc. Analyzing dns requests for anomaly detection
CN106790062A (en) * 2016-12-20 2017-05-31 国家电网公司 A kind of method for detecting abnormality and system based on the polymerization of inverse dns nailing attribute
CN107071084A (en) * 2017-04-01 2017-08-18 北京神州绿盟信息安全科技股份有限公司 A kind of DNS evaluation method and device
CN109218124A (en) * 2017-07-06 2019-01-15 杨连群 DNS tunnel transmission detection method and device
US20210273865A1 (en) * 2018-07-27 2021-09-02 Nokia Solutions And Networks Oy Method, device, and system for network traffic analysis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
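Bai Fan (白凡): "Research on malicious behavior detection based on DNS analysis" (基于DNS分析恶意行为检测的研究), Telecommunications Network Technology (电信网技术), no. 08 *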

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant