CN114189390A - Domain name detection method, system, equipment and computer readable storage medium - Google Patents
- Publication number
- CN114189390A (application CN202111676182.0A)
- Authority
- CN
- China
- Prior art keywords
- target
- domain name
- grammar
- domain names
- domain
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
- H04L63/1425—Traffic logging, e.g. anomaly detection
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application discloses a domain name detection method, system, device, and computer-readable storage medium. The method determines each target domain name accessed by each target device; aggregates the target domain names to obtain classes of target aggregated domain names; determines target grammar features and target time sequence features of each class of target aggregated domain names; and detects, based on the target grammar features and the target time sequence features, whether a target domain name is a botnet domain name, to obtain a corresponding detection result. Because similar domain names are aggregated together, the grammar features and time sequence features are determined for a group of similar target domain names, and detection is then performed on the basis of those group-level features. This expands the ways in which botnet domain names can be detected and improves the accuracy of botnet domain name detection.
Description
Technical Field
The present application relates to the field of network security technologies, and in particular, to a method, a system, a device, and a computer-readable storage medium for domain name detection.
Background
While a device such as a server is running, an attacker may attack it, for example through a botnet. A botnet is formed when a large number of hosts are infected with bot programs (bots) through one or more propagation means, so that a one-to-many controllable network is formed between the controller and the infected hosts. Access to a network location can be realized by means of a domain name: a domain name (also called a "network domain") is the name of a computer or a group of computers on the Internet, composed of a string of names separated by dots, and is used to locate and identify the computer during data transmission. A user may therefore access a domain name belonging to a botnet, that is, a botnet domain name, and be attacked by the botnet. To protect device security, botnet domain names and the like need to be detected so that security protection can be performed based on the corresponding domain names.
For example, malicious domain names can be extracted by analyzing abnormalities in the traffic released in a sandbox. However, this approach is limited by the sandbox environment and by sample countermeasures such as packing (shell adding), code obfuscation, and execution-link inspection, so the sandbox cannot accurately observe the traffic released by a file sample, and domain names therefore cannot be detected accurately.
In summary, how to improve the accuracy of domain name detection is a problem to be solved urgently by those skilled in the art.
Disclosure of Invention
The application aims to provide a domain name detection method which can solve the technical problem of improving the accuracy of domain name detection to a certain extent. The application also provides a domain name detection system, an electronic device and a computer readable storage medium.
In order to achieve the above purpose, the present application provides the following technical solutions:
a domain name detection method, comprising:
determining each target domain name accessed by each target device;
aggregating the target domain names to obtain various target aggregated domain names;
determining target grammar characteristics and target time sequence characteristics of each type of target aggregation domain name;
and detecting whether the target domain name is a botnet domain name or not based on the target grammar characteristics and the target time sequence characteristics to obtain a corresponding detection result.
Preferably, the aggregating the target domain name to obtain various target aggregated domain names includes:
aggregating the target domain names with the same top-level domain name to obtain various initial aggregation domain names;
and aggregating the target domain names with the same length of each level of domain name in each initial aggregation domain name to obtain the target aggregation domain name.
Preferably, the determining the target grammar characteristics of each type of the target aggregated domain name includes:
splitting the domain names at the second level and above in each target domain name of the target aggregated domain name according to preset splitting characters, to obtain the target character strings corresponding to each target domain name;
and performing grammar pattern extraction on the target character string of each target domain name to obtain the target grammar features corresponding to the target aggregated domain name.
Preferably, the extracting the grammar pattern of the target character string of each target domain name to obtain the target grammar features corresponding to the target aggregated domain name includes at least one of the following processing modes:
if all characters of the target character string are digits, marking the grammar feature of the target character string as: the combination of the number of digits and a preset character;
if the target character string is a word, or the length of the target character string is smaller than a first preset value, marking the grammar feature of the target character string as: the target character string itself;
if the target character string is not a word and the length of the target character string is greater than or equal to the first preset value, marking the grammar feature of the target character string as: the number of characters in the target character string;
combining the grammar feature of the target character string corresponding to the target domain name with the top-level domain name of the target domain name to obtain the target grammar feature of the target domain name.
Preferably, the detecting whether the target domain name is a botnet domain name based on the target grammar feature and the target time sequence feature includes:
aggregating the target domain names corresponding to the same target grammar features to obtain various grammar aggregated domain names;
aggregating, within each class of grammar aggregated domain names, the target domain names whose second-level domain names are identical and are confirmed words, to obtain classes of first grammar sub-aggregation domain names;
if the target time sequence characteristics corresponding to the first grammar sub-aggregation domain name indicate that all three-level domain name modes belonging to the second-level domain name appear on at least a first preset number of target devices, determining that the target domain name corresponding to the first grammar sub-aggregation domain name is a botnet domain name.
Preferably, the detecting whether the target domain name is a botnet domain name based on the target grammar feature and the target time sequence feature includes:
aggregating the target domain names corresponding to the same target grammar features to obtain various grammar aggregated domain names;
taking the target domain names whose second-level domain name is not a confirmed word in each class of grammar aggregated domain names as a separate second grammar sub-aggregation domain name;
if one of the following conditions exists, the target domain name in the second grammar sub-aggregation domain name is a botnet domain name:
the number of target domain names in the second grammar sub-aggregation domain name is greater than a second preset number, and the target time sequence feature indicates that a first occurrence number is greater than a first preset number, where the first occurrence number is the number of the target domain names that appear on the same target device in continuous time;
the number of target domain names in the second grammar sub-aggregation domain name is greater than the second preset number, and the target time sequence feature indicates that a second occurrence number is greater than the second preset number, where the second occurrence number is the number of the target domain names that appear on the same target device at different times;
the number of target domain names in the second grammar sub-aggregation domain name is greater than the second preset number, and the target time sequence feature indicates that a third occurrence number is greater than a third preset number, where the third occurrence number is the number of the target domain names that appear at different times within a preset period on the same target device;
the number of target domain names in the second grammar sub-aggregation domain name is greater than the second preset number, and the target time sequence feature indicates that all the target domain names appear on the same target device at the same time.
Preferably, after detecting whether the target domain name is a botnet domain name based on the target grammar feature and the target time sequence feature and obtaining a corresponding detection result, the method further includes:
carrying out false alarm reduction processing on the domain name based on a preset Chinese lexical library and a white domain name library;
the white domain name library is a domain name library for storing safe domain names.
A domain name detection system, comprising:
the domain name determining module is used for determining each target domain name accessed by each target device;
the aggregation module is used for aggregating the target domain names to obtain various target aggregated domain names;
the characteristic determining module is used for determining target grammar characteristics and target time sequence characteristics of various target aggregation domain names;
and the detection module is used for detecting whether the target domain name is a botnet domain name or not based on the target grammar features and the target time sequence features to obtain a corresponding detection result.
An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the domain name detection method as described in any one of the above when executing the computer program.
A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the domain name detection method according to any one of the preceding claims.
According to the domain name detection method provided by the application, each target domain name accessed by each target device is determined; the target domain names are aggregated to obtain classes of target aggregated domain names; target grammar features and target time sequence features of each class of target aggregated domain names are determined; and whether a target domain name is a botnet domain name is detected based on the target grammar features and the target time sequence features, to obtain a corresponding detection result. Because aggregation groups similar domain names together, the grammar features and time sequence features that are subsequently determined are those of a group of similar target domain names. The method therefore in effect detects botnet domain names from the features of a group of similar domain names, that is, from the characteristics of the botnet domain names themselves. By contrast, extracting malicious domain names by analyzing abnormalities in sandbox-released traffic amounts to detecting botnet domain names from the abnormal result of accessing them. Since the characteristics of a botnet domain name do not change, whereas the result of accessing a botnet domain name may not be detected as abnormal, detecting botnet domain names based on their own characteristics can improve the detection accuracy of botnet domain names. The domain name detection system, electronic device, and computer-readable storage medium provided by the application solve the corresponding technical problems as well.
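For illustration only, the grammar pattern extraction rules listed above (all-digit strings, words or short strings, and long non-word strings) could be sketched in Python as follows. The splitting characters, the preset character `d`, the first preset value of 4, and the tiny word list are all hypothetical placeholders, because the application only states that these values are preset; joining the per-string features with dots is likewise an assumption.

```python
import re

# Hypothetical parameters: the application only says these are "preset".
SPLIT_CHARS = ".-_"                        # assumed preset splitting characters
DIGIT_MARK = "d"                           # assumed preset character for digit strings
FIRST_PRESET_VALUE = 4                     # assumed length threshold
WORDS = {"mail", "www", "update", "cdn"}   # stand-in for a real word list


def string_feature(s: str) -> str:
    """Map one target character string to its grammar feature."""
    if s.isdigit():
        return f"{len(s)}{DIGIT_MARK}"     # all digits: digit count + preset character
    if s in WORDS or len(s) < FIRST_PRESET_VALUE:
        return s                           # word or short string: keep the string itself
    return str(len(s))                     # long non-word: number of characters


def grammar_feature(domain: str) -> str:
    """Combine the per-string features with the top-level domain name."""
    *labels, tld = domain.lower().split(".")
    pattern = f"[{re.escape(SPLIT_CHARS)}]"
    parts = [p for p in re.split(pattern, ".".join(labels)) if p]
    return ".".join([string_feature(p) for p in parts] + [tld])
```

Under these assumed parameters, `grammar_feature("777.abc.com")` yields `3d.abc.com`, so domain names that differ only in their digits or random strings map to the same feature and can be grouped.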
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is an application scenario diagram of a domain name detection scheme provided in an embodiment of the present application;
Fig. 2 is a flowchart of a domain name detection method according to an embodiment of the present disclosure;
Fig. 3 is another flowchart of a domain name detection method according to an embodiment of the present disclosure;
Fig. 4 is a schematic diagram illustrating extraction of a target grammar feature in a domain name detection method according to an embodiment of the present application;
Fig. 5 is a flowchart illustrating a detection process of a botnet domain name in a domain name detection method according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a domain name detection system according to an embodiment of the present application;
Fig. 7 is a schematic diagram of a hardware component structure of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As described in the Background, a device such as a server may be attacked through a botnet, in which a large number of hosts are infected with bot programs so that a one-to-many controllable network is formed between the controller and the infected hosts, and a user who accesses a botnet domain name may be attacked by the botnet. Extracting malicious domain names by analyzing abnormalities in sandbox-released traffic is limited by the sandbox environment and by sample countermeasures such as packing (shell adding), code obfuscation, and execution-link inspection, so domain names cannot be detected accurately in that way. The domain name detection scheme provided by the present application can improve the detection accuracy of botnet domain names.
In the domain name detection scheme of the present application, the system framework adopted is shown in fig. 1 and may include a background server 01 and several user terminals 02 that establish communication connections with the background server 01. It should be noted that the background server may be a server dedicated to domain name detection, and a user terminal may be a server, a terminal device, or the like used by a user, which is not specifically limited herein.
In the present application, the background server 01 is configured to execute the steps of the domain name detection method, including determining each target domain name accessed by each target device; aggregating the target domain names to obtain various target aggregated domain names; determining target grammar characteristics and target time sequence characteristics of various target aggregation domain names; and detecting whether the target domain name is a botnet domain name or not based on the target grammar characteristics and the target time sequence characteristics to obtain a corresponding detection result.
Further, the background server 01 may also be provided with an access record database, a domain name database, a grammar feature database, a time sequence feature database, a botnet domain name database, and the like. The access record database is used for storing the acquired information about each target device's accesses to each target domain name; the domain name database is used for storing the classes of target aggregated domain names; the grammar feature database is used for storing the target grammar features of each class of target aggregated domain names; the time sequence feature database is used for storing the target time sequence features of each class of target aggregated domain names; and the botnet domain name database is used for storing the target domain names detected as botnet domain names. In the present application, the background server 01 may respond to domain name detection requests from one or more user terminals 02. It can be understood that the domain name detection requests initiated by different user terminals 02 may be detection requests for the same domain name, detection requests for different domain names, and the like.
It should be noted that the domain name detection method provided by the present application can also be applied to other scenarios, for example, the domain name detection method can be directly applied to a sandbox to detect a domain name; or the domain name detection method is operated in the equipment in the form of a software client, and is used for carrying out domain name detection on the domain name accessed by the equipment and the like; the present application is not specifically limited herein.
Referring to fig. 2, fig. 2 is a flowchart of a domain name detection method according to an embodiment of the present disclosure.
The domain name detection method provided by the embodiment of the application can comprise the following steps:
step S101: determining each target domain name accessed by each target device.
In practical application, each target domain name accessed by each target device may be determined first, specifically, DNS access traffic of each target device in each time period may be obtained through firewall traffic monitoring, and each target domain name accessed by each target device may be determined according to the DNS access traffic.
It should be noted that the DNS access traffic may include information such as device ID (dev_id), source IP (sip), destination IP (dip), destination port (dport), timestamp (timestamp), DNS request content (queries), DNS request type (qtype), and DNS return result (rdata), which can be flexibly determined according to actual needs. Correspondingly, target devices can be distinguished directly by device ID, source IP, and the like. When determining each target domain name accessed by each target device from the DNS access traffic, the traffic can be filtered so that only target DNS access traffic whose DNS request content is a legal domain name and whose DNS request type is 1 is retained, and the target device and the target domain name it accessed are then extracted from the target DNS access traffic. A legal domain name is a domain name that satisfies the domain name rules; the application is not specifically limited herein.
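For illustration only, the filtering step described above could be sketched in Python as follows. The record layout (one dictionary per DNS request, keyed by the field names listed above) and the regular expression standing in for the "domain name rule" are assumptions; the application does not specify either.

```python
import re
from collections import defaultdict

# Assumed legality rule: dot-separated labels of letters, digits and hyphens,
# ending in an alphabetic top-level domain name.
LEGAL_DOMAIN = re.compile(
    r"^(?=.{1,253}$)([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)


def extract_accesses(records):
    """Return {(dev_id, sip): set of legal domain names requested with qtype 1}."""
    accessed = defaultdict(set)
    for r in records:
        name = r["queries"].lower().rstrip(".")
        # In DNS, request type 1 is an A (IPv4 address) query.
        if r["qtype"] == 1 and LEGAL_DOMAIN.match(name):
            accessed[(r["dev_id"], r["sip"])].add(name)
    return accessed
```

Keying the result by both device ID and source IP follows the remark that target devices can be distinguished by those two fields; a deployment could equally key on one of them.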
Step S102: and aggregating the target domain names to obtain various target aggregated domain names.
In practical application, botnet domain names exhibit community (group) characteristics. Therefore, during domain name detection, the target domain names can be aggregated to obtain classes of target aggregated domain names that share the same characteristics, so that whether a target domain name is a botnet domain name can subsequently be analyzed based on the target aggregated domain names.
In a specific application scenario, in the process of aggregating target domain names to obtain various target aggregated domain names, the target domain names with the same top-level domain name can be aggregated together to obtain various initial aggregated domain names; and aggregating the target domain names with the same length in all levels of domain names in each initial aggregation domain name to obtain the target aggregation domain name.
For ease of understanding, assume the target domain names are [a.1.com, b.2.com, c.3.cn, d.4.cn, e.5.net]. The initial aggregated domain names obtained after aggregation by top-level domain name are: {com: [a.1.com, b.2.com], cn: [c.3.cn, d.4.cn], net: [e.5.net]}. When aggregating, within each initial aggregated domain name, the target domain names whose domain names at every level have the same length, only domain names with the same number of levels and the same length at each level are aggregated into one target aggregated domain name. Assuming the initial aggregated domain names are {com: [a.1.com, bb.22.com, cc.33.com, d.4.com], cn: [cc.3.cn, dd.4.cn]}, the target aggregated domain names are: {com: {1.1.com: [a.1.com, d.4.com], 2.2.com: [bb.22.com, cc.33.com]}, cn: {2.1.cn: [cc.3.cn, dd.4.cn]}}.
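For illustration only, the two-stage aggregation in the example above could be sketched in Python as follows. Treating the last dot-separated label as the top-level domain name and building the group key from the label lengths are assumptions drawn from the example; the function name is hypothetical.

```python
from collections import defaultdict


def aggregate(domains):
    """Group domain names by top-level domain name, then by per-level label lengths."""
    groups = defaultdict(lambda: defaultdict(list))
    for d in domains:
        *labels, tld = d.split(".")
        # Key such as "1.1.com": the length of each level followed by the top-level name.
        key = ".".join([str(len(label)) for label in labels] + [tld])
        groups[tld][key].append(d)
    return {tld: dict(g) for tld, g in groups.items()}
```

Because the key encodes one length per level, domain names with a different number of levels can never share a key, which matches the requirement that only domain names with the same levels and lengths are aggregated.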
Step S103: and determining target grammar characteristics and target time sequence characteristics of various target aggregation domain names.
Step S104: and detecting whether the target domain name is a botnet domain name or not based on the target grammar characteristics and the target time sequence characteristics to obtain a corresponding detection result.
In practical application, the grammar features and time sequence features of botnet domain names have commonalities. Therefore, after the target domain names are aggregated into classes of target aggregated domain names, the target grammar features and target time sequence features of each class can be determined, and whether a target domain name is a botnet domain name can be detected based on them to obtain the corresponding detection result.
It should be noted that, once a domain name is obtained, botnet domain name recognition cannot be performed directly, because nothing is known in advance about whether the domain name belongs to a botnet. For example, given the domain name 777.abc.com, it cannot be directly determined whether it is a botnet domain name, so the domain name must be detected by means of a domain name detection technique. One such technique simulates access to the domain name in a sandbox and checks whether the result is abnormal: if the result matches the abnormality expected from accessing a botnet domain name, the domain name is considered a botnet domain name. However, accessing some botnet domain names produces no abnormal result. For example, if the result of accessing a botnet domain name is that heartbeat traffic is sent, the sandbox considers the result normal and cannot detect the domain name. In contrast, once a botnet domain name has been generated, its grammar features are fixed, and the time sequence features with which a target device accesses it match the attack mode of the botnet, for example the target device accessing the botnet domain name continuously. Therefore, whether a target domain name is a botnet domain name can be detected based on the target grammar features and target time sequence features of the target domain name, to obtain the corresponding detection result.
It should be noted that grammatical features in the present application are features describing the relationships between characters in a string sequence; they are commonly used in parsing and semantic analysis to describe the characteristics of a string sequence along a certain dimension. A time series (temporal sequence) is a sequence formed by arranging the values of the same statistical indicator in order of occurrence time. Accordingly, when determining the target time sequence features of a target domain name, the time information of each target device's accesses to the target domain name can be determined first, and all timestamps at which the same target device accessed the target domain name can be aggregated to obtain the time series of that device's accesses; analyzing this time series yields the corresponding target time sequence features. For example, if the time series of target device a accessing target domain name a is [1s, 3s, 5s], that is, target device a accessed target domain name a at 1s, 3s, and 5s after the start of the statistical period, the target time sequence feature of target device a accessing target domain name a can be determined as: one access every 2s. The specific time sequence features and the way they are determined may be chosen according to actual needs, and the present application is not specifically limited herein.
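For illustration only, building the per-device time series and deriving a simple "one access every N seconds" feature could be sketched in Python as follows. The event tuple layout, the function names, and the tolerance parameter are hypothetical; the application leaves the specific time sequence features open.

```python
from collections import defaultdict


def access_series(events):
    """Collect sorted timestamps per (device, domain) from (device, domain, ts) tuples."""
    series = defaultdict(list)
    for device, domain, ts in events:
        series[(device, domain)].append(ts)
    return {key: sorted(stamps) for key, stamps in series.items()}


def fixed_interval(timestamps, tolerance=0):
    """Return the first gap if all consecutive gaps agree within tolerance, else None."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return None                      # too few accesses to call it periodic
    if max(gaps) - min(gaps) <= tolerance:
        return gaps[0]
    return None
```

For the series [1, 3, 5] from the example, the gaps are both 2 seconds, so `fixed_interval` reports an interval of 2.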
In a specific application scenario, in order to further ensure the accuracy of the detected botnet domain names, after whether the target domain name is a botnet domain name has been detected based on the target grammatical features and the target time sequence features and a corresponding detection result has been obtained, false alarm reduction processing may also be performed on the domain name based on a preset Chinese lexical library and a preset white domain name library. Specifically, if the target domain name is detected as a botnet domain name, and it is determined based on the Chinese lexical library that the target domain name is not composed entirely of Chinese words, and the target domain name does not belong to the white domain name library, the target domain name may be confirmed as a botnet domain name. Correspondingly, if the target domain name is detected as a botnet domain name, but it is determined based on the Chinese lexical library that the target domain name is composed entirely of Chinese words, or the target domain name belongs to the white domain name library, the detection result of the target domain name may be changed to a non-botnet domain name. The white domain name library is a domain name library that stores safe domain names, for example, domain names with ICP records, domain names of master station networks, and common functional domain names such as those used for NTP, for querying a device's own IP address, or by Microsoft, and the like. That is, if a domain name whose detection result is a botnet domain name appears in the white domain name library, the detection result of that domain name may be adjusted to a non-botnet domain name, and so on.
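The false alarm reduction rule above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, a domain name is treated as "composed entirely of words" when every label other than the top-level domain name is found in the lexical library, and a domain name is treated as belonging to the white domain name library when it equals an entry or is a subdomain of an entry. The present application does not specify these matching rules.

```python
def made_of_lexicon_words(domain, lexicon):
    """True if every label except the top-level one is in the lexicon.

    Treating each whole label as one word is a simplifying assumption.
    """
    labels = domain.lower().split(".")[:-1]
    return bool(labels) and all(label in lexicon for label in labels)


def is_whitelisted(domain, whitelist):
    """True if the domain equals a whitelist entry or is a subdomain of one.

    Subdomain matching is an assumption made for this sketch.
    """
    domain = domain.lower()
    return any(domain == entry or domain.endswith("." + entry)
               for entry in whitelist)


def confirm_botnet(domain, flagged, lexicon, whitelist):
    """Apply false alarm reduction to an initial detection result.

    flagged is the result of the grammar and timing based detection.
    """
    if not flagged:
        return False
    if made_of_lexicon_words(domain, lexicon):
        return False  # composed entirely of known words
    if is_whitelisted(domain, whitelist):
        return False  # known safe domain name
    return True


# Hypothetical library contents used only for the example.
lexicon = {"baidu", "tieba"}
whitelist = {"ntp.org"}
result = confirm_botnet("x7f3kq.abc.com", True, lexicon, whitelist)
```

In this sketch a flagged domain name keeps its botnet result only when both checks fail, which mirrors the two-sided rule described above.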
In the domain name detection method provided by the present application, each target domain name accessed by each target device is determined; the target domain names are aggregated to obtain various target aggregated domain names; the target grammatical features and target time sequence features of the various target aggregated domain names are determined; and whether the target domain name is a botnet domain name is detected based on the target grammatical features and the target time sequence features to obtain a corresponding detection result. Because the target domain names are aggregated into various target aggregated domain names, similar domain names are grouped together, so the grammatical features and time sequence features that are subsequently determined are those of a group of similar target domain names. Detecting whether the target domain name is a botnet domain name based on these features therefore amounts to judging a domain name by the grammatical and time sequence features of the group of similar domain names to which it belongs. In other words, botnet domain names are detected based on their own features, which extends the available ways of detecting botnet domain names. The approach currently in use extracts malicious domain names by analyzing abnormalities in the traffic released by a sandbox, which amounts to detecting botnet domain names from abnormal access results. Since the features of a botnet domain name do not change, whereas the result of accessing a botnet domain name may not be detected as abnormal, detecting botnet domain names based on their own features can improve the detection accuracy of botnet domain names.
Referring to fig. 3, fig. 3 is another flowchart of a domain name detection method according to an embodiment of the present disclosure.
The domain name detection method provided by the embodiment of the application can comprise the following steps:
Step S201: determining each target domain name accessed by each target device.
Step S202: aggregating the target domain names having the same top-level domain name to obtain various initial aggregation domain names.
Step S203: within each initial aggregation domain name, aggregating the target domain names whose domain names at every level have the same length to obtain the target aggregated domain names.
Step S204: segmenting the second-level and higher-level domain names of each target domain name in the target aggregated domain names, using "." as the separator, to obtain the pieces of character string information corresponding to each target domain name.
Step S205: if the character string information contains other separators, segmenting the character string information according to the other separators to obtain the target character strings; if the character string information does not contain other separators, using the character string information directly as the target character string.
In practical application, in the process of determining the target grammatical features of the various target aggregated domain names, because the grammatical features of a domain name are embodied in the character strings of the domain name, the second-level and higher-level domain names of each target domain name in the target aggregated domain names can be segmented according to preset separators to obtain the target character strings corresponding to each target domain name; grammar pattern extraction is then performed on the target character strings of each target domain name to obtain the target grammatical features corresponding to the target aggregated domain names.
In practical application, the type of the preset separators may be determined according to actual needs. For example, "." is used as the separator between the levels of a domain name, and the character string at each level reflects the grammatical features of the domain name; in the domain name bb.22.com, for instance, the second-level domain name is pure digits and the third-level domain name is pure letters. Therefore, in the process of segmenting the second-level and higher-level domain names of each target domain name according to the preset separators, "." may first be used as the separator to obtain the pieces of character string information corresponding to each target domain name. A level of a target domain name may itself be spliced together from several parts. For example, if the target domain name is server-111.abc.com, the pieces of character string information obtained by segmenting on "." are server-111 and abc. Here server-111 consists of characters joined by a separator, so the grammatical feature obtained from it would not be the finest possible. In this case the character string information can be segmented further to obtain target character strings of the smallest unit, from which grammatical features of the finest granularity are obtained. That is, if the character string information contains other separators, the character string information may be segmented according to the other separators to obtain the target character strings; if the character string information does not contain other separators, the character string information is used directly as the target character string. In this case, the preset separators in the present application consist of "." and the other separators.
It should be noted that the type of the other separators can be determined according to actual needs, for example, "-", "/", and the like, and the present application is not specifically limited herein.
Step S206: performing grammar pattern extraction on the target character strings of each target domain name to obtain the target grammatical features corresponding to the target aggregated domain names.
Step S207: determining the target time sequence features of the various target aggregated domain names.
Step S208: detecting whether the target domain name is a botnet domain name based on the target grammatical features and the target time sequence features to obtain a corresponding detection result.
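For ease of understanding, the aggregation of steps S202 and S203 and the segmentation of steps S204 and S205 can be sketched as follows. This is only an illustrative sketch: the function names `aggregate` and `split_labels` are hypothetical, the last "."-separated label is assumed to be the top-level domain name (multi-label suffixes are not handled), and the set of other separators is assumed to be "-" and "/".

```python
import re
from collections import defaultdict

# Assumed set of "other separators"; the application leaves this configurable.
OTHER_SEPARATORS = "-/"


def aggregate(domains):
    """Group domains by top-level domain, then by the length of each level.

    The group key is (top-level domain, tuple of label lengths), so two
    domains share a group only if every level has the same length.
    """
    groups = defaultdict(list)
    for domain in domains:
        labels = domain.split(".")
        tld = labels[-1]  # assumption: the last label is the top-level domain
        lengths = tuple(len(label) for label in labels[:-1])
        groups[(tld, lengths)].append(domain)
    return dict(groups)


def split_labels(domain):
    """Split the levels above the top-level domain into target strings.

    Each level is first obtained by splitting on ".", then split again
    on the other separators if any are present.
    """
    pattern = "[" + re.escape(OTHER_SEPARATORS) + "]"
    levels = domain.split(".")[:-1]
    return [[part for part in re.split(pattern, level) if part]
            for level in levels]


groups = aggregate(["ab.12.com", "cd.34.com", "abc.12.com", "ab.12.net"])
pieces = split_labels("server-111.abc.com")
```

In this sketch ab.12.com and cd.34.com fall into the same group, while abc.12.com and ab.12.net each form their own group, and server-111.abc.com is segmented into the target character strings server and 111 for the third level and abc for the second level.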
Referring to fig. 4, fig. 4 is a schematic diagram illustrating extraction of a target grammar feature in a domain name detection method according to an embodiment of the present application.
In the domain name detection method provided by the embodiment of the present application, in the process of extracting the grammar pattern of the target character string of each target domain name to obtain the target grammar features corresponding to the target aggregated domain name, the following steps may be specifically performed:
Step S301: if all characters of the target character string are digits, marking the grammatical feature of the target character string as: the number of digits combined with a preset character.
In practical application, after the target character string is obtained, if all characters of the target character string are digits, the grammatical feature of the target character string can be marked as: the number of digits combined with a preset character. Assuming that the preset character indicating that a target character string is pure digits is "*", the grammatical feature of the target character string 194820 can be marked as: 6*, and so on.
Step S302: if the target character string is a word, or the length of the target character string is smaller than a first preset value, marking the grammatical feature of the target character string as: the target character string itself.
In practical application, after the target character string is obtained, if the target character string is a word, or the length of the target character string is smaller than the first preset value, the grammatical feature of the target character string can be marked as the target character string itself. That is, the target character string may be used directly as its own grammatical feature; for example, if the target character string is a keyword, the grammatical feature of the target character string is that keyword. The first preset value may be determined according to actual needs and may be, for example, 6. When the target character string is not a word but its length is smaller than 6, the target character string may likewise be used directly as its grammatical feature; for example, when the target character string is yyds, yyds may be used directly as the grammatical feature of the target character string, and so on.
Step S303: if the target character string is not a word and the length of the target character string is greater than or equal to the first preset value, marking the grammatical feature of the target character string as: the number of characters in the target character string.
In practical application, after the target character string is obtained, if the target character string is not a word and the length of the target character string is greater than or equal to the first preset value, the grammatical feature of the target character string is marked as the number of characters in the target character string. For ease of understanding, assuming that the target character string is abcdefg, its grammatical feature may be 7. It should be noted that in this case the grammatical feature does not carry the preset character, so a grammatical feature without the preset character can be interpreted as the feature of a target character string that is not a word and whose length is greater than or equal to the first preset value.
Step S304: combining the grammatical features of the target character strings corresponding to the target domain name with the top-level domain name of the target domain name to obtain the target grammatical feature of the target domain name.
In practical application, after the grammatical features of the target character strings are obtained, the grammatical features of the target character strings corresponding to the target domain name can be combined with the top-level domain name of the target domain name to obtain the target grammatical feature of the target domain name. Taking the target domain name 194820.com as an example, the target grammatical feature is 6*.com; taking the target domain name keyword.com as an example, the target grammatical feature is keyword.com; and taking the target domain name abcdefg.com as an example, the target grammatical feature is 7.com, and so on.
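Steps S301 to S304 can be sketched as follows. This is a minimal sketch under stated assumptions and not the implementation of the present application: the function names are hypothetical, the word check is a lookup in a caller-supplied set, the preset character for pure-digit strings is assumed to be "*", the first preset value is assumed to be 6, and the other separators ("-" and "/", also an assumption) are kept in place when the per-string features are joined back together.

```python
import re


def string_feature(target, words, min_len=6, digit_marker="*"):
    """Return the grammatical feature of one target character string."""
    if target.isdigit():
        # S301: digit count combined with the preset character (assumed "*").
        return f"{len(target)}{digit_marker}"
    if target in words or len(target) < min_len:
        # S302: words and short strings are kept as they are.
        return target
    # S303: long non-word strings are replaced by their character count.
    return str(len(target))


def domain_feature(domain, words, min_len=6, digit_marker="*"):
    """Return the target grammatical feature of a whole domain name (S304)."""
    *levels, tld = domain.split(".")
    features = []
    for level in levels:
        # The capturing group keeps "-" and "/" in the split result so
        # that they can be put back unchanged.
        parts = re.split(r"([-/])", level)
        features.append("".join(
            part if part in ("-", "/", "")
            else string_feature(part, words, min_len, digit_marker)
            for part in parts))
    return ".".join(features + [tld])


# Hypothetical word list used only for the example.
words = {"keyword", "server"}
examples = [domain_feature(d, words)
            for d in ("194820.com", "keyword.com", "abcdefg.com")]
```

Under these assumptions the three example domain names map to 6*.com, keyword.com and 7.com respectively. How the per-string features of one level are rejoined is a design choice of this sketch; keeping the original separators makes the resulting pattern easy to compare across domain names.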
Referring to fig. 5, fig. 5 is a flowchart illustrating a detection process of a botnet domain name in a domain name detection method according to an embodiment of the present application.
In the domain name detection method provided in the embodiment of the present application, in a process of detecting whether a target domain name is a botnet domain name based on a target grammar feature and a target timing feature, the following steps may be performed:
Step S401: aggregating the target domain names corresponding to the same target grammatical feature to obtain various grammar aggregation domain names.
In practical application, in the process of detecting whether the target domain name is a botnet domain name based on the target grammatical features and the target time sequence features, the target domain names corresponding to the same target grammatical feature may be aggregated to obtain various grammar aggregation domain names. For ease of understanding, assuming that the target domain names are anceq.1fdwrt.com, 1jfhqk.jfuqrt.com, server-111.abc.com and server-123.abc.com, the grammar aggregation domain names may be {6.6.com: [anceq.1fdwrt.com, 1jfhqk.jfuqpo.com], server-3*.abc.com: [server-111.abc.com, server-123.abc.com]}.
It should be noted that, because botnet domain names are generated in batches, a class of grammar aggregation domain names that reflects botnet domain names will contain a number of domain names; for example, the grammar aggregation domain name 6.6.com may contain 13 domain names. Therefore, after the target domain names corresponding to the same target grammatical feature are aggregated to obtain the various grammar aggregation domain names, botnet domain names may be preliminarily screened based on the number of domain names in each grammar aggregation domain name, so as to improve the operating efficiency of the method.
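The aggregation of step S401 and the preliminary screening by group size can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, the grammatical feature of each domain name is assumed to have been computed already and supplied as a mapping, and the screening threshold of 3 is an assumed value, since the present application does not fix one here.

```python
from collections import defaultdict


def group_by_feature(domain_features):
    """Aggregate domain names that share the same grammatical feature.

    domain_features maps each domain name to its precomputed feature.
    """
    groups = defaultdict(list)
    for domain, feature in domain_features.items():
        groups[feature].append(domain)
    return dict(groups)


def prescreen(groups, min_count=3):
    """Keep only the groups that contain at least min_count domain names.

    min_count is a hypothetical threshold chosen for the example.
    """
    return {feature: domains for feature, domains in groups.items()
            if len(domains) >= min_count}


# Hypothetical precomputed features used only for the example.
features = {
    "qwerty.asdfgh.com": "6.6.com",
    "zxcvbn.poiuyt.com": "6.6.com",
    "mnbvcx.lkjhgf.com": "6.6.com",
    "server-111.abc.com": "server-3*.abc.com",
}
candidates = prescreen(group_by_feature(features))
```

Only the groups that survive this screening need to go through the time sequence checks, which is where the efficiency gain mentioned above comes from.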
Step S402: aggregating the target domain names in each grammar aggregation domain name whose second-level domain names are identical and are confirmed words to obtain various first grammar sub-aggregation domain names, and if the target time sequence features corresponding to a first grammar sub-aggregation domain name indicate that all third-level domain name patterns subordinate to the second-level domain name appear on at least a first preset number of target devices, determining that the target domain names corresponding to the first grammar sub-aggregation domain name are botnet domain names.
In practical application, in the process of performing botnet detection based on the grammar aggregation domain names, the grammar aggregation domain names may be classified according to whether the second-level domain name is a confirmed word, and each class is detected in its own way. That is, the target domain names in each grammar aggregation domain name whose second-level domain names are identical and are confirmed words are aggregated together to obtain various first grammar sub-aggregation domain names. If the target time sequence features corresponding to a first grammar sub-aggregation domain name indicate that all third-level domain name patterns subordinate to the second-level domain name appear on at least a first preset number of target devices, for example, the target time sequence features indicate that the sets of domain names under all subordinate third-level domain name patterns are accessed by the same 2 or more hosts, the target domain names corresponding to the first grammar sub-aggregation domain name are determined to be botnet domain names.
Step S403: taking the target domain names in each grammar aggregation domain name whose second-level domain names are non-confirmed words as a separate second grammar sub-aggregation domain name.
Step S404: if the number of target domain names in the second grammar sub-aggregation domain name is greater than a second preset number and any one of the following holds, determining that the target domain names in the second grammar sub-aggregation domain name are botnet domain names: (1) the target time sequence features indicate that the number of times the target domain name appears within a continuous time on the same target device is greater than a first preset number of times; (2) the target time sequence features indicate that the number of times the target domain name appears at different times on the same target device is greater than a second preset number of times; (3) the target time sequence features indicate that the number of times the target domain name appears at different times within a preset period on the same target device is greater than a third preset number of times; or (4) the target time sequence features indicate that all of the target domain names appear on the same target device at the same time.
In practical application, for the target domain names in each grammar aggregation domain name whose second-level domain names are non-confirmed words, that is, for each second grammar sub-aggregation domain name, the domain names are regarded as botnet domain names as long as any one of the following four conditions is satisfied. The four conditions are respectively as follows:
1. the number of target domain names in the second grammar sub-aggregation domain name is greater than the second preset number, and the target time sequence features indicate that the number of times the target domain name appears within a continuous time on the same target device is greater than the first preset number of times; for example, there are 3 or more domain names under a certain grammatical feature, and the domain name appears 2 or more times within a continuous time (within 3s) on the same host;
2. the number of target domain names in the second grammar sub-aggregation domain name is greater than the second preset number, and the target time sequence features indicate that the number of times the target domain name appears at different times on the same target device is greater than the second preset number of times; for example, there are 3 or more domain names under a certain grammatical feature, and the domain name appears 5 or more times at different times on the same host;
3. the number of target domain names in the second grammar sub-aggregation domain name is greater than the second preset number, and the target time sequence features indicate that the number of times the target domain name appears at different times within a preset period on the same target device is greater than the third preset number of times; for example, there are 3 or more domain names under a certain grammatical feature, and the domain name appears 20 or more times per day at different times on the same host;
4. the number of target domain names in the second grammar sub-aggregation domain name is greater than the second preset number, and the target time sequence features indicate that all of the target domain names appear on the same target device at the same time; for example, there are 3 or more domain names under a certain grammatical feature, and the domain names appear together at the same time on the same host, and so on.
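The four conditions above can be sketched as a single check on one second grammar sub-aggregation domain name. This is a minimal sketch under stated assumptions and not the implementation of the present application: the function name `is_botnet_group` is hypothetical, access records are assumed to be (host, domain name, timestamp in seconds) tuples, the thresholds follow the "or more" wording of the examples (3 domain names, 2 accesses within 3s, 5 accesses, 20 accesses per day), and accesses are counted per host across all domain names in the group, which is one possible reading of the conditions.

```python
from collections import defaultdict


def is_botnet_group(domains, events, min_domains=3, window=3, burst=2,
                    total=5, daily=20, day=86400):
    """Return True if the group meets any of the four timing conditions."""
    group = set(domains)
    if len(group) < min_domains:
        return False  # shared precondition of all four conditions

    by_host = defaultdict(list)
    for host, domain, timestamp in events:
        if domain in group:
            by_host[host].append((timestamp, domain))

    for hits in by_host.values():
        hits.sort()
        times = [timestamp for timestamp, _ in hits]

        # Condition 1: at least `burst` accesses within `window` seconds.
        if any(times[i + burst - 1] - times[i] <= window
               for i in range(len(times) - burst + 1)):
            return True

        # Condition 2: at least `total` accesses at distinct times.
        distinct = sorted(set(times))
        if len(distinct) >= total:
            return True

        # Condition 3: at least `daily` distinct access times in one day.
        per_day = defaultdict(set)
        for timestamp in distinct:
            per_day[timestamp // day].add(timestamp)
        if any(len(stamps) >= daily for stamps in per_day.values()):
            return True

        # Condition 4: every domain in the group seen at the same instant.
        seen_at = defaultdict(set)
        for timestamp, domain in hits:
            seen_at[timestamp].add(domain)
        if any(seen == group for seen in seen_at.values()):
            return True

    return False


group = ["a1.x.com", "b2.x.com", "c3.x.com"]
flagged = is_botnet_group(group, [("h1", "a1.x.com", 0), ("h1", "b2.x.com", 2)])
```

In the example, two accesses by host h1 fall within 3 seconds of each other, so the first condition holds. All thresholds are parameters so that they can be set according to actual needs.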
Referring to fig. 6, fig. 6 is a schematic structural diagram of a domain name detection system according to an embodiment of the present disclosure.
The domain name detection system provided by the embodiment of the application can include:
a domain name determining module 101, configured to determine each target domain name accessed by each target device;
the aggregation module 102 is configured to aggregate the target domain names to obtain various target aggregated domain names;
the characteristic determining module 103 is used for determining target grammar characteristics and target time sequence characteristics of various target aggregation domain names;
the detection module 104 is configured to detect whether the target domain name is a botnet domain name based on the target grammar feature and the target timing feature, and obtain a corresponding detection result.
In the domain name detection system provided in an embodiment of the present application, the aggregation module may include:
the first aggregation unit is used for aggregating target domain names with the same top-level domain name to obtain various initial aggregation domain names;
and the second aggregation unit is used for aggregating the target domain names with the same length in all levels of domain names in each initial aggregation domain name to obtain the target aggregation domain name.
In the domain name detection system provided in an embodiment of the present application, the feature determining module may include:
the first segmentation unit is used for segmenting more than two levels of domain names in each target domain name in the target aggregation domain name according to preset segmentation symbols to obtain each target character string corresponding to each target domain name;
and the first extraction unit is used for performing grammar pattern extraction on the target character strings of each target domain name to obtain the target grammatical features corresponding to the target aggregation domain name.
In the domain name detection system provided in the embodiment of the present application, the first segmentation unit may be specifically configured to: segment the second-level and higher-level domain names of each target domain name in the target aggregated domain names, using "." as the separator, to obtain the pieces of character string information corresponding to each target domain name; if the character string information contains other separators, segment the character string information according to the other separators to obtain the target character strings; and if the character string information does not contain other separators, use the character string information directly as the target character string.
In the domain name detection system provided in the embodiment of the present application, the first extraction unit may be specifically configured to: if all characters of the target character string are numbers, marking the grammar characteristics of the target character string as: the combination of the number of the digits and the preset characters; if the character in the target character string is a word or the length of the target character string is smaller than a first preset value, marking the grammar characteristics of the target character string as: a target character string; if the characters in the target character string are not words and the length of the target character string is greater than or equal to a first preset value, marking the grammar features of the target character string as: the character number value of the target character string; combining the grammar characteristics of the target character string corresponding to the target domain name with the top-level domain name of the target domain name to obtain the target grammar characteristics of the target domain name.
In the domain name detection system provided in an embodiment of the present application, the detection module may include:
the third aggregation unit is used for aggregating the target domain names corresponding to the same target grammar characteristics to obtain various grammar aggregation domain names;
the first detection unit is used for aggregating target domain names which are identical in the secondary domain name and are confirmed words in each grammar aggregation domain name to obtain various first grammar sub-aggregation domain names, and if target time sequence characteristics corresponding to the first grammar sub-aggregation domain names represent that all three-level domain name modes subordinate to the secondary domain name appear on at least a first preset number of target devices, determining that the target domain name corresponding to the first grammar sub-aggregation domain name is a botnet domain name;
the first determining unit is used for taking a target domain name with a secondary domain name of a non-confirmed word in each grammar aggregation domain name as an independent second grammar sub-aggregation domain name;
a second detecting unit, configured to determine that the target domain names in the second grammar sub-aggregation domain name are botnet domain names if the number of target domain names in the second grammar sub-aggregation domain name is greater than a second preset number and any one of the following holds: (1) the target time sequence features indicate that the number of times the target domain name appears within a continuous time on the same target device is greater than a first preset number of times; (2) the target time sequence features indicate that the number of times the target domain name appears at different times on the same target device is greater than a second preset number of times; (3) the target time sequence features indicate that the number of times the target domain name appears at different times within a preset period on the same target device is greater than a third preset number of times; or (4) the target time sequence features indicate that all of the target domain names appear on the same target device at the same time.
The domain name detection system provided in the embodiment of the present application may further include:
a false alarm reduction module, configured to perform false alarm reduction processing on the domain name based on a preset Chinese lexical library and a preset white domain name library after the detection module has detected, based on the target grammatical features and the target time sequence features, whether the target domain name is a botnet domain name and has obtained the corresponding detection result; the white domain name library is a domain name library that stores safe domain names.
Based on the hardware implementation of the program module, and in order to implement the method according to the embodiment of the present invention, an embodiment of the present invention further provides an electronic device, fig. 7 is a schematic diagram of a hardware composition structure of the electronic device according to the embodiment of the present invention, and as shown in fig. 7, the electronic device includes:
a communication interface 1 capable of information interaction with other devices such as network devices and the like;
and a processor 2, connected to the communication interface 1 to implement information interaction with other devices, and configured to execute, when running a computer program, the domain name detection method provided by one or more of the foregoing technical solutions, the computer program being stored in the memory 3.
In practice, of course, the various components in the electronic device are coupled together by the bus system 4. It will be appreciated that the bus system 4 is used to enable connection communication between these components. The bus system 4 comprises, in addition to a data bus, a power bus, a control bus and a status signal bus. For the sake of clarity, however, the various buses are labeled as bus system 4 in fig. 7.
The memory 3 in the embodiment of the present invention is used to store various types of data to support the operation of the electronic device. Examples of such data include: any computer program for operating on an electronic device.
It will be appreciated that the memory 3 may be either volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferromagnetic Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be a disk memory or a tape memory. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory 3 described in the embodiments of the present invention is intended to include, without being limited to, these and any other suitable types of memory.
The method disclosed by the above embodiment of the present invention can be applied to the processor 2, or implemented by the processor 2. The processor 2 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 2. The processor 2 described above may be a general purpose processor, a DSP, or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. The processor 2 may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed by the embodiment of the invention can be directly implemented by a hardware decoding processor, or can be implemented by combining hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 3, and the processor 2 reads the program in the memory 3 and in combination with its hardware performs the steps of the aforementioned method.
When the processor 2 executes the program, the corresponding processes in the methods according to the embodiments of the present invention are realized, and for brevity, are not described herein again.
In an exemplary embodiment, the present invention further provides a storage medium, i.e. a computer storage medium, in particular a computer readable storage medium, for example comprising a memory 3 storing a computer program, which is executable by a processor 2 to perform the steps of the aforementioned method. The computer readable storage medium may be Memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface Memory, optical disk, or CD-ROM.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus, terminal and method may be implemented in other manners. The above-described device embodiments are only illustrative, for example, the division of the unit is only one logical function division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that all or part of the steps of the method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer readable storage medium and, when executed, performs the steps of the method embodiments. The aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.
Alternatively, the integrated unit of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a separate product. Based on such understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for enabling an electronic device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. The aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.
For a description of a relevant part in the domain name detection system, the domain name detection device, and the computer readable storage medium provided in the embodiments of the present application, reference is made to detailed descriptions of a corresponding part in the domain name detection method provided in the embodiments of the present application, and details are not repeated here. In addition, parts of the above technical solutions provided in the embodiments of the present application, which are consistent with the implementation principles of corresponding technical solutions in the prior art, are not described in detail so as to avoid redundant description.
It is further noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A domain name detection method, comprising:
determining the target domain names accessed by each target device;
aggregating the target domain names to obtain classes of target aggregated domain names;
determining target grammar features and target time sequence features of each class of target aggregated domain names;
and detecting, based on the target grammar features and the target time sequence features, whether the target domain name is a botnet domain name, to obtain a corresponding detection result.
2. The method according to claim 1, wherein aggregating the target domain names to obtain classes of target aggregated domain names comprises:
aggregating the target domain names that have the same top-level domain name to obtain classes of initial aggregated domain names;
and, within each class of initial aggregated domain names, aggregating the target domain names whose domain names at each level have the same length, to obtain the target aggregated domain names.
3. The method of claim 2, wherein determining the target grammar features of each class of target aggregated domain names comprises:
splitting, according to preset separator characters, the domain names at the second level and above in each target domain name in the target aggregated domain names, to obtain the target character strings corresponding to each target domain name;
and performing grammar pattern extraction on the target character strings of each target domain name to obtain the target grammar features corresponding to the target aggregated domain names.
4. The method according to claim 3, wherein performing grammar pattern extraction on the target character strings of each target domain name to obtain the target grammar features corresponding to the target aggregated domain names comprises at least one of the following processing modes:
if all characters of the target character string are digits, marking the grammar feature of the target character string as the combination of the number of digits and a preset character;
if the target character string is a word, or the length of the target character string is smaller than a first preset value, marking the grammar feature of the target character string as the target character string itself;
if the target character string is not a word and its length is greater than or equal to the first preset value, marking the grammar feature of the target character string as the number of characters in the target character string;
combining the grammar features of the target character strings corresponding to the target domain name with the top-level domain name of the target domain name to obtain the target grammar feature of the target domain name.
5. The method of claim 4, wherein detecting whether the target domain name is a botnet domain name based on the target grammar features and the target time sequence features comprises:
aggregating the target domain names that correspond to the same target grammar feature to obtain classes of grammar aggregated domain names;
within each class of grammar aggregated domain names, aggregating the target domain names whose second-level domain names are identical and are confirmed words, to obtain classes of first grammar sub-aggregated domain names;
if the target time sequence features corresponding to a first grammar sub-aggregated domain name indicate that all third-level domain name patterns under the second-level domain name appear on at least a first preset number of target devices, determining that the target domain names corresponding to the first grammar sub-aggregated domain name are botnet domain names.
6. The method of claim 4, wherein detecting whether the target domain name is a botnet domain name based on the target grammar features and the target time sequence features comprises:
aggregating the target domain names that correspond to the same target grammar feature to obtain classes of grammar aggregated domain names;
within each class of grammar aggregated domain names, taking the target domain names whose second-level domain name is not a confirmed word as a separate second grammar sub-aggregated domain name;
if any one of the following conditions is met, determining that the target domain names in the second grammar sub-aggregated domain name are botnet domain names:
the number of target domain names in the second grammar sub-aggregated domain name is greater than a second preset number, and the target time sequence features indicate that a first occurrence number is greater than a first preset number, wherein the first occurrence number is the number of occurrences of the target domain names on the same target device in consecutive time;
the number of target domain names in the second grammar sub-aggregated domain name is greater than the second preset number, and the target time sequence features indicate that a second occurrence number is greater than the second preset number, wherein the second occurrence number is the number of occurrences of the target domain names on the same target device at different times;
the number of target domain names in the second grammar sub-aggregated domain name is greater than the second preset number, and the target time sequence features indicate that a third occurrence number is greater than a third preset number, wherein the third occurrence number is the number of occurrences of the target domain names at different times within a preset period on the same target device;
the number of target domain names in the second grammar sub-aggregated domain name is greater than the second preset number, and the target time sequence features indicate that all the target domain names appear on the same target device at the same time.
7. The method according to any one of claims 1 to 6, further comprising, after detecting whether the target domain name is a botnet domain name based on the target grammar features and the target time sequence features to obtain a corresponding detection result:
performing false alarm reduction processing on the domain name based on a preset Chinese lexical library and a white domain name library;
wherein the white domain name library is a domain name library that stores safe domain names.
8. A domain name detection system, comprising:
a domain name determining module, used for determining the target domain names accessed by each target device;
an aggregation module, used for aggregating the target domain names to obtain classes of target aggregated domain names;
a feature determining module, used for determining target grammar features and target time sequence features of each class of target aggregated domain names;
and a detection module, used for detecting, based on the target grammar features and the target time sequence features, whether the target domain name is a botnet domain name, to obtain a corresponding detection result.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the domain name detection method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the domain name detection method according to any one of claims 1 to 7.
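The aggregation of claim 2 and the grammar pattern extraction of claims 3 and 4 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the word list `KNOWN_WORDS`, the first preset value `MIN_LEN`, the preset character `DIGIT_MARK`, and the separator set `SEPARATORS` are all hypothetical, as are the function names. The sketch treats every label other than the top-level domain as a level to be split.

```python
import re
from collections import defaultdict

# Hypothetical values; the claims only call them "preset".
KNOWN_WORDS = {"mail", "update", "cdn", "example"}  # stand-in for a word library
MIN_LEN = 4          # assumed "first preset value"
DIGIT_MARK = "d"     # assumed "preset character" appended to a digit count
SEPARATORS = r"[-_]" # assumed "preset separator characters" inside a label


def split_labels(domain):
    """Lower-case a domain name and split it into dot-separated labels."""
    return domain.lower().rstrip(".").split(".")


def aggregate(domains):
    """Claim 2 sketch: group by top-level domain, then by per-level label lengths."""
    groups = defaultdict(list)
    for domain in domains:
        labels = split_labels(domain)
        key = (labels[-1], tuple(len(label) for label in labels[:-1]))
        groups[key].append(domain)
    return dict(groups)


def token_feature(token):
    """Claim 4 sketch: map one character string to its grammar feature."""
    if token.isdigit():
        return f"{len(token)}{DIGIT_MARK}"   # all digits: count plus preset character
    if token in KNOWN_WORDS or len(token) < MIN_LEN:
        return token                          # word or short string: keep as is
    return str(len(token))                    # long non-word: character count only


def grammar_feature(domain):
    """Claims 3 and 4 sketch: split each non-TLD label, then recombine with the TLD."""
    labels = split_labels(domain)
    parts = []
    for label in labels[:-1]:
        tokens = [t for t in re.split(SEPARATORS, label) if t]
        parts.append("-".join(token_feature(t) for t in tokens))
    return ".".join(parts + [labels[-1]])


if __name__ == "__main__":
    sample = ["abcde.example.com", "fghij.example.com", "abc.example.com"]
    for key, members in aggregate(sample).items():
        print(key, members)
    print(grammar_feature("12345-update.xqzjkw.net"))
```

Domain names that share a grammar feature could then be grouped, for example with a dictionary keyed on `grammar_feature(domain)`, before the time sequence conditions of claims 5 and 6 are checked against per-device access records.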
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111676182.0A CN114189390B (en) | 2021-12-31 | 2021-12-31 | Domain name detection method, system, equipment and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111676182.0A CN114189390B (en) | 2021-12-31 | 2021-12-31 | Domain name detection method, system, equipment and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114189390A (en) | 2022-03-15 |
CN114189390B (en) | 2024-07-09 |
Family
ID=80606597
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111676182.0A Active CN114189390B (en) | 2021-12-31 | 2021-12-31 | Domain name detection method, system, equipment and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114189390B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115051861A (en) * | 2022-06-17 | 2022-09-13 | 北京天融信网络安全技术有限公司 | Domain name detection method, device, system and medium |
US20230362179A1 (en) * | 2022-05-09 | 2023-11-09 | IronNet Cybersecurity, Inc. | Automatic identification of algorithmically generated domain families |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101692639A (en) * | 2009-09-15 | 2010-04-07 | 西安交通大学 | Bad webpage recognition method based on URL |
US8578493B1 (en) * | 2011-05-10 | 2013-11-05 | Narus, Inc. | Botnet beacon detection |
CN111353109A (en) * | 2020-03-04 | 2020-06-30 | 深信服科技股份有限公司 | Malicious domain name identification method and system |
CN112839012A (en) * | 2019-11-22 | 2021-05-25 | 中国移动通信有限公司研究院 | Zombie program domain name identification method, device, equipment and storage medium |
WO2021136315A1 (en) * | 2019-12-31 | 2021-07-08 | 论客科技(广州)有限公司 | Mail classification method and apparatus based on conjoint analysis of behavior structures and semantic content |
CN113315851A (en) * | 2021-04-23 | 2021-08-27 | 北京奇虎科技有限公司 | Domain name detection method, device and storage medium |
WO2021169730A1 (en) * | 2020-02-25 | 2021-09-02 | 深信服科技股份有限公司 | Method and device for data processing, and storage medium |
2021-12-31: CN application CN202111676182.0A, patent CN114189390B (en), status Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101692639A (en) * | 2009-09-15 | 2010-04-07 | 西安交通大学 | Bad webpage recognition method based on URL |
US8578493B1 (en) * | 2011-05-10 | 2013-11-05 | Narus, Inc. | Botnet beacon detection |
CN112839012A (en) * | 2019-11-22 | 2021-05-25 | 中国移动通信有限公司研究院 | Zombie program domain name identification method, device, equipment and storage medium |
WO2021136315A1 (en) * | 2019-12-31 | 2021-07-08 | 论客科技(广州)有限公司 | Mail classification method and apparatus based on conjoint analysis of behavior structures and semantic content |
WO2021169730A1 (en) * | 2020-02-25 | 2021-09-02 | 深信服科技股份有限公司 | Method and device for data processing, and storage medium |
CN113381962A (en) * | 2020-02-25 | 2021-09-10 | 深信服科技股份有限公司 | Data processing method, device and storage medium |
CN111353109A (en) * | 2020-03-04 | 2020-06-30 | 深信服科技股份有限公司 | Malicious domain name identification method and system |
CN113315851A (en) * | 2021-04-23 | 2021-08-27 | 北京奇虎科技有限公司 | Domain name detection method, device and storage medium |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230362179A1 (en) * | 2022-05-09 | 2023-11-09 | IronNet Cybersecurity, Inc. | Automatic identification of algorithmically generated domain families |
CN115051861A (en) * | 2022-06-17 | 2022-09-13 | 北京天融信网络安全技术有限公司 | Domain name detection method, device, system and medium |
CN115051861B (en) * | 2022-06-17 | 2024-01-23 | 北京天融信网络安全技术有限公司 | Domain name detection method, device, system and medium |
Also Published As
Publication number | Publication date |
---|---|
CN114189390B (en) | 2024-07-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhao et al. | Malicious Domain Names Detection Algorithm Based on N‐Gram | |
EP3905624B1 (en) | Botnet domain name family detecting method, device, and storage medium | |
CN114189390A (en) | Domain name detection method, system, equipment and computer readable storage medium | |
CN112769775B (en) | Threat information association analysis method, system, equipment and computer medium | |
CN109379390B (en) | Network security baseline generation method based on full flow | |
CN114363062B (en) | Domain name detection method, system, equipment and computer readable storage medium | |
CN112272186A (en) | Network flow detection framework, method, electronic equipment and storage medium | |
CN113179260B (en) | Botnet detection method, device, equipment and medium | |
CN112818307A (en) | User operation processing method, system, device and computer readable storage medium | |
CN108156127B (en) | Network attack mode judging device, judging method and computer readable storage medium thereof | |
CN112583827B (en) | Data leakage detection method and device | |
US20240080330A1 (en) | Security monitoring apparatus, security monitoring method, and computer readable medium | |
WO2024068238A1 (en) | Malicious domain name detection | |
CN113315739A (en) | Malicious domain name detection method and system | |
WO2016173327A1 (en) | Method and device for detecting website attack | |
CN108650274B (en) | Network intrusion detection method and system | |
CN113329035B (en) | Method and device for detecting attack domain name, electronic equipment and storage medium | |
CN115001724B (en) | Network threat intelligence management method, device, computing equipment and computer readable storage medium | |
TWI777766B (en) | System and method of malicious domain query behavior detection | |
CN114363060A (en) | Domain name detection method, system, equipment and computer readable storage medium | |
CN115442109A (en) | Method, device, equipment and storage medium for determining network attack result | |
CN114900375A (en) | Malicious threat detection method based on AI graph analysis | |
CN115296849A (en) | Associated alarm method and system, storage medium and electronic equipment | |
CN111371917B (en) | Domain name detection method and system | |
Upadhyay et al. | Feature extraction approach to unearth domain generating algorithms (DGAS) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||