CN110798481A - Malicious domain name detection method and device based on deep learning - Google Patents

Malicious domain name detection method and device based on deep learning

Info

Publication number
CN110798481A
Authority
CN
China
Prior art keywords
domain name
detected
deep learning
information
learning model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911084930.9A
Other languages
Chinese (zh)
Inventor
仝哲
范渊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dbappsecurity Technology Co Ltd
Original Assignee
Hangzhou Dbappsecurity Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dbappsecurity Technology Co Ltd
Priority to CN201911084930.9A
Publication of CN110798481A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a malicious domain name detection method and device based on deep learning, relating to the technical field of network security. The method comprises: acquiring a domain name to be detected; analyzing the domain name to be detected to obtain message information of the domain name to be detected; processing the message information of the domain name to be detected based on a natural language processing algorithm and a text feature extraction algorithm to obtain feature information of the domain name to be detected; and inputting the feature information into a deep learning model to obtain a detection result, wherein the detection result indicates whether the domain name to be detected is a malicious domain name, and the deep learning model is a learning model constructed based on a convolutional neural network and a fully connected layer. The invention thereby addresses the technical problem that existing domain name detection methods detect malicious domain names with low accuracy.

Description

Malicious domain name detection method and device based on deep learning
Technical Field
The invention relates to the technical field of network security, in particular to a malicious domain name detection method and device based on deep learning.
Background
With the development of the internet, thousands of domain names are registered every day, and detecting malicious domain names among this massive number of domain names has become an important part of network attack detection and defense. However, the detection techniques commonly used at present rely mainly on regular expressions and whitelists, and suffer from a high false alarm rate.
No effective solution to the above problem has yet been proposed.
Disclosure of Invention
In view of this, the present invention provides a malicious domain name detection method and apparatus based on deep learning, so as to alleviate the technical problem that the accuracy rate of detecting whether a domain name to be detected is a malicious domain name is low in the existing domain name detection method.
In a first aspect, an embodiment of the present invention provides a malicious domain name detection method based on deep learning, including: acquiring a domain name to be detected; analyzing the domain name to be detected to obtain message information of the domain name to be detected; processing the message information of the domain name to be detected based on a natural language processing algorithm and a text feature extraction algorithm to obtain feature information of the domain name to be detected; and inputting the feature information into a deep learning model to obtain a detection result, wherein the detection result indicates whether the domain name to be detected is a malicious domain name, and the deep learning model is a learning model constructed based on a convolutional neural network and a fully connected layer.
Further, processing the message information of the domain name to be detected based on a natural language processing algorithm and a text feature extraction algorithm to obtain feature information of the domain name to be detected, including: segmenting the domain name to be detected to obtain a triple of the domain name to be detected; processing the triples based on the natural language processing algorithm to obtain target triples; and processing the target triple based on a text feature extraction algorithm to obtain the feature information of the domain name to be detected.
Further, the feature information includes: domain name lexical characteristic information and domain name network characteristic information; based on a text feature extraction algorithm, processing the target triple to obtain feature information of the domain name to be detected, wherein the method comprises the following steps: processing the target triple based on a domain name lexical feature extraction algorithm to obtain domain name lexical feature information; and processing the target triple based on a domain name network feature extraction algorithm to obtain domain name network feature information.
Further, the message information includes: DNS query message information and response message information.
Further, the method further comprises constructing the deep learning model by: obtaining a plurality of sample domain names, wherein the sample domain names comprise legitimate domain names and malicious domain names; analyzing each sample domain name to obtain message information of each sample domain name; processing the message information of each sample domain name based on a natural language processing algorithm and a text feature extraction algorithm to obtain feature information of each sample domain name; and inputting the feature information of the plurality of sample domain names into an initial deep learning model, and training the initial deep learning model to obtain the deep learning model.
In a second aspect, an embodiment of the present invention further provides a malicious domain name detection apparatus based on deep learning, including: an acquisition unit, an analysis unit, an extraction unit and a detection unit, wherein the acquisition unit is used for acquiring a domain name to be detected; the analysis unit is used for analyzing the domain name to be detected to obtain message information of the domain name to be detected; the extraction unit is used for processing the message information of the domain name to be detected based on a natural language processing algorithm and a text feature extraction algorithm to obtain the feature information of the domain name to be detected; and the detection unit is used for inputting the feature information into a deep learning model to obtain a detection result, wherein the detection result indicates whether the domain name to be detected is a malicious domain name, and the deep learning model is a learning model constructed based on a convolutional neural network and a fully connected layer.
Further, the extraction unit is further configured to: segmenting the domain name to be detected to obtain a triple of the domain name to be detected; processing the triples based on the natural language processing algorithm to obtain target triples; and processing the target triple based on a text feature extraction algorithm to obtain the feature information of the domain name to be detected.
Further, the feature information includes: domain name lexical characteristic information and domain name network characteristic information; the extraction unit is also used for processing the target triple based on a domain name lexical feature extraction algorithm to obtain domain name lexical feature information; and processing the target triple based on a domain name network feature extraction algorithm to obtain domain name network feature information.
Further, the message information includes: DNS query message information and response message information.
Further, the apparatus further comprises a training unit configured to: obtain a plurality of sample domain names, wherein the sample domain names comprise legitimate domain names and malicious domain names; analyze each sample domain name to obtain message information of each sample domain name; process the message information of each sample domain name based on a natural language processing algorithm and a text feature extraction algorithm to obtain feature information of each sample domain name; and input the feature information of the plurality of sample domain names into an initial deep learning model and train the initial deep learning model to obtain the deep learning model.
In the embodiment of the invention, a domain name to be detected is first acquired; the domain name to be detected is then analyzed to obtain its message information; the message information is then processed based on a natural language processing algorithm and a text feature extraction algorithm to obtain feature information of the domain name to be detected; finally, the feature information is input into a learning model constructed based on a convolutional neural network and a fully connected layer to obtain a detection result. Because a learning model constructed in this way has a higher accuracy for domain name detection, detecting the domain name to be detected through it improves the accuracy of detecting whether the domain name is a malicious domain name, which alleviates the technical problem that existing domain name detection methods detect malicious domain names with low accuracy.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a flowchart of a malicious domain name detection method based on deep learning according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a deep learning model training method according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a malicious domain name detection apparatus based on deep learning according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a server according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one:
According to an embodiment of the present invention, an embodiment of a malicious domain name detection method based on deep learning is provided. It should be noted that the steps illustrated in the flowchart of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be executed in a different order.
Fig. 1 is a flowchart of a malicious domain name detection method based on deep learning according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S102, acquiring a domain name to be detected;
Step S104, analyzing the domain name to be detected to obtain message information of the domain name to be detected;
Step S106, processing the message information of the domain name to be detected based on a natural language processing algorithm and a text feature extraction algorithm to obtain feature information of the domain name to be detected;
Step S108, inputting the feature information into a deep learning model to obtain a detection result, wherein the detection result indicates whether the domain name to be detected is a malicious domain name, and the deep learning model is a learning model constructed based on a convolutional neural network and a fully connected layer.
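The source does not give the layer sizes, kernel width, or how the feature groups are combined in step S108. The following is therefore only a minimal sketch of a forward pass through one convolutional layer followed by one fully connected layer, written in plain Python. The names `conv1d`, `global_max_pool`, `dense_sigmoid`, `predict` and the `params` dictionary are hypothetical, and the ReLU activation, the global max pooling, the sigmoid output and the concatenation of pooled values with extra features are all assumptions.

```python
import math

def conv1d(seq, kernels, biases):
    """Valid 1-D convolution with ReLU.

    seq: T x D list of vectors; kernels: F x K x D; returns (T-K+1) x F.
    """
    k = len(kernels[0])
    if len(seq) < k:
        raise ValueError("sequence is shorter than the kernel width")
    out = []
    for t in range(len(seq) - k + 1):
        row = []
        for kern, b in zip(kernels, biases):
            s = b
            for j in range(k):
                s += sum(w * x for w, x in zip(kern[j], seq[t + j]))
            row.append(max(0.0, s))  # ReLU (assumed activation)
        out.append(row)
    return out

def global_max_pool(fmap):
    """Keep the largest activation of each filter across all positions."""
    return [max(col) for col in zip(*fmap)]

def dense_sigmoid(x, weights, bias):
    """Single-output fully connected layer followed by a sigmoid."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def predict(seq, extra_features, params):
    """Score in (0, 1); which class the score denotes is a labeling choice."""
    pooled = global_max_pool(conv1d(seq, params["kernels"], params["conv_bias"]))
    # Assumption: pooled convolution outputs and hand-built features are concatenated.
    return dense_sigmoid(pooled + list(extra_features),
                         params["fc_weights"], params["fc_bias"])
```

Here `seq` stands for the vectorized domain name and `extra_features` for any additional numeric features. In practice the parameters would be learned during training rather than set by hand.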
In the embodiment of the invention, a domain name to be detected is first acquired; the domain name to be detected is then analyzed to obtain its message information; the message information is then processed based on a natural language processing algorithm and a text feature extraction algorithm to obtain feature information of the domain name to be detected; finally, the feature information is input into a learning model constructed based on a convolutional neural network and a fully connected layer to obtain a detection result. Because a learning model constructed in this way has a higher accuracy for domain name detection, detecting the domain name to be detected through it improves the accuracy of detecting whether the domain name is a malicious domain name, which alleviates the technical problem that existing domain name detection methods detect malicious domain names with low accuracy.
It should be noted that the message information includes: DNS query message information and response message information.
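The source does not say how the DNS messages are produced or captured. As an illustration of what a DNS query message contains, the sketch below builds one in the standard DNS wire format (a 12-byte header followed by one question) and reads the header back. The function names `encode_qname`, `build_query` and `parse_header` are hypothetical, and the query ID and record type are arbitrary example values.

```python
import struct

def encode_qname(domain: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in domain.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"invalid label length in {domain!r}")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_query(domain: str, query_id: int = 0x1234, qtype: int = 1) -> bytes:
    """Build a DNS query message: header plus a single question (class IN)."""
    flags = 0x0100  # standard query, recursion desired
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_qname(domain) + struct.pack("!HH", qtype, 1)
    return header + question

def parse_header(message: bytes) -> dict:
    """Read the fixed 12-byte header shared by query and response messages."""
    ident, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", message[:12])
    return {
        "id": ident,
        "is_response": bool(flags & 0x8000),  # QR bit
        "rcode": flags & 0x000F,              # response code
        "questions": qd,
        "answers": an,
    }
```

A response message uses the same header layout, so `parse_header` can be applied to it to read the answer count and response code before the individual records are decoded.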
In this embodiment of the present invention, step S106 further includes the following steps:
Step S11, segmenting the domain name to be detected to obtain a triple of the domain name to be detected;
Step S12, processing the triples based on the natural language processing algorithm to obtain target triples;
Step S13, processing the target triple based on a text feature extraction algorithm to obtain the feature information of the domain name to be detected.
In the embodiment of the present invention, it should be noted that the characteristic information includes: domain name lexical characteristic information and domain name network characteristic information.
After the domain name to be detected is obtained, it is segmented into triples, that is, overlapping three-character sequences.
For example, "google.com" may be converted to <'goo', 'oog', 'ogl', 'gle', 'le.', 'e.c', '.co', 'com'>, and the triples are then vectorized using a word embedding algorithm from natural language processing.
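The segmentation and vectorization described above can be sketched as follows. The source does not name a specific word embedding algorithm, so the randomly initialized lookup table here is only a stand-in for a learned embedding, and the function names (`char_trigrams`, `build_vocab`, `init_table`, `embed`) and the embedding dimension are hypothetical.

```python
import random

def char_trigrams(domain: str) -> list[str]:
    """Slide a three-character window over the domain name."""
    return [domain[i:i + 3] for i in range(len(domain) - 2)]

def build_vocab(domains: list[str]) -> dict[str, int]:
    """Give every triple seen an integer index; index 0 is kept for unknown triples."""
    vocab: dict[str, int] = {}
    for domain in domains:
        for tri in char_trigrams(domain):
            vocab.setdefault(tri, len(vocab) + 1)
    return vocab

def init_table(vocab_size: int, dim: int = 4, seed: int = 0) -> list[list[float]]:
    """Random embedding table (stand-in for vectors learned during training)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(vocab_size + 1)]

def embed(domain: str, vocab: dict[str, int], table: list[list[float]]) -> list[list[float]]:
    """Map each triple of the domain name to its embedding vector."""
    return [table[vocab.get(tri, 0)] for tri in char_trigrams(domain)]
```

For "google.com", `char_trigrams` yields the eight triples 'goo', 'oog', 'ogl', 'gle', 'le.', 'e.c', '.co', 'com', and `embed` returns one vector per triple.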
Then, features are extracted from the DNS query message and the response message obtained by analysis, using a text feature extraction technique: a domain name lexical feature algorithm and a domain name network attribute algorithm are constructed to extract the feature information of the domain name to be detected. The domain name lexical feature information includes: the length of the domain name to be detected, the number of separators in it, the proportion of digits to its total length, the number of special characters in it, the maximum length between separators, and the like. The domain name network feature information includes: the average TTL (Time To Live), the response type, the number of response values, and the like.
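A minimal sketch of how the listed lexical and network features might be computed is given below. Several details are assumptions not stated in the source: the separator is taken to be ".", a special character is taken to be anything that is neither alphanumeric nor a separator, the "maximum length among the separators" is read as the length of the longest label, and the response records are assumed to be already parsed into dictionaries with hypothetical `type` and `ttl` keys.

```python
from statistics import mean

def lexical_features(domain: str) -> dict:
    """Lexical features computed from the domain name string alone."""
    labels = domain.split(".")
    return {
        "length": len(domain),
        "separator_count": domain.count("."),
        "digit_ratio": sum(c.isdigit() for c in domain) / len(domain) if domain else 0.0,
        # Assumption: "special" means not alphanumeric and not the separator.
        "special_char_count": sum(not (c.isalnum() or c == ".") for c in domain),
        # Assumption: longest run of characters between separators.
        "max_label_length": max(len(label) for label in labels),
    }

def network_features(answers: list[dict]) -> dict:
    """Network features computed from already-parsed DNS response records."""
    ttls = [a["ttl"] for a in answers]
    return {
        "ttl_mean": mean(ttls) if ttls else 0.0,
        "response_types": sorted({a["type"] for a in answers}),
        "response_count": len(answers),
    }
```

For example, `lexical_features("a1-b2.example.com")` counts two separators, two digits out of 17 characters, one special character (the hyphen), and a longest label of 7 characters.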
By extracting multiple kinds of feature information of the domain name to be detected, processing the domain name to be detected with natural language processing techniques, and using a deep learning model for detection, the accuracy of detection is improved, and the method is highly practical.
In the embodiment of the present invention, as shown in fig. 2, the deep learning model is constructed by the following steps:
Step S202, obtaining a plurality of sample domain names, wherein the sample domain names comprise legitimate domain names and malicious domain names;
Step S204, analyzing each sample domain name to obtain message information of each sample domain name;
Step S206, processing the message information of each sample domain name based on a natural language processing algorithm and a text feature extraction algorithm to obtain the feature information of each sample domain name;
Step S208, inputting the feature information of the plurality of sample domain names into an initial deep learning model, and training the initial deep learning model to obtain the deep learning model.
In the embodiment of the invention, a sufficient number of legitimate domain names and malicious domain names are obtained through open source channels and screened. The legitimate domain names and the malicious domain names together form the sample domain names, with legitimate domain names used as positive samples and malicious domain names used as negative samples.
In addition, the plurality of sample domain names can be divided into two parts, namely training samples and test samples. After the initial deep learning model has been trained on the training samples, the detection accuracy of the deep learning model is evaluated on the test samples. If the accuracy is low, training samples are obtained again to train the deep learning model until the accuracy meets the expected target, thereby improving the accuracy of detecting whether the domain name to be detected is a malicious domain name.
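The split, evaluate and retrain cycle can be sketched as follows. The source does not give a split ratio, an accuracy target, or a rule for obtaining training samples again, so the 20% test ratio, the 0.95 target, the re-split with a new seed on each round, and the round limit are all assumptions. The names `split_samples`, `accuracy`, `train_until` and the `train_fn` callback are hypothetical.

```python
import random

def split_samples(samples, test_ratio=0.2, seed=0):
    """Shuffle (feature, label) pairs and split them into train and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(model, test_samples):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(model(x) == y for x, y in test_samples)
    return correct / len(test_samples)

def train_until(train_fn, samples, target=0.95, max_rounds=5):
    """Retrain on a fresh split until test accuracy reaches the target.

    train_fn takes the training samples and returns a callable model.
    max_rounds bounds the loop so that it always terminates.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    for round_no in range(max_rounds):
        train, test = split_samples(samples, seed=round_no)
        model = train_fn(train)
        acc = accuracy(model, test)
        if acc >= target:
            break
    return model, acc
```

Bounding the number of rounds is a design choice made for this sketch. The source only says that training continues until the accuracy meets the expected target.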
After the plurality of sample domain names are obtained, each sample domain name is analyzed to obtain its message information.
By acquiring a large number of sample domain names and analyzing each sample domain name, DNS (Domain Name System) query message information and response message information of each sample domain name are obtained.
The initial deep learning model is trained by utilizing massive DNS query message information and response message information, so that the detection accuracy of the initial deep learning model can be effectively improved, and the accuracy of detecting whether the domain name to be detected is a malicious domain name is improved.
Finally, the feature information of the plurality of sample domain names is input into the initial deep learning model, and the initial deep learning model is trained to obtain the deep learning model.
Example two:
The invention further provides an embodiment of a malicious domain name detection device based on deep learning, which is used for executing the malicious domain name detection method based on deep learning provided by the embodiment of the invention.
As shown in fig. 3, the malicious domain name detection apparatus based on deep learning includes: an acquisition unit 10, an analysis unit 20, an extraction unit 30 and a detection unit 40.
The acquisition unit 10 is configured to acquire a domain name to be detected;
The analysis unit 20 is configured to analyze the domain name to be detected to obtain message information of the domain name to be detected;
The extraction unit 30 is configured to process the message information of the domain name to be detected based on a natural language processing algorithm and a text feature extraction algorithm, so as to obtain feature information of the domain name to be detected;
The detection unit 40 is configured to input the feature information into a deep learning model to obtain a detection result, where the detection result indicates whether the domain name to be detected is a malicious domain name, and the deep learning model is a learning model constructed based on a convolutional neural network and a fully connected layer.
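One way to picture how the four units cooperate is a small container that calls them in order. This is only an illustrative sketch: the class name `DetectionDevice`, the method `run`, and the callable signatures are hypothetical, and the source does not prescribe any particular software structure for the units.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class DetectionDevice:
    acquire: Callable[[], str]          # acquisition unit 10
    analyze: Callable[[str], Any]       # analysis unit 20
    extract: Callable[[str, Any], Any]  # extraction unit 30
    detect: Callable[[Any], bool]       # detection unit 40

    def run(self) -> tuple[str, bool]:
        """Pass one domain name through the four units in order."""
        domain = self.acquire()
        messages = self.analyze(domain)
        features = self.extract(domain, messages)
        return domain, self.detect(features)
```

Each unit is passed in as a plain callable, so any of them can be replaced, for example by a stub during testing, without changing the others.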
In the embodiment of the invention, a domain name to be detected is first acquired; the domain name to be detected is then analyzed to obtain its message information; the message information is then processed based on a natural language processing algorithm and a text feature extraction algorithm to obtain feature information of the domain name to be detected; finally, the feature information is input into a learning model constructed based on a convolutional neural network and a fully connected layer to obtain a detection result. Because a learning model constructed in this way has a higher accuracy for domain name detection, detecting the domain name to be detected through it improves the accuracy of detecting whether the domain name is a malicious domain name, which alleviates the technical problem that existing domain name detection methods detect malicious domain names with low accuracy.
Preferably, the extraction unit is further configured to: segmenting the domain name to be detected to obtain a triple of the domain name to be detected; processing the triples based on the natural language processing algorithm to obtain target triples; and processing the target triple based on a text feature extraction algorithm to obtain the feature information of the domain name to be detected.
Preferably, the feature information includes: domain name lexical characteristic information and domain name network characteristic information; the extraction unit is also used for processing the target triple based on a domain name lexical feature extraction algorithm to obtain domain name lexical feature information; and processing the target triple based on a domain name network feature extraction algorithm to obtain domain name network feature information.
Preferably, the message information includes: DNS query message information and response message information.
Preferably, the apparatus further comprises a training unit configured to: obtain a plurality of sample domain names, wherein the sample domain names comprise legitimate domain names and malicious domain names; analyze each sample domain name to obtain message information of each sample domain name; process the message information of each sample domain name based on a natural language processing algorithm and a text feature extraction algorithm to obtain feature information of each sample domain name; and input the feature information of the plurality of sample domain names into an initial deep learning model and train the initial deep learning model to obtain the deep learning model.
Referring to Fig. 4, an embodiment of the present invention further provides a server 100, including: a processor 50, a memory 51, a bus 52 and a communication interface 53, wherein the processor 50, the communication interface 53 and the memory 51 are connected through the bus 52; the processor 50 is configured to execute executable modules, such as computer programs, stored in the memory 51.
The memory 51 may include a high-speed Random Access Memory (RAM) and may also include a non-volatile memory, such as at least one disk memory. The communication connection between the network element of the system and at least one other network element is realized through at least one communication interface 53 (which may be wired or wireless), and the internet, a wide area network, a local network, a metropolitan area network, and the like can be used.
The bus 52 may be an ISA bus, PCI bus, EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 4, but that does not indicate only one bus or one type of bus.
The memory 51 is used for storing a program, the processor 50 executes the program after receiving an execution instruction, and the method executed by the apparatus defined by the flow process disclosed in any of the foregoing embodiments of the present invention may be applied to the processor 50, or implemented by the processor 50.
The processor 50 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 50 or by instructions in the form of software. The processor 50 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the various methods, steps and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present invention may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 51, and the processor 50 reads the information in the memory 51 and completes the steps of the method in combination with its hardware.
In addition, in the description of the embodiments of the present invention, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc., indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical division, and other divisions are possible in an actual implementation; a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some communication interfaces, and may be electrical, mechanical, or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that, within the technical scope disclosed herein, anyone skilled in the art may still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features. Such modifications, changes, or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present invention, and they should be construed as falling within its protection scope. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A malicious domain name detection method based on deep learning, characterized by comprising the following steps:
acquiring a domain name to be detected;
analyzing the domain name to be detected to obtain message information of the domain name to be detected;
processing the message information of the domain name to be detected based on a natural language processing algorithm and a text feature extraction algorithm to obtain feature information of the domain name to be detected;
and inputting the feature information into a deep learning model to obtain a detection result, wherein the detection result indicates whether the domain name to be detected is a malicious domain name, and the deep learning model is a learning model constructed based on a convolutional neural network and a fully connected layer.
2. The method according to claim 1, wherein processing the message information of the domain name to be detected based on a natural language processing algorithm and a text feature extraction algorithm to obtain the feature information of the domain name to be detected comprises:
segmenting the domain name to be detected to obtain a triple of the domain name to be detected;
processing the triple based on the natural language processing algorithm to obtain a target triple;
and processing the target triple based on a text feature extraction algorithm to obtain the feature information of the domain name to be detected.
3. The method of claim 2, wherein the feature information comprises: domain name lexical feature information and domain name network feature information;
and wherein processing the target triple based on a text feature extraction algorithm to obtain the feature information of the domain name to be detected comprises:
processing the target triple based on a domain name lexical feature extraction algorithm to obtain domain name lexical feature information;
and processing the target triple based on a domain name network feature extraction algorithm to obtain domain name network feature information.
4. The method of claim 1, wherein the message information comprises: DNS query message information and response message information.
5. The method of claim 4, further comprising constructing the deep learning model by:
obtaining a plurality of sample domain names, wherein the sample domain names comprise legitimate domain names and malicious domain names;
analyzing each sample domain name to obtain message information of each sample domain name;
processing the message information of each sample domain name based on a natural language processing algorithm and a text feature extraction algorithm to obtain feature information of each sample domain name;
and inputting the feature information of the plurality of sample domain names into an initial deep learning model, and training the initial deep learning model to obtain the deep learning model.
6. A malicious domain name detection device based on deep learning is characterized by comprising: an acquisition unit, an analysis unit, an extraction unit and a detection unit, wherein,
the acquisition unit is used for acquiring a domain name to be detected;
the analysis unit is used for analyzing the domain name to be detected to obtain message information of the domain name to be detected;
the extraction unit is used for processing the message information of the domain name to be detected based on a natural language processing algorithm and a text feature extraction algorithm to obtain the feature information of the domain name to be detected;
and the detection unit is used for inputting the feature information into a deep learning model to obtain a detection result, wherein the detection result indicates whether the domain name to be detected is a malicious domain name, and the deep learning model is a learning model constructed based on a convolutional neural network and a fully connected layer.
7. The apparatus of claim 6, wherein the extraction unit is further configured to:
segment the domain name to be detected to obtain a triple of the domain name to be detected;
process the triple based on the natural language processing algorithm to obtain a target triple;
and process the target triple based on a text feature extraction algorithm to obtain the feature information of the domain name to be detected.
8. The apparatus of claim 7, wherein the feature information comprises: domain name lexical feature information and domain name network feature information;
the extraction unit is further configured to process the target triple based on a domain name lexical feature extraction algorithm to obtain the domain name lexical feature information;
and to process the target triple based on a domain name network feature extraction algorithm to obtain the domain name network feature information.
9. The apparatus of claim 6, wherein the message information comprises: DNS query message information and response message information.
10. The apparatus of claim 6, further comprising a training unit configured to:
obtain a plurality of sample domain names, wherein the sample domain names comprise legitimate domain names and malicious domain names;
analyze each sample domain name to obtain message information of each sample domain name;
process the message information of each sample domain name based on a natural language processing algorithm and a text feature extraction algorithm to obtain feature information of each sample domain name;
and input the feature information of the plurality of sample domain names into an initial deep learning model, and train the initial deep learning model to obtain the deep learning model.
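
The detection flow recited in the claims (segment the domain name into a triple, extract lexical and network features, then score them with a convolutional layer followed by a fully connected layer) might be sketched roughly as below. This is only an illustrative sketch, not the applicant's implementation. Several points are assumptions that the claims do not confirm: the triple is taken to be (subdomain, second-level domain, top-level domain); the natural language processing step is reduced to lowercase normalization; the DNS response is a hypothetical, already-parsed dictionary with an `answers` list of records carrying a `ttl` field; the chosen features are examples only; and the weights are untrained placeholders, so the score has no diagnostic value. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def split_domain(domain: str) -> Tuple[str, str, str]:
    """Split a domain into an assumed (subdomain, second-level, TLD) triple.

    Assumption: the TLD is the last label; multi-label public suffixes
    such as "co.uk" are not handled in this sketch.
    """
    labels = domain.lower().strip(".").split(".")
    if len(labels) < 2:
        return ("", labels[0], "")
    return (".".join(labels[:-2]), labels[-2], labels[-1])


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits (0.0 for empty text)."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(triple: Tuple[str, str, str]) -> List[float]:
    """Example lexical features; the selection is an assumption."""
    sub, sld, _tld = triple
    name = sub.replace(".", "") + sld
    digits = sum(ch.isdigit() for ch in name)
    return [
        float(len(sld)),                          # second-level label length
        float(len(sub.split(".")) if sub else 0),  # number of subdomain labels
        digits / len(name) if name else 0.0,      # share of digit characters
        shannon_entropy(sld),                     # randomness of the label
    ]


def network_features(response: Dict) -> List[float]:
    """Example network features from a hypothetical parsed DNS response."""
    answers = response.get("answers", [])
    ttls = [record["ttl"] for record in answers]
    return [float(len(answers)), float(min(ttls)) if ttls else 0.0]


def conv1d_relu(x: List[float], kernel: List[float]) -> List[float]:
    """Valid 1-D convolution (cross-correlation form) followed by ReLU."""
    k = len(kernel)
    return [
        max(0.0, sum(x[i + j] * kernel[j] for j in range(k)))
        for i in range(len(x) - k + 1)
    ]


def dense_sigmoid(x: List[float], weights: List[float], bias: float) -> float:
    """Single fully connected output unit with a numerically stable sigmoid."""
    z = sum(xi * wi for xi, wi in zip(x, weights)) + bias
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def detect_score(domain: str, response: Dict) -> float:
    """Score in [0, 1]; weights are untrained placeholders, not a real model."""
    features = lexical_features(split_domain(domain)) + network_features(response)
    hidden = conv1d_relu(features, kernel=[0.1, 0.1, 0.1])
    return dense_sigmoid(hidden, weights=[0.01] * len(hidden), bias=-1.0)


if __name__ == "__main__":
    sample_response = {"answers": [{"ttl": 300}, {"ttl": 60}]}
    print(split_domain("www.example.com"))
    print(detect_score("www.example.com", sample_response))
```

In a real system the placeholder kernel and weights would be replaced by parameters learned from labeled legitimate and malicious sample domain names, as the training steps of claims 5 and 10 describe, typically with a deep learning framework rather than hand-written layers.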

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911084930.9A CN110798481A (en) 2019-11-08 2019-11-08 Malicious domain name detection method and device based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911084930.9A CN110798481A (en) 2019-11-08 2019-11-08 Malicious domain name detection method and device based on deep learning

Publications (1)

Publication Number Publication Date
CN110798481A true CN110798481A (en) 2020-02-14

Family

ID=69443524

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911084930.9A Pending CN110798481A (en) 2019-11-08 2019-11-08 Malicious domain name detection method and device based on deep learning

Country Status (1)

Country Link
CN (1) CN110798481A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111698260A (en) * 2020-06-23 2020-09-22 上海观安信息技术股份有限公司 DNS hijacking detection method and system based on message analysis
CN112929390A (en) * 2021-03-12 2021-06-08 厦门帝恩思科技股份有限公司 Network intelligent monitoring method based on multi-strategy fusion
CN115567289A (en) * 2022-09-23 2023-01-03 清华大学 Malicious domain name detection method and system based on federal graph model under encrypted DNS protocol

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107682348A (en) * 2017-10-19 2018-02-09 杭州安恒信息技术有限公司 DGA domain name Quick method and devices based on machine learning
CN108200054A (en) * 2017-12-29 2018-06-22 北京奇安信科技有限公司 A kind of malice domain name detection method and device based on dns resolution
US20180288086A1 (en) * 2017-04-03 2018-10-04 Royal Bank Of Canada Systems and methods for cyberbot network detection
CN109150873A (en) * 2018-08-16 2019-01-04 武汉虹旭信息技术有限责任公司 Malice domain name detection system and method based on PSO_SVM optimization algorithm
CN109450845A (en) * 2018-09-18 2019-03-08 浙江大学 A kind of algorithm generation malice domain name detection method based on deep neural network
CN109714356A (en) * 2019-01-08 2019-05-03 北京奇艺世纪科技有限公司 A kind of recognition methods of abnormal domain name, device and electronic equipment
CN109756510A (en) * 2019-01-25 2019-05-14 兰州理工大学 A kind of malice domain name detection method based on N-Gram
CN109788079A (en) * 2017-11-15 2019-05-21 瀚思安信(北京)软件技术有限公司 DGA domain name real-time detection method and device
CN109951472A (en) * 2019-03-13 2019-06-28 武汉智美互联科技有限公司 A kind of DGA domain name detection method based on CNN deep learning
CN110381089A (en) * 2019-08-23 2019-10-25 南京邮电大学 Means of defence is detected to malice domain name based on deep learning

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111698260A (en) * 2020-06-23 2020-09-22 上海观安信息技术股份有限公司 DNS hijacking detection method and system based on message analysis
CN111698260B (en) * 2020-06-23 2022-10-11 上海观安信息技术股份有限公司 DNS hijacking detection method and system based on message analysis
CN112929390A (en) * 2021-03-12 2021-06-08 厦门帝恩思科技股份有限公司 Network intelligent monitoring method based on multi-strategy fusion
CN115567289A (en) * 2022-09-23 2023-01-03 清华大学 Malicious domain name detection method and system based on federal graph model under encrypted DNS protocol

Similar Documents

Publication Publication Date Title
CN110275958B (en) Website information identification method and device and electronic equipment
CN108200054B (en) Malicious domain name detection method and device based on DNS (Domain name Server) resolution
CN108989150B (en) Login abnormity detection method and device
CN107204960B (en) Webpage identification method and device and server
CN110798481A (en) Malicious domain name detection method and device based on deep learning
US9210189B2 (en) Method, system and client terminal for detection of phishing websites
CN111818198B (en) Domain name detection method, domain name detection device, equipment and medium
CN111368289B (en) Malicious software detection method and device
CN109039875B (en) Phishing mail detection method and system based on link characteristic analysis
CN114143049B (en) Abnormal flow detection method and device, storage medium and electronic equipment
CN110866259A (en) Method and system for calculating potential safety hazard score based on multi-dimensional data
WO2020082763A1 (en) Decision trees-based method and apparatus for detecting phishing website, and computer device
CN113535823B (en) Abnormal access behavior detection method and device and electronic equipment
CN112929370B (en) Domain name system hidden channel detection method and device
CN110866831A (en) Asset activity level determination method and device and server
CN111651658A (en) Method and computer equipment for automatically identifying website based on deep learning
CN110598115A (en) Sensitive webpage identification method and system based on artificial intelligence multi-engine
CN115955457A (en) Malicious domain name detection method and device and electronic equipment
CN115643044A (en) Data processing method, device, server and storage medium
US20150139539A1 (en) Apparatus and method for detecting forgery/falsification of homepage
CN115688107A (en) Fraud-related APP detection system and method
CN115801309A (en) Big data-based computer terminal access security verification method and system
EP3361405A1 (en) Enhancement of intrusion detection systems
CN113114679A (en) Message identification method and device, electronic equipment and medium
CN113238971A (en) Automatic penetration testing system and method based on state machine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200214