CN105119876B - Method and system for detecting automatically generated domain names

Publication number
CN105119876B
Authority
CN
China
Prior art keywords
domain name
abnormality degree
layer
character
sample
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510368044.4A
Other languages
Chinese (zh)
Other versions
CN105119876A (en)
Inventor
肖军
云晓春
张永铮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Application filed by Institute of Information Engineering of CAS
Priority to CN201510368044.4A
Publication of CN105119876A
Application granted
Publication of CN105119876B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic

Abstract

The invention discloses a method and system for detecting automatically generated domain names. The method comprises: 1) establishing a sample set of normal domain names and, for each sample domain name: counting the length distribution of each layer of the domain name and calculating the length abnormality degree of each layer; counting the transition probabilities between characters in each layer and calculating the character-transition abnormality degree of the corresponding layer; counting and calculating the entropy abnormality value of each character in each layer and calculating the character entropy abnormality degree of each layer; and calculating the total abnormality degree of the sample domain name from these results; 2) setting an abnormality threshold according to the total abnormality degrees of all sample domain names; 3) having a detection module calculate the total abnormality degree of a domain name to be detected and, if the value is greater than the threshold, treating that domain name as automatically generated. Training and detection are simple and fast, which meets the needs of online detection.

Description

Method and system for detecting automatically generated domain names
Technical field
The invention belongs to the field of network security detection, and specifically relates to a method and system for detecting domain names automatically generated by an algorithm.
Background
Domain names are usually chosen by people, and the domain name of a website does not normally change often. An algorithmically generated domain name, by contrast, is produced automatically by a computer running a domain generation algorithm, typically combined with the current time. Some attackers now use algorithmically generated domain names to improve the survivability of botnets or fast-flux networks. Because the new domain names are generated randomly and can change every day, traditional blacklist mechanisms cannot defend against them, which greatly increases the difficulty of detection and response.
Current methods for detecting algorithmically generated domain names mainly rely on machine learning, using a trained detection model. Their drawback is that they need a sufficient number of malicious domain name samples (that is, algorithmically generated domain names) covering the various generation algorithms. Such samples are hard to obtain and rarely cover every generation scenario.
Summary of the invention
In view of the technical problems in the prior art, the purpose of the invention is to provide a method for detecting algorithmically generated domain names. The invention needs only normal domain names as samples, which are easier to obtain. Compared with existing machine-learning-based methods, training and detection are simpler and faster, which meets the needs of online detection.
The technical solution of the invention is as follows:
A method for detecting automatically generated domain names, comprising the steps of:
1) establishing a sample set in which every sample domain name is a normal domain name; for each sample domain name in the sample set:
11) a layer-length abnormality training submodule counts the length distribution of each layer of the sample domain name, then calculates the length abnormality degree of each layer from that length distribution;
12) a character-transition abnormality training submodule counts the transition probabilities between characters in each layer of the sample domain name, then calculates the character-transition abnormality degree of the corresponding layer from those probabilities;
13) an entropy abnormality training submodule counts and calculates the entropy abnormality value of each character in each layer of the sample domain name, and calculates the character entropy abnormality degree of each layer;
14) a domain name abnormality combined training submodule calculates the total abnormality degree of the sample domain name from the length abnormality degree, character-transition abnormality degree and character entropy abnormality degree calculated above;
2) the domain name abnormality combined training submodule sets an abnormality threshold according to the total abnormality degrees of all sample domain names;
3) a detection module calculates the total abnormality degree of a domain name to be detected; if the value is greater than the abnormality threshold, the domain name to be detected is considered automatically generated.
Further, the layer entropy abnormality degree is calculated as follows: let the entropy abnormality degree of the j-th layer be D_entropy; then D_entropy = [formula omitted in source], where M is the number of distinct characters in the j-th layer of the sample domain name and p_i is the statistical probability of character i.
Further, the character-transition abnormality degree is calculated as follows: let the character-transition abnormality degree of the j-th layer be D_bigram; then D_bigram = [formula omitted in source], where N is the number of characters in the j-th layer of the sample domain name; the transition abnormality degree of the k-th character pair is [formula omitted in source]; MAX_Cnt is the maximum transition count over all characters in the j-th layer; and Cnt_ij is the transition count of the k-th character pair in the j-th layer, that is, from the i-th character to the j-th character.
Further, the length abnormality degree of a layer of length i is [formula omitted in source], where Cnt_i is the number of layers of length i among the sample domain names and Cnt_MAX is the largest number of layers sharing the same length.
Further, the abnormality degree of the l-th layer of the sample domain name is A_l = αD_entropy + βD_bigram + γD_length, where α + β + γ = 1.
Further, the overall abnormality degree of the sample domain name is [formula omitted in source], where L is the number of layers of the sample domain name.
A system for detecting automatically generated domain names, characterized by comprising a training module and a detection module, wherein the training module comprises a layer-length abnormality training submodule, a character-transition abnormality training submodule, an entropy abnormality training submodule and a domain name abnormality combined training submodule;
The layer-length abnormality training submodule counts the length distribution of each layer of each sample domain name, then calculates the length abnormality degree of each layer from that length distribution;
The character-transition abnormality training submodule counts the transition probabilities between characters in each layer of each sample domain name, then calculates the character-transition abnormality degree of the corresponding layer from those probabilities;
The entropy abnormality training submodule counts and calculates the entropy abnormality value of each character in each layer of each sample domain name, and calculates the character entropy abnormality degree of each layer;
The domain name abnormality combined training submodule calculates the total abnormality degree of each sample domain name from the length abnormality degree, character-transition abnormality degree and character entropy abnormality degree calculated above, then sets an abnormality threshold according to the total abnormality degrees of all sample domain names;
The detection module calculates the total abnormality degree of a domain name to be detected; if the value is greater than the abnormality threshold, the domain name to be detected is considered automatically generated.
Compared with the prior art, the positive effects of the invention are:
The technique and system of the invention meet the demands of real-time online processing: training is simple, detection is fast, accuracy is high, and the approach is practical, as shown in Table 1.
Table 1. Detection results of the invention
Brief description of the drawings
Fig. 1 is the system flow chart of the invention;
Fig. 2 is the system structure diagram of the invention;
Fig. 3 is a flow chart of a detection example of the invention;
Fig. 4 is a structure diagram of deployment mode 1 of the invention;
Fig. 5 is a structure diagram of deployment mode 2 of the invention.
Detailed description
The principles and features of the invention are described below with reference to the drawings. The examples given serve only to explain the invention and are not intended to limit its scope.
The invention discloses a method and system for detecting algorithmically generated domain names. Based on existing normal domain names, the system learns a behavior profile of legitimate domain names through training; by calculating the abnormality degree of a suspicious domain name, it can effectively find algorithmically generated domain names.
(1) The system flow is shown in Fig. 1. The system comprises two main processes: training and detection. Training on legitimate domain names yields the abnormality threshold of legitimate domain names. The abnormality degree of a domain name to be detected is then calculated; if the value is greater than the abnormality threshold, the domain name is considered algorithmically generated.
(2) The system modules are shown in Fig. 2. The system comprises a training module and a detection module. The training module comprises a layer-length abnormality training submodule, a character-transition abnormality training submodule, an entropy abnormality training submodule and a domain name abnormality combined training submodule. The detection module comprises a layer-length abnormality detection submodule, a character-transition abnormality detection submodule, an entropy abnormality detection submodule and a domain name abnormality comprehensive detection submodule.
The function of each submodule is as follows:
(1) Layer-length abnormality training submodule: counts the length distribution of each layer of each sample domain name and calculates the length abnormality degree of each layer of the normal domain names. All samples are normal domain names.
(2) Character-transition abnormality training submodule: counts the transition probability between each pair of characters and calculates the character-transition abnormality degree in each layer of the normal domain names. For example, for www.sina.com.cn, cn is the first layer, com is the second layer and sina is the third layer. Consider the third layer: suppose that, after counting character transitions in the third layer of the samples, the number of times s is followed by i is N1 and the number of times s is followed by any character (including i) is N; then the transition probability from s to i is N1/N. The entropy and length statistics are likewise computed per layer.
(3) Entropy abnormality training submodule: counts and calculates the entropy abnormality value of each character in each layer, and calculates the character entropy abnormality degree of each layer of the normal domain names.
(4) Domain name abnormality combined training submodule: based on the abnormality results of the layer-length abnormality training submodule, the character-transition abnormality training submodule and the entropy abnormality training submodule, calculates the abnormality degree of each layer of a normal domain name, then the total abnormality degree of the normal domain name, and finally sets the abnormality threshold for domain names.
(5) Layer-length abnormality detection submodule: calculates the length abnormality degree of each layer of a suspicious domain name from the per-layer length distribution obtained by the layer-length abnormality training submodule.
(6) Character-transition abnormality detection submodule: calculates the character-transition abnormality degree of each layer of a suspicious domain name from the transition probabilities between characters counted by the character-transition abnormality training submodule.
(7) Entropy abnormality detection submodule: calculates the character entropy abnormality degree of each layer of a suspicious domain name from the per-layer character entropies obtained by the entropy abnormality training submodule during training.
(8) Domain name abnormality comprehensive detection submodule: obtains the abnormality degree of each layer from its length abnormality degree, character-transition abnormality degree and entropy abnormality degree, then calculates the overall abnormality degree of the domain name. Finally, by comparing that value with the abnormality threshold, it determines whether the domain name is algorithmically generated.
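As an illustration only, the layer numbering and the transition ("jump") probability N1/N described for the character-transition training submodule can be sketched in Python as follows. The function names and the sample domain names other than www.sina.com.cn are hypothetical and do not come from the patent.

```python
from collections import Counter


def split_layers(domain):
    """Return the labels of a domain ordered from the top level down.

    For "www.sina.com.cn" this gives ["cn", "com", "sina", "www"], so
    index 0 is the first layer, index 1 the second layer, and so on.
    """
    return domain.lower().strip(".").split(".")[::-1]


def transition_probability(labels, first, second):
    """Estimate P(second follows first) from adjacent character pairs."""
    pair_counts = Counter()   # N1: times `first` is followed by `second`
    first_counts = Counter()  # N: times `first` is followed by any character
    for label in labels:
        for a, b in zip(label, label[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    if first_counts[first] == 0:
        return 0.0
    return pair_counts[(first, second)] / first_counts[first]


# Hypothetical normal samples; only the third layer is examined here.
samples = ["www.sina.com.cn", "mail.sohu.com.cn", "news.sina.com.cn"]
third_layer = [split_layers(d)[2] for d in samples]
print(third_layer)
print(transition_probability(third_layer, "s", "i"))
```

In this toy sample, s is followed by i twice and by some character three times, so the estimate is N1/N = 2/3.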
The training and detection processes are further illustrated in Fig. 3. Fccb.chc.dg is a normal domain name used for training; cfad.hklf.ac may be either a normal domain name used for training or a suspicious domain name to be detected. If it is a normal domain name, the detection threshold is set from the abnormality degrees of multiple normal domain names; if it is a domain name to be detected, its calculated abnormality degree is compared with the detection threshold already set, in order to find algorithmically generated domain names.
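The two uses of a total abnormality degree (setting the threshold during training, comparing against it during detection) might look like the sketch below. The patent text does not state how the threshold is derived from the sample scores; taking an empirical quantile is an assumption made here, and the function names and scores are hypothetical.

```python
def set_threshold(sample_scores, quantile=0.99):
    """Choose a threshold from the total abnormality degrees of normal samples.

    Assumption: the rule is not given in the source; this sketch picks the
    score at the given empirical quantile of the sorted sample scores.
    """
    ordered = sorted(sample_scores)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def is_generated(score, threshold):
    """Flag a domain whose total abnormality degree exceeds the threshold."""
    return score > threshold


# Hypothetical total abnormality degrees of normal training domains.
normal_scores = [0.10, 0.12, 0.15, 0.20, 0.22]
threshold = set_threshold(normal_scores, quantile=1.0)
print(threshold)
print(is_generated(0.18, threshold), is_generated(0.65, threshold))
```

A lower quantile makes the detector more sensitive at the cost of flagging some normal domain names.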
(3) The individual calculation methods are described below.
1) Calculation of the layer entropy abnormality degree:
Suppose a layer of the samples contains M distinct characters and the statistical probability of character i is p_i; the entropy abnormality degree of that layer is then [formula omitted in source].
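The entropy formula itself is missing from this text. As a hedged sketch, the code below estimates the statistical probability p_i of each of the M distinct characters in one layer and computes the Shannon entropy, -sum(p_i * log2(p_i)). Using Shannon entropy as the layer entropy measure is an assumption, not the patent's confirmed formula, and the function names are hypothetical.

```python
import math
from collections import Counter


def character_probabilities(labels):
    """Estimate p_i for every distinct character seen in one layer."""
    counts = Counter("".join(labels))
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def shannon_entropy(probabilities):
    """Shannon entropy in bits (assumed stand-in for the omitted formula)."""
    return -sum(p * math.log2(p) for p in probabilities.values() if p > 0)


# Hypothetical labels from one layer of the training samples.
probs = character_probabilities(["sina", "sohu"])
print(len(probs))              # M, the number of distinct characters
print(shannon_entropy(probs))
```

A label drawn from many equally likely characters yields a high value, while a label that repeats one character yields zero.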
2) Calculation of the character-transition abnormality degree:
Assume that the character string of any layer begins with "$" and ends with "$". For example, "google" is treated as "$google$", which contains the character transitions "$g", "go", "oo", "og", "gl", "le" and "e$". The transition count of each of these two-character groups is tallied separately. Let MAX_Cnt be the maximum transition count over all characters, and let the k-th character pair, a transition from the i-th character to the j-th character, have transition count Cnt_ij. The transition abnormality degree of the k-th character pair is then [formula omitted in source].
After training, the resulting transition matrix is [matrix omitted in source].
The transition abnormality degree of the whole character string is then [formula omitted in source],
where N is the number of characters of the sample in this layer.
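The "$" padding and the per-pair transition counts can be sketched as below. Because the pair and string formulas are missing from this text, the scoring used here (1 - Cnt_ij / MAX_Cnt per pair, averaged over the pairs of the string) is only an assumption; the function names and training labels are hypothetical.

```python
from collections import Counter


def transitions(label):
    """Pad a layer with "$" on both sides and list its character pairs."""
    padded = "$" + label + "$"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def train_transition_counts(labels):
    """Count every character transition (Cnt_ij) over the training labels."""
    counts = Counter()
    for label in labels:
        counts.update(transitions(label))
    return counts


def bigram_abnormality(label, counts):
    """Assumed score: mean of 1 - Cnt_ij / MAX_Cnt over the label's pairs.

    `counts` must come from a non-empty training set.
    """
    max_count = max(counts.values())  # MAX_Cnt
    pairs = transitions(label)
    scores = [1.0 - counts.get(pair, 0) / max_count for pair in pairs]
    return sum(scores) / len(pairs)


print(transitions("google"))
counts = train_transition_counts(["google", "good"])
print(bigram_abnormality("google", counts))
print(bigram_abnormality("xqzvk", counts))
```

Under this assumed scoring, a label made only of unseen transitions scores 1.0, and a label made of the most frequent transitions scores close to 0.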
3) Calculation of the layer length abnormality degree:
Suppose the number of layers of length i in the samples is Cnt_i and the largest number of layers sharing the same length is Cnt_MAX; the abnormality degree of a layer of length i is then [formula omitted in source].
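A minimal sketch of the length statistics follows. Cnt_i and Cnt_MAX are taken from the text, but the formula combining them is missing here, so the score 1 - Cnt_i / Cnt_MAX is an assumption; the function names and training labels are hypothetical.

```python
from collections import Counter


def train_length_counts(labels):
    """Count how many training labels have each length (Cnt_i)."""
    return Counter(len(label) for label in labels)


def length_abnormality(label, length_counts):
    """Assumed score: 1 - Cnt_i / Cnt_MAX for the label's length i."""
    max_count = max(length_counts.values())  # Cnt_MAX
    return 1.0 - length_counts.get(len(label), 0) / max_count


# Hypothetical labels from one layer of normal domain names.
length_counts = train_length_counts(["sina", "sohu", "baidu"])
print(length_abnormality("abcd", length_counts))
print(length_abnormality("abcde", length_counts))
print(length_abnormality("abcdefghijkl", length_counts))
```

With this assumed score, the most common length maps to 0 and a length never seen in training maps to 1.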
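4) Calculation of the layer abnormality degree:
The abnormality degree of the l-th layer is A_l = αD_entropy + βD_bigram + γD_length, where α + β + γ = 1.
5) Calculation of the domain name abnormality degree:
The overall abnormality degree of the domain name is computed over its L layers [formula omitted in source].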
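The weighted combination A_l = αD_entropy + βD_bigram + γD_length with α + β + γ = 1 is stated in the text and can be sketched directly. The equal default weights and the use of a mean over the L layers for the overall abnormality degree are assumptions, since the overall formula is not reproduced here; the function names are hypothetical.

```python
def layer_abnormality(d_entropy, d_bigram, d_length,
                      alpha=1 / 3, beta=1 / 3, gamma=1 / 3):
    """A_l = alpha * D_entropy + beta * D_bigram + gamma * D_length."""
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("alpha + beta + gamma must equal 1")
    return alpha * d_entropy + beta * d_bigram + gamma * d_length


def domain_abnormality(layer_scores):
    """Assumed overall score: mean of A_l over the L layers of the domain."""
    return sum(layer_scores) / len(layer_scores)


# Hypothetical per-layer component scores for a three-layer domain name.
layers = [
    layer_abnormality(0.2, 0.4, 0.6, alpha=0.5, beta=0.25, gamma=0.25),
    layer_abnormality(0.1, 0.1, 0.1),
    layer_abnormality(0.9, 0.8, 1.0),
]
print(domain_abnormality(layers))
```

The weights let a deployment emphasize whichever of the three features separates its normal traffic best.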
(4) System deployment modes
The system can be deployed in two ways. The first collects data passively on the backbone router side: the router splits off a copy of the DNS access packets and sends it to our system for detection, as shown in Fig. 4.
The second, shown in Fig. 5, splits the traffic at the access router on the DNS server side and forwards all request packets to our system.
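In either deployment mode the detector first needs the queried domain name from each DNS request packet. The sketch below is not part of the patent; it assumes the input is the raw DNS message (the UDP payload) and relies on the standard DNS wire format, in which a 12-byte header is followed by the question name encoded as length-prefixed labels ending with a zero byte. The function name is hypothetical.

```python
def parse_query_name(message):
    """Extract the queried domain name from a raw DNS message."""
    offset = 12  # the DNS header has a fixed size of 12 bytes
    labels = []
    while True:
        length = message[offset]
        if length == 0:
            break  # a zero-length label terminates the name
        if length & 0xC0:
            raise ValueError("compressed or invalid label in question")
        offset += 1
        labels.append(message[offset:offset + length].decode("ascii"))
        offset += length
    return ".".join(labels)


# Hand-built query for www.sina.com.cn: header, name, type A, class IN.
header = bytes(12)
question = b"\x03www\x04sina\x03com\x02cn\x00" + b"\x00\x01\x00\x01"
print(parse_query_name(header + question))
```

The extracted name can then be split into layers and scored by the detection module.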

Claims (6)

1. A method for detecting automatically generated domain names, comprising the steps of:
1) establishing a sample set in which every sample domain name is a normal domain name; for each sample domain name in the sample set:
11) a layer-length abnormality training submodule counts the length distribution of each layer of the sample domain name, then calculates the length abnormality degree of each layer from said length distribution;
12) a character-transition abnormality training submodule counts the transition probabilities between characters in each layer of the sample domain name, then calculates the character-transition abnormality degree of the corresponding layer from said transition probabilities; the character-transition abnormality degree is calculated as follows: let the character-transition abnormality degree of the j-th layer be D_bigram; then D_bigram = [formula omitted in source], where N is the number of characters in the j-th layer of the sample domain name; the transition abnormality degree of the k-th character pair is [formula omitted in source]; MAX_Cnt is the maximum transition count over all characters in the j-th layer; and Cnt_ij is the transition count of the k-th character pair in the j-th layer, that is, from the i-th character to the j-th character;
13) an entropy abnormality training submodule counts and calculates the entropy abnormality value of each character in each layer of the sample domain name, and calculates the layer entropy abnormality degree of each layer; the layer entropy abnormality degree is calculated as follows: let the layer entropy abnormality degree of the j-th layer be D_entropy; then D_entropy = [formula omitted in source], where M is the number of distinct characters in the j-th layer of the sample domain name and p_i is the statistical probability of character i;
14) a domain name abnormality combined training submodule calculates the total abnormality degree of the sample domain name from the length abnormality degree, character-transition abnormality degree and layer entropy abnormality degree calculated above;
2) the domain name abnormality combined training submodule sets an abnormality threshold according to the total abnormality degrees of all sample domain names;
3) a detection module calculates the total abnormality degree of a domain name to be detected; if the value is greater than the abnormality threshold, the domain name to be detected is considered an automatically generated domain name.
2. The method of claim 1, characterized in that the length abnormality degree of a layer of length i is [formula omitted in source], where Cnt_i is the number of layers of length i among the sample domain names and Cnt_MAX is the largest number of layers sharing the same length.
3. The method of claim 2, characterized in that the abnormality degree of the l-th layer of the sample domain name is A_l = αD_entropy + βD_bigram + γD_length, where α + β + γ = 1.
4. The method of claim 3, characterized in that the overall abnormality degree of the sample domain name is [formula omitted in source], where L is the number of layers of the sample domain name.
5. A system for detecting automatically generated domain names, characterized by comprising a training module and a detection module, wherein the training module comprises a layer-length abnormality training submodule, a character-transition abnormality training submodule, an entropy abnormality training submodule and a domain name abnormality combined training submodule;
The layer-length abnormality training submodule counts the length distribution of each layer of each sample domain name, then calculates the length abnormality degree of each layer from said length distribution;
The character-transition abnormality training submodule counts the transition probabilities between characters in each layer of each sample domain name, then calculates the character-transition abnormality degree of the corresponding layer from said transition probabilities; the character-transition abnormality training submodule uses the formula [formula omitted in source] to calculate the character-transition abnormality degree D_bigram of the j-th layer, where N is the number of characters in the j-th layer of the sample domain name; the transition abnormality degree of the k-th character pair is [formula omitted in source]; MAX_Cnt is the maximum transition count over all characters in the j-th layer; and Cnt_ij is the transition count of the k-th character pair in the j-th layer, that is, from the i-th character to the j-th character;
The entropy abnormality training submodule counts and calculates the entropy abnormality value of each character in each layer of each sample domain name, and calculates the layer entropy abnormality degree of each layer; the layer entropy abnormality degree is calculated as follows: let the layer entropy abnormality degree of the j-th layer be D_entropy; then D_entropy = [formula omitted in source], where M is the number of distinct characters in the j-th layer of the sample domain name and p_i is the statistical probability of character i;
The domain name abnormality combined training submodule calculates the total abnormality degree of each sample domain name from the length abnormality degree, character-transition abnormality degree and layer entropy abnormality degree calculated above, then sets an abnormality threshold according to the total abnormality degrees of all sample domain names;
The detection module calculates the total abnormality degree of a domain name to be detected; if the value is greater than the abnormality threshold, the domain name to be detected is considered an automatically generated domain name.
6. The system of claim 5, characterized in that the layer-length abnormality training submodule uses the formula [formula omitted in source] to calculate the length abnormality degree D_length of a layer of length i, where Cnt_i is the number of layers of length i among the sample domain names and Cnt_MAX is the largest number of layers sharing the same length.
CN201510368044.4A 2015-06-29 2015-06-29 A kind of detection method and system of the domain name automatically generated Active CN105119876B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510368044.4A CN105119876B (en) 2015-06-29 2015-06-29 A kind of detection method and system of the domain name automatically generated

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510368044.4A CN105119876B (en) 2015-06-29 2015-06-29 A kind of detection method and system of the domain name automatically generated

Publications (2)

Publication Number Publication Date
CN105119876A CN105119876A (en) 2015-12-02
CN105119876B true CN105119876B (en) 2019-01-18

Family

ID=54667769

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510368044.4A Active CN105119876B (en) 2015-06-29 2015-06-29 A kind of detection method and system of the domain name automatically generated

Country Status (1)

Country Link
CN (1) CN105119876B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107770132B (en) * 2016-08-18 2021-11-05 中兴通讯股份有限公司 Method and device for detecting algorithmically generated domain name
CN109120733B (en) * 2018-07-20 2021-06-01 杭州安恒信息技术股份有限公司 Detection method for communication by using DNS (Domain name System)
CN108881517B (en) * 2018-08-01 2021-08-24 北京闲徕互娱网络科技有限公司 Domain name pool automatic management method and system
US10764246B2 (en) * 2018-08-14 2020-09-01 Didi Research America, Llc System and method for detecting generated domain
CN114885334B (en) * 2022-07-13 2022-09-27 安徽创瑞信息技术有限公司 High-concurrency short message processing method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101645884A (en) * 2009-08-26 2010-02-10 西安理工大学 Multi-measure network abnormity detection method based on relative entropy theory
CN102882881A (en) * 2012-10-10 2013-01-16 常州大学 Special data filtering method for eliminating denial-of-service attacks to DNS (domain name system) service

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8200696B2 (en) * 2005-05-26 2012-06-12 International Business Machines Corporation Presenting multiple possible selectable domain names from a URL entry

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
《域名解析文件自动生成技术研究》;王莉军;《计算机技术与发展》;20130630;第23卷(第6期);全文 *
《基于会话异常度模型的应用层分布式拒绝服务攻击过滤》;肖军等;《计算机学报》;20100930;第33卷(第9期);第3页-第7页 *

Also Published As

Publication number Publication date
CN105119876A (en) 2015-12-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant