CN108768954B - DGA malicious software identification method - Google Patents
- Publication number
- CN108768954B (application number CN201810419555.8A)
- Authority
- CN
- China
- Prior art keywords
- host
- dga
- infected
- random walk
- domain name
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
- H04L63/145—Countermeasures against malicious traffic the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L61/00—Network arrangements, protocols or services for addressing or naming
- H04L61/45—Network directories; Name-to-address mapping
- H04L61/4505—Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
- H04L61/4511—Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
Abstract
The invention discloses a DGA malware identification method that quickly identifies DGA malware by exploiting a weakness of DGA technology. Because a host infected by DGA malware does not know the domain name of its control server, it must keep randomly generating domain names and attempting to connect until it successfully reaches the control server. Building on this weakness and borrowing the idea of a random walk, the invention treats each failed domain name connection by the host as one random walk step, provides a method for calculating the random walk increment, and judges whether the host is infected by DGA malware by comparing the number of random walk steps and the sum of the random walk increments with preset thresholds. The method can complete detection before the infected host connects to the control server, effectively curbs the use of DGA malware, and has broad application prospects in the field of network security.
Description
Technical Field
The invention belongs to the technical field of network security, and particularly relates to a DGA (Domain Generation Algorithm) malicious software identification method.
Background
Early DGA techniques were used mainly in botnets, and many related detection methods identify DGA malware from botnet characteristics (including synchronicity, periodicity, and node correlation). More recently, DGA techniques have also been used in ransomware. Ransomware does not exhibit these botnet characteristics, so the conventional detection methods are difficult to apply in such scenarios.
The key weakness of DGA malware is that the infected host does not know the domain name of the control server: it must keep generating domain names with a random algorithm and attempting to connect until it successfully connects to the control server's domain name. DGA malware can therefore be identified by analyzing the number of domain names for which the connection fails together with the characteristics of the domain names themselves.
The detection methods proposed in patent CN106576058A "System and method to detect domain generation algorithm malware and systems infected by such malware", patent CN106992969A "Detection method for DGA-generated domain names based on domain name string statistical features", patent CN105577660A "DGA domain name detection method based on random forest", patent CN107046586A "Algorithm-generated domain name detection method based on natural-language-like features", and patent US2013191915 (A1) "Method and system for detecting DGA-based malware" all take a single domain name, or parameters related to that domain name, as the object of analysis and detection. Because the actual network environment contains a large number of normal domain names, especially short ones, these detection methods all have a high false alarm rate.
Disclosure of Invention
Problem solved by the invention: targeting the key weakness of DGA malware, namely that an infected host does not know the domain name of the control server, the invention provides a DGA malware identification method that can detect the infected host before it connects to the control server.
Technical solution: to achieve this purpose, the invention adopts the following technical solution.
A DGA malware identification method comprising the steps of:
a) each failed domain name connection by the host is treated as one random walk step; the random walk increment Δ_i is calculated from the domain name of the host's i-th failed connection, and the sum Λ_n of the increments of the first n random walk steps is obtained;
b) when Λ_n is greater than a preset upper threshold B_u, or the number of random walk steps n exceeds a preset threshold B_s, the host is judged to be infected by DGA malware; when Λ_n is less than a lower threshold B_l, the host is judged not to be infected by DGA malware;
c) when the host is judged to be infected, an alarm is raised and Λ_1 = 0 is reset; when the host is judged to be normal, Λ_1 = 0 is reset directly.
Further, in step a), the random walk increment Δ_i is calculated by [formula missing from the source text], where l is the domain name length, and Pr(α_0) and Pr(α_k | α_{k-1}) are derived statistically from the top 10 million Alexa-ranked domain names: Pr(α_0) is the statistical probability that the initial character of a domain name is α_0, and Pr(α_k | α_{k-1}) is the probability that the k-th character is α_k given that the (k-1)-th character is α_{k-1}.
Further, Pr(α_k | α_{k-1}) is calculated as Pr(α_k | α_{k-1}) = N(α_{k-1}α_k) / N(α_{k-1}·), where N(α_{k-1}α_k) is the number of times the two-character sequence α_{k-1}α_k occurs in all domain names, and N(α_{k-1}·) is the number of occurrences in all domain names of two-character sequences whose first character is α_{k-1}.
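The two statistics described above can be estimated by simple counting. The following is a minimal sketch, assuming the benign domain names are available as plain strings; the function name `build_bigram_model` and the three-name sample list are hypothetical and stand in for the Alexa-ranked list.

```python
from collections import Counter


def build_bigram_model(domains):
    """Estimate Pr(a_0) and Pr(a_k | a_{k-1}) from a list of benign domain names."""
    first_counts = Counter()    # how often each character starts a domain name
    bigram_counts = Counter()   # occurrences of each two-character sequence
    prefix_counts = Counter()   # occurrences of two-character sequences by first character

    for name in domains:
        name = name.lower()
        if not name:
            continue
        first_counts[name[0]] += 1
        for prev, cur in zip(name, name[1:]):
            bigram_counts[(prev, cur)] += 1
            prefix_counts[prev] += 1

    total = sum(first_counts.values())
    pr_first = {c: n / total for c, n in first_counts.items()}
    # Conditional probability = bigram count / count of bigrams with the same first character
    pr_next = {
        (prev, cur): n / prefix_counts[prev]
        for (prev, cur), n in bigram_counts.items()
    }
    return pr_first, pr_next


# Toy sample; a real deployment would load a large list of popular domain names.
pr_first, pr_next = build_bigram_model(["google", "gmail", "yahoo"])
```

Character pairs that never occur in the reference list are absent from `pr_next`; how to treat them (for example by smoothing) is not specified in the text and is left open here.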
Further, in step b), the upper threshold B_u and the lower threshold B_l are calculated from the false negative rate fnr and the false positive rate fpr by [formula missing from the source text]. The false positive rate refers to the case in which the host is not infected but the result of the determination is that the host is infected.
Further, in step b), the false negative rate fnr, the false positive rate fpr, and the threshold B_s can be determined comprehensively according to factors such as system security requirements and current network conditions.
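The formula relating B_u and B_l to fnr and fpr did not survive in this text. As an illustration only, the sketch below uses the classical bounds of Wald's sequential probability ratio test. This is an assumption, not necessarily the formula of the invention, and it presumes that the increments behave like log-likelihood ratios. The function name `wald_thresholds` is hypothetical.

```python
import math


def wald_thresholds(fnr, fpr):
    """Classical SPRT bounds on a log-likelihood-ratio sum (assumed form, not from the patent)."""
    if not (0 < fnr < 1 and 0 < fpr < 1):
        raise ValueError("fnr and fpr must lie strictly between 0 and 1")
    upper = math.log((1 - fnr) / fpr)   # rising above this bound -> decide "infected"
    lower = math.log(fnr / (1 - fpr))   # falling below this bound -> decide "normal"
    return upper, lower


# Example error rates chosen for illustration only.
b_u, b_l = wald_thresholds(fnr=0.01, fpr=0.001)
```

Under this assumed form, smaller error rates push the two bounds further apart, so more failed connections are needed before a decision is reached.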
Compared with the prior art, the invention has the following beneficial effects. Existing DGA malware identification methods fall broadly into two categories. The first judges DGA domain names by extracting and analyzing the features of individual domain names; because the actual network environment contains a large number of normal but irregular domain names, especially short ones, this approach has a high false alarm rate. The second relies on botnet characteristics, that is, it judges whether a domain name is abnormal by analyzing the features of multiple connections, so a DGA domain name can be detected only after the connection request has completed. The DGA malware identification method of the invention, based on a threshold random walk algorithm, needs no malicious samples as a training set and can complete detection before the infected host connects to the control server; the threshold random walk algorithm maximizes the detection rate while guaranteeing detection accuracy. Experiments verify that the false alarm rate can be kept below 3%, which demonstrates the effectiveness of the method.
Drawings
FIG. 1 is a diagram of a finite state machine according to the present invention.
Detailed Description
The following detailed description of specific embodiments of the invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention, but not to limit the scope of the invention.
The invention first requires setting a false negative rate fnr and a false positive rate fpr, from which the upper threshold B_u and the lower threshold B_l for the random walk increment sum are calculated. The step-count threshold B_s of the random walk is not set at this stage.
The maximum number of ending steps S for normal access under these parameters is then measured, and the step-count threshold B_s of the random walk is set according to S.
For example, if S is 12, B_s may be set to 15.
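Claim 4 restricts B_s to the interval [S, S × 150%]. The sketch below checks the example value against that interval; the helper name `step_threshold_range` and the rounding down to a whole number of steps are assumptions made for illustration.

```python
import math


def step_threshold_range(s):
    """Return the inclusive integer range [S, 1.5 * S] allowed for B_s."""
    if s < 1:
        raise ValueError("S must be a positive step count")
    return s, math.floor(s * 1.5)


low, high = step_threshold_range(12)   # measured maximum ending steps S = 12
b_s = 15                               # value used in the example above
assert low <= b_s <= high
```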
In the detection phase, as shown in FIG. 1, Λ_1 = 0 in the initial state. When the host attempts a domain name connection, Λ_n remains unchanged if the connection succeeds; if the connection fails, the random walk increment Δ_i and the random walk increment sum Λ_n are calculated according to the following formulas: [formula for Δ_i missing from the source text]; Λ_n = Σ_i Δ_i.
Here l is the domain name length, Pr(α_0) is the statistical probability that the initial character of a domain name is α_0, and Pr(α_k | α_{k-1}) is the probability that the k-th character is α_k given that the (k-1)-th character is α_{k-1}. Pr(α_k | α_{k-1}) is calculated as Pr(α_k | α_{k-1}) = N(α_{k-1}α_k) / N(α_{k-1}·), where N(α_{k-1}α_k) is the number of times the two-character sequence α_{k-1}α_k occurs in all domain names, and N(α_{k-1}·) is the number of occurrences in all domain names of two-character sequences whose first character is α_{k-1}.
When Λ_n is greater than the preset upper threshold B_u, or the number of random walk steps exceeds the preset threshold B_s, the host is judged to be infected by DGA malware; when Λ_n is less than the lower threshold B_l, the host is judged not to be infected by DGA malware.
When the host is judged to be infected, an alarm is raised and the host returns to the initial state, i.e., Λ_1 = 0 is reset. When the host is judged to be normal, it returns directly to the initial state and Λ_1 = 0 is reset.
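The detection phase can be sketched as a small per-host state machine. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `HostWalk`, the verdict strings, and the `increment` callable are hypothetical, and the increment function is passed in because the formula for Δ_i is not reproduced in this text. Resetting the step counter together with Λ is also an assumption about what returning to the initial state means.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class HostWalk:
    """Per-host state of the threshold random walk."""
    increment: Callable[[str], float]  # hypothetical scorer returning Delta_i for a domain
    b_u: float                         # upper threshold B_u
    b_l: float                         # lower threshold B_l
    b_s: int                           # step-count threshold B_s
    total: float = 0.0                 # running sum Lambda_n
    steps: int = 0                     # number of failed connections n

    def observe(self, domain: str, connected: bool) -> str:
        """Process one connection attempt; return 'infected', 'normal' or 'pending'."""
        if connected:
            return "pending"           # a successful connection leaves Lambda_n unchanged
        self.steps += 1
        self.total += self.increment(domain)
        if self.total > self.b_u or self.steps > self.b_s:
            verdict = "infected"       # the caller is expected to raise the alarm
        elif self.total < self.b_l:
            verdict = "normal"
        else:
            return "pending"
        # Return to the initial state (assumed to clear the step counter as well).
        self.total, self.steps = 0.0, 0
        return verdict


# Toy scorer: every failed domain adds 1.0 (placeholder, not the patent's Delta_i).
walk = HostWalk(increment=lambda d: 1.0, b_u=3.5, b_l=-2.0, b_s=15)
results = [walk.observe(f"x{i}.example", connected=False) for i in range(4)]
```

With this toy scorer the sum crosses B_u on the fourth failed connection, after which the walk restarts from the initial state.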
The above embodiments are merely illustrative and not restrictive. Those skilled in the art may make various changes and modifications without departing from the spirit and scope of the invention, and all equivalent technical solutions therefore fall within the scope of the invention.
Claims (4)
1. A DGA malware identification method is characterized by comprising the following steps:
(1) each failed domain name connection by the host is treated as one random walk step; the random walk increment Δ_i is calculated from the domain name of the host's i-th failed connection, and the sum Λ_n of the increments of the first n random walk steps is obtained;
(2) when Λ_n is greater than a preset upper threshold B_u, or n exceeds a preset threshold B_s, the host is judged to be infected by DGA malware; when Λ_n is less than a lower threshold B_l, the host is judged not to be infected by DGA malware;
(3) when the host is judged to be infected, an alarm is raised and Λ_1 = 0 is reset; when the host is judged to be normal, Λ_1 = 0 is reset directly;
in step (1), the random walk increment Δ_i is calculated by [formula missing from the source text], where l is the domain name length, Pr(α_0) is the statistical probability that the initial character of a domain name is α_0, and Pr(α_k | α_{k-1}) is the probability that the k-th character is α_k given that the (k-1)-th character is α_{k-1};
in step (1), the random walk increment sum Λ_n is Λ_n = Σ_i Δ_i;
in step (2), the upper threshold B_u and the lower threshold B_l are calculated from the false negative rate fnr and the false positive rate fpr by [formula missing from the source text]; the false positive rate refers to the case in which the host is not infected but the result of the determination is that the host is infected.
2. The DGA malware identification method of claim 1, wherein Pr(α_k | α_{k-1}) is calculated as Pr(α_k | α_{k-1}) = N(α_{k-1}α_k) / N(α_{k-1}·), where N(α_{k-1}α_k) is the number of times the two-character sequence α_{k-1}α_k occurs in all domain names, and N(α_{k-1}·) is the number of occurrences in all domain names of two-character sequences whose first character is α_{k-1}.
3. The DGA malware identification method of claim 1, wherein in step (1), if the host successfully connects to the domain name it accesses, Λ_n remains unchanged.
4. The DGA malware identification method of claim 1, wherein in step (2) the false negative rate fnr, the false positive rate fpr, and the threshold B_s are set as follows: the false negative rate fnr is set in the interval (0, 0.01] to ensure that as many abnormal accesses as possible are identified; the false positive rate fpr is set in the interval (0, 0.001] to ensure that normal access completes the whole identification process in a short time; the threshold B_s is set in the interval [S, S × 150%], where S is the maximum number of ending steps for normal access when the threshold B_s is not set, that is, the maximum number of random walk steps required for the host to be judged normal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810419555.8A CN108768954B (en) | 2018-05-04 | 2018-05-04 | DGA malicious software identification method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810419555.8A CN108768954B (en) | 2018-05-04 | 2018-05-04 | DGA malicious software identification method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108768954A CN108768954A (en) | 2018-11-06 |
CN108768954B true CN108768954B (en) | 2020-07-10 |
Family
ID=64010106
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810419555.8A Active CN108768954B (en) | 2018-05-04 | 2018-05-04 | DGA malicious software identification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108768954B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110278212A (en) * | 2019-06-26 | 2019-09-24 | 中国工商银行股份有限公司 | Link detection method and device |
CN112468484B (en) * | 2020-11-24 | 2022-09-20 | 山西三友和智慧信息技术股份有限公司 | Internet of things equipment infection detection method based on abnormity and reputation |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1859199A (en) * | 2006-02-20 | 2006-11-08 | 华为技术有限公司 | System and method for detecting network worm |
CN101175078A (en) * | 2006-10-30 | 2008-05-07 | 丛林网络公司 | Identification of potential network threats using a distributed threshold random walk |
CN101626377A (en) * | 2009-08-07 | 2010-01-13 | 成都市华为赛门铁克科技有限公司 | Method and device for detecting viruses |
CN101707539A (en) * | 2009-11-26 | 2010-05-12 | 成都市华为赛门铁克科技有限公司 | Method and device for detecting worm virus and gateway equipment |
CN103973663A (en) * | 2013-02-01 | 2014-08-06 | 中国移动通信集团河北有限公司 | Method and device for dynamic threshold anomaly traffic detection of DDOS (distributed denial of service) attack |
CN105072089A (en) * | 2015-07-10 | 2015-11-18 | 中国科学院信息工程研究所 | WEB malicious scanning behavior abnormity detection method and system |
CN105577660A (en) * | 2015-12-22 | 2016-05-11 | 国家电网公司 | DGA domain name detection method based on random forest |
CN105681313A (en) * | 2016-01-29 | 2016-06-15 | 博雅网信(北京)科技有限公司 | Flow detection system and method for virtualization environment |
CN106170002A (en) * | 2016-09-08 | 2016-11-30 | 中国科学院信息工程研究所 | A kind of Chinese counterfeit domain name detection method and system |
CN106576058A (en) * | 2014-08-22 | 2017-04-19 | 迈克菲股份有限公司 | System and method to detect domain generation algorithm malware and systems infected by such malware |
CN106992969A (en) * | 2017-03-03 | 2017-07-28 | 南京理工大学 | DGA based on domain name character string statistical nature generates the detection method of domain name |
CN107046586A (en) * | 2017-04-14 | 2017-08-15 | 四川大学 | A kind of algorithm generation domain name detection method based on natural language feature |
CN107592312A (en) * | 2017-09-18 | 2018-01-16 | 济南互信软件有限公司 | A kind of malware detection method based on network traffics |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8800036B2 (en) * | 2010-01-22 | 2014-08-05 | The School Of Electrical Engineering And Computer Science (Seecs), National University Of Sciences And Technology (Nust) | Method and system for adaptive anomaly-based intrusion detection |
US9922190B2 (en) * | 2012-01-25 | 2018-03-20 | Damballa, Inc. | Method and system for detecting DGA-based malware |
Also Published As
Publication number | Publication date |
---|---|
CN108768954A (en) | 2018-11-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10826684B1 (en) | System and method of validating Internet of Things (IOT) devices | |
CN106790186B (en) | Multi-step attack detection method based on multi-source abnormal event correlation analysis | |
KR101122650B1 (en) | Apparatus, system and method for detecting malicious code injected with fraud into normal process | |
JP6636096B2 (en) | System and method for machine learning of malware detection model | |
KR102210627B1 (en) | Method, apparatus and system for detecting malicious process behavior | |
CN102664875B (en) | Malicious code type detection method based on cloud mode | |
CN108111466A (en) | A kind of attack detection method and device | |
CN110581827B (en) | Detection method and device for brute force cracking | |
Shabtai et al. | F-sign: Automatic, function-based signature generation for malware | |
US10356113B2 (en) | Apparatus and method for detecting abnormal behavior | |
CN105046152B (en) | Malware detection method based on function call graph fingerprint | |
KR20080071862A (en) | Apparatus for detecting intrusion code and method using the same | |
CN109257393A (en) | XSS attack defence method and device based on machine learning | |
JP2010182019A (en) | Abnormality detector and program | |
CN108768954B (en) | DGA malicious software identification method | |
CN104598820A (en) | Trojan virus detection method based on feature behavior activity | |
WO2020134311A1 (en) | Method and device for detecting malware | |
CN114969766A (en) | Account locking bypassing logic vulnerability detection method, system and storage medium | |
US11916953B2 (en) | Method and mechanism for detection of pass-the-hash attacks | |
CN101719906B (en) | Worm propagation behavior-based worm detection method | |
CN111901286B (en) | APT attack detection method based on flow log | |
CN113839963B (en) | Network security vulnerability intelligent detection method based on artificial intelligence and big data | |
CN113709097B (en) | Network risk sensing method and defense method | |
Ponomarev et al. | Session duration based feature extraction for network intrusion detection in control system networks | |
CN115373834A (en) | Intrusion detection method based on process call chain |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||