CN115550021A - Method and system for accurately replicating network space in big data environment and storage medium - Google Patents
- Publication number
- CN115550021A (application number CN202211172640.1A)
- Authority
- CN
- China
- Prior art keywords
- domain name
- detected
- big data
- message
- malicious
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0227—Filtering policies
- H04L63/0236—Filtering by address, protocol, port number or service, e.g. IP-address or URL
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
- H04L63/1466—Active attacks involving interception, injection, modification, spoofing of data unit addresses, e.g. hijacking, packet injection or TCP sequence number attacks
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a method, a system and a storage medium for precise countermeasures in cyberspace in a big data environment, and relates to the field of network security. The method comprises the following steps: receiving a message to be detected and obtaining the communication data in the message; capturing DNS queries from that communication data and extracting the diagnosis target domain name; diagnosing the diagnosis target domain name with a DGA detection process; if the diagnosis target domain name is judged to be a malicious domain name, starting a countermeasure mechanism; and interrupting the communication nodes of the malicious domain name through the countermeasure mechanism. The invention improves the ability to counter network security attacks and helps safeguard cyberspace.
Description
Technical field
The present invention relates to the field of network security, and more specifically to a method, a system and a storage medium for precise countermeasures in cyberspace in a big data environment.
Background
The rapid development of network information technology has made life and work far more convenient, but networks also suffer from many security problems. Because the online world is a virtual space, criminals exploit that virtuality to intrude illegally and steal other people's private information, which causes serious security problems.
Moreover, with the spread of attack platforms, commercial Trojans and open-source malicious tools, network attack and defense keep growing deeper and more sophisticated. Analysis technologies such as artificial intelligence and big data urgently need to be introduced to improve the ability to counter network attacks and to deter and counter attackers.
Summary of the invention
In view of this, the present invention provides a method, a system and a storage medium for precise countermeasures in cyberspace in a big data environment, which use DGA detection to identify malicious domain names, locate compromised hosts promptly, and carry out a precise counterattack.
To achieve the above object, the present invention adopts the following technical solutions:
In one aspect, a method for precise countermeasures in cyberspace in a big data environment is disclosed, comprising the following steps:
receiving a message to be detected and obtaining the communication data in the message;
capturing DNS queries from the communication data in the message to be detected, and extracting the diagnosis target domain name;
diagnosing the diagnosis target domain name with a DGA detection process;
if the diagnosis target domain name is judged to be a malicious domain name, starting a countermeasure mechanism;
interrupting the communication nodes of the malicious domain name through the countermeasure mechanism.
Optionally, the algorithm of the countermeasure mechanism is as follows:
building a network connection graph based on three criteria: average total communication traffic, average number of communication packets, and average number of communication connections;
deleting from the network graph the nodes that have never communicated with an abnormal domain name node, and selecting any abnormal domain name node as the starting node;
traversing the adjacent nodes from the starting node, and using the ant colony algorithm to find the shortest path between any two abnormal domain name nodes;
searching along the shortest paths and interrupting the node that appears most often, together with the abnormal domain name nodes, which completes the countermeasure mechanism.
Optionally, the DGA detection process comprises static feature detection and dynamic feature detection.
Optionally, the static feature detection is as follows: the static features of the diagnosis target domain name are extracted and a static feature classifier makes a judgment. For a judgment whose confidence is higher than a preset threshold, the conclusion is given directly, the process ends, and the diagnosis target domain name is added to the whitelist. For any other judgment, the domain name is classified as a "suspicious domain name" and enters the dynamic feature detection process.
Optionally, the dynamic feature detection is as follows: the dynamic features of the diagnosis target are extracted and a dynamic feature classifier makes a judgment. For a judgment whose confidence is higher than a preset threshold, the conclusion is given and the diagnosis target is placed in the corresponding blacklist or whitelist. For any other judgment, only the conclusion is given and the blacklist and whitelist are not modified.
Optionally, the method further comprises a database that stores the results of the DGA detection process. The database comprises a whitelist database and a blacklist database. The whitelist database stores safe destination hosts and destination server domain names. The blacklist database stores known malicious features, and the malicious feature detection engine matches against its contents. A temporary computation database stores temporary data storage addresses and the computation results of each module.
In another aspect, a system for precise countermeasures in cyberspace in a big data environment is also disclosed, comprising:
a module for receiving and obtaining the data to be detected, configured to receive a message to be detected and obtain the communication data in the message;
a diagnosis target domain name extraction module, configured to capture DNS queries from the communication data in the message to be detected and extract the diagnosis target domain name;
a DGA diagnosis module, configured to diagnose the diagnosis target domain name with the DGA detection process;
a malicious domain name interruption module, configured to start the countermeasure mechanism if the diagnosis target domain name is judged to be a malicious domain name, and to interrupt the communication nodes of the malicious domain name through the countermeasure mechanism.
Finally, a computer storage medium is disclosed, characterized in that a computer program is stored on the computer storage medium, and when the computer program is executed by a processor it implements the steps of any one of the methods for precise countermeasures in cyberspace in a big data environment described above.
As the above technical solutions show, compared with the prior art, the method, system and storage medium for precise countermeasures in cyberspace in a big data environment disclosed by the present invention have the following beneficial effects:
1. Traffic can be captured for analysis by later stages without interfering with normal communication traffic.
2. The ability to counter network security attacks is improved, which helps safeguard cyberspace.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the present invention;
Fig. 2 is a schematic structural diagram of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
An embodiment of the present invention discloses a method for precise countermeasures in cyberspace in a big data environment which, as shown in Fig. 1, comprises the following steps:
receiving a message to be detected and obtaining the communication data in the message;
capturing DNS queries from the communication data in the message to be detected, and extracting the diagnosis target domain name;
diagnosing the diagnosis target domain name with a DGA detection process;
if the diagnosis target domain name is judged to be a malicious domain name, starting a countermeasure mechanism;
interrupting the communication nodes of the malicious domain name through the countermeasure mechanism.
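The second step, capturing a DNS query and extracting the domain name to diagnose, can be illustrated with a minimal sketch. It assumes the communication data is the raw payload of a DNS-over-UDP message laid out as in RFC 1035; the function name `extract_query_name` is hypothetical and is not part of the patent.

```python
import struct
from typing import Optional


def extract_query_name(payload: bytes) -> Optional[str]:
    """Return the name in the first question of a DNS query, or None."""
    if len(payload) < 12:
        return None  # shorter than the fixed DNS header
    # Header fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    _id, flags, qdcount, _an, _ns, _ar = struct.unpack("!6H", payload[:12])
    if flags & 0x8000 or qdcount == 0:
        return None  # QR bit set (a response) or no question present
    labels = []
    pos = 12
    while pos < len(payload):
        length = payload[pos]
        if length == 0:  # zero-length label terminates the name
            return ".".join(labels).lower()
        if length & 0xC0:  # compression pointer, not expected in a question
            return None
        pos += 1
        if pos + length > len(payload):
            return None  # truncated label
        labels.append(payload[pos:pos + length].decode("ascii", errors="replace"))
        pos += length
    return None  # ran off the end without a terminator
```

Responses and malformed payloads yield `None`, so only well-formed queries are passed on to the diagnosis stage.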
In this embodiment, the algorithm of the countermeasure mechanism is as follows:
building a network connection graph based on three criteria: average total communication traffic, average number of communication packets, and average number of communication connections;
deleting from the network graph the nodes that have never communicated with an abnormal domain name node, and selecting any abnormal domain name node as the starting node;
traversing the adjacent nodes from the starting node, and using the ant colony algorithm to find the shortest path between any two abnormal domain name nodes;
searching along the shortest paths and interrupting the node that appears most often, together with the abnormal domain name nodes, which completes the countermeasure mechanism.
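The graph steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented procedure: breadth-first search on an unweighted graph stands in for the ant colony algorithm, the traffic-based criteria for building edges are replaced by a plain edge list, and "never communicated with an abnormal node" is read as "has no direct edge to one". The function names are hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


def shortest_path(graph, src, dst):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    parents = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in sorted(graph.get(node, ())):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


def nodes_to_interrupt(edges, abnormal):
    """Return the busiest relay node (if any) followed by the abnormal nodes."""
    abnormal = set(abnormal)
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    # Prune: keep abnormal nodes and nodes with a direct edge to one of them.
    keep = set(abnormal)
    for node in abnormal:
        keep |= graph.get(node, set())
    graph = {n: nbrs & keep for n, nbrs in graph.items() if n in keep}
    # Count how often each non-abnormal node lies on a shortest path
    # between a pair of abnormal nodes.
    hits = Counter()
    for src, dst in combinations(sorted(abnormal), 2):
        path = shortest_path(graph, src, dst)
        if path:
            hits.update(n for n in path if n not in abnormal)
    relay = [hits.most_common(1)[0][0]] if hits else []
    return relay + sorted(abnormal)
```

With weighted edges, Dijkstra's algorithm or an ant colony search would take the place of `shortest_path`; the counting step stays the same.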
The DGA detection process comprises static feature detection and dynamic feature detection.
The static feature detection is as follows: the static features of the diagnosis target domain name are extracted and a static feature classifier makes a judgment. For a judgment whose confidence is higher than a preset threshold, the conclusion is given directly, the process ends, and the diagnosis target domain name is added to the whitelist. For any other judgment, the domain name is classified as a "suspicious domain name" and enters the dynamic feature detection process.
The dynamic feature detection is as follows: the dynamic features of the diagnosis target are extracted and a dynamic feature classifier makes a judgment. For a judgment whose confidence is higher than a preset threshold, the conclusion is given and the diagnosis target is placed in the corresponding blacklist or whitelist. For any other judgment, only the conclusion is given and the blacklist and whitelist are not modified.
Specifically, the detection process is triggered by a DNS query capture event in the protected network. A diagnosis target first passes through the whitelist filter and the blacklist filter; on a hit, the conclusion is given immediately and the process ends, otherwise the process continues. The static features of the diagnosis target are then extracted and the static feature classifier makes a judgment: a high-confidence judgment yields a conclusion directly and ends the process, while any other judgment classifies the target as a "suspicious domain name" and sends it to the next stage. There the dynamic features of the diagnosis target are extracted and the dynamic feature classifier makes a judgment: for a high-confidence judgment, the conclusion is given and the diagnosis target is placed in the corresponding blacklist or whitelist; for any other judgment, only the conclusion is given and the lists are not modified.
Over the whole detection process, at most four judgments are made: the whitelist filter, the blacklist filter, the static feature classifier and the dynamic feature classifier.
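The four-stage cascade can be sketched as below. The sketch follows the detailed walkthrough, in which only high-confidence results from the dynamic stage update the lists. The classifier interface (a callable returning a label and a confidence), the labels "benign" and "malicious", and the default threshold of 0.9 are assumptions made for illustration.

```python
from typing import Callable, Set, Tuple

# Hypothetical classifier interface: domain -> (label, confidence in [0, 1]).
Classifier = Callable[[str], Tuple[str, float]]


def diagnose(domain: str, whitelist: Set[str], blacklist: Set[str],
             static_clf: Classifier, dynamic_clf: Classifier,
             threshold: float = 0.9) -> Tuple[str, str]:
    """Return (verdict, stage that produced it) for one domain name."""
    # Judgments 1 and 2: list filters end the process on a hit.
    if domain in whitelist:
        return "benign", "whitelist"
    if domain in blacklist:
        return "malicious", "blacklist"
    # Judgment 3: a confident static verdict ends the process.
    label, confidence = static_clf(domain)
    if confidence > threshold:
        return label, "static"
    # Judgment 4: the domain is "suspicious"; use dynamic features.
    label, confidence = dynamic_clf(domain)
    if confidence > threshold:
        # Only confident dynamic verdicts are written back to the lists.
        (blacklist if label == "malicious" else whitelist).add(domain)
    return label, "dynamic"
```

Because confident dynamic verdicts are written back, a repeated query for the same domain is answered by a list filter on the next pass.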
The blacklist and whitelist databases are self-maintaining. During the research, high-confidence results diagnosed by the dynamic feature classifier are imported into the blacklist and whitelist databases to improve detection efficiency.
The whitelist is initialized from the top-ranked Alexa domain names; the list can be obtained by crawling sites such as top.chinaz.com and www.alexa.cn. The blacklist is initialized in two ways. First, domain names that have hosted Trojans are crawled from Trojan-hosting reporting platforms such as www.anva.org.cn/virusAddress/listBlack and untroubled.org/spam/. Second, domain names are extracted from public spam databases. Trojan hosting and spam are the main uses of botnets, so these sources cover a large number of malicious domain names.
The static feature classifier mainly performs real-time detection of malicious domain names related to Domain Flux techniques. Its detection principle rests on the assumption that good (normal) domain names are constructed according to certain statistical regularities: they are not too long, they contain few digits if any, and digits and letters generally alternate no more than twice. These regularities exist so that people can remember a domain name easily, which helps promote the website. Such names are often registered early or even traded speculatively, so they are expensive to register. A botnet, by contrast, uses domain names for connections between computers rather than for people to remember and, given the registration cost, pays no attention to these regularities. Statistical features that distinguish malicious from normal domain names can therefore be built on them.
This embodiment mainly studies the following static features:
Domain name length: for example, the length of www.163.com is 11.
Digit ratio: DigitRatio = DigitNum / length, where DigitNum is the number of digits in the FQDN.
Digit-letter switch ratio: two adjacent characters form an "adjacent character pair". If exactly one character of a pair is a digit, the pair counts as one "digit-letter switch". The feature is the ratio of the number of digit-letter switches to the total number of adjacent character pairs.
Ratio of site name length to main domain name length: SiteRatio = SiteLength / MainDomainLength, where SiteLength is the length of the site name in the FQDN and MainDomainLength is the length of the main domain name. For example, for www.163.com the site name is www, so SiteLength = 3, and the main domain name is 163, so MainDomainLength = 3.
Number of hyphens: the number of "-" characters in the FQDN.
Maximum label length: the FQDN is split on the dot "." into several strings, and the feature is the length of the longest one.
Type of country-code top-level domain: for example "cn" or "jp".
Type of international (generic) top-level domain: for example "com" or "net".
Type of second-level category domain: for example "edu" or "gov".
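The numeric features in this list can be computed directly from the FQDN string, as the sketch below illustrates. It assumes the simple site.main.tld layout of the www.163.com example: the first label is taken as the site name and the second-to-last label as the main domain name, which is not correct for multi-label suffixes such as com.cn. The function name and dictionary keys are hypothetical.

```python
def static_features(fqdn: str) -> dict:
    """Compute the lexical features described above for one FQDN."""
    name = fqdn.lower().rstrip(".")
    labels = name.split(".")

    def is_digit(ch: str) -> bool:
        return "0" <= ch <= "9"

    # Adjacent character pairs; a "switch" has exactly one digit in the pair.
    pairs = list(zip(name, name[1:]))
    switches = sum(1 for a, b in pairs if is_digit(a) != is_digit(b))

    # Assumed layout: site.main.tld (a simplification, see the lead-in).
    site_len = len(labels[0]) if len(labels) >= 3 else 0
    main_len = len(labels[-2]) if len(labels) >= 2 else len(labels[0])

    return {
        "length": len(name),
        "digit_ratio": sum(is_digit(c) for c in name) / max(len(name), 1),
        "switch_ratio": switches / max(len(pairs), 1),
        "site_ratio": site_len / max(main_len, 1),
        "hyphen_count": name.count("-"),
        "max_label_length": max(len(label) for label in labels),
        "tld": labels[-1],
    }
```

For www.163.com this gives a length of 11 and a site ratio of 1.0, matching the worked values in the list. Note that pairs such as ".1" also count as switches under the literal definition, since exactly one of their characters is a digit.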
The training samples for the static feature classifier come from the existing blacklist and whitelist. The static features above can be computed directly for each FQDN to form a training sample, and each sample is labeled according to the lists as one of two classes: normal domain name or malicious domain name. A classifier model is built from these samples with the SVM algorithm; besides predicting the class, the model also outputs a probability.
The dynamic feature classifier mainly performs real-time detection of malicious domain names related to FFSN (fast-flux service network) techniques. Its detection principle rests on the assumption that normal domain names generally fall into two kinds. In the first and most common kind, one domain name corresponds to one IP address and the TTL (how long the domain name record is cached) is generally large. In the second kind, one domain name corresponds to a fixed set of IP addresses whose physical locations are also essentially fixed; this generally occurs on high-traffic sites that use CDN technology for load balancing.
Any other pattern is likely a sign that a botnet is using a malicious domain name to keep the network's overall communication alive. For example, because botnets are fragile, C&C proxy hosts fail frequently, and the IP address behind a malicious domain name must then be replaced with that of a new C&C proxy host. Frequent IP changes and shifts in geographic distribution are therefore clear symptoms. Based on these symptoms, statistical features can be built as follows: for the same diagnosis target (domain name), the results returned by DNS requests are collected N times (empirically 20 times, at 3-hour intervals).
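The text does not enumerate the dynamic features, so the sketch below is only one plausible reading of the symptoms it names: how many distinct IP addresses were seen, how often the answer set changed between consecutive samples, how spread out the addresses are, and how large the TTL is. The feature names, the use of the first two IPv4 octets as a rough stand-in for location, and the input format are all assumptions.

```python
from statistics import mean


def dynamic_features(observations):
    """Summarize repeated DNS answers for one domain name.

    observations: list of (ttl_seconds, [ipv4_address, ...]) tuples,
    oldest first, one tuple per DNS request.
    """
    ip_sets = [set(ips) for _ttl, ips in observations]
    all_ips = set().union(*ip_sets) if ip_sets else set()
    # Number of consecutive samples whose answer set differs.
    changes = sum(1 for prev, cur in zip(ip_sets, ip_sets[1:]) if prev != cur)
    return {
        "samples": len(observations),
        "distinct_ips": len(all_ips),
        # Crude location proxy: distinct leading /16 prefixes (assumption).
        "distinct_prefixes": len({".".join(ip.split(".")[:2]) for ip in all_ips}),
        "change_rate": changes / max(len(ip_sets) - 1, 1),
        "mean_ttl": mean(ttl for ttl, _ips in observations) if observations else 0.0,
    }
```

A stable site sampled 20 times would show one distinct IP address, a change rate of 0 and a large mean TTL, whereas a fast-flux domain would show many addresses, a change rate near 1 and a short TTL.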
This embodiment further comprises a database that stores the results of the DGA detection process. The database comprises a whitelist database and a blacklist database. The whitelist database stores safe destination hosts and destination server domain names. The blacklist database stores known malicious features, and the malicious feature detection engine matches against its contents. A temporary computation database stores temporary data storage addresses and the computation results of each module.
In another aspect, a system for precise countermeasures in cyberspace in a big data environment is also disclosed which, as shown in Fig. 2, comprises:
a module for receiving and obtaining the data to be detected, configured to receive a message to be detected and obtain the communication data in the message;
a diagnosis target domain name extraction module, configured to capture DNS queries from the communication data in the message to be detected and extract the diagnosis target domain name;
a DGA diagnosis module, configured to diagnose the diagnosis target domain name with the DGA detection process;
a malicious domain name interruption module, configured to start the countermeasure mechanism if the diagnosis target domain name is judged to be a malicious domain name, and to interrupt the communication nodes of the malicious domain name through the countermeasure mechanism.
Finally, a computer storage medium is disclosed, characterized in that a computer program is stored on the computer storage medium, and when the computer program is executed by a processor it implements the steps of any one of the methods for precise countermeasures in cyberspace in a big data environment described above.
The embodiments in this specification are described progressively. Each embodiment focuses on what distinguishes it from the others, and the parts that the embodiments share can be cross-referenced. Because the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is brief; see the description of the method for the relevant details.
The above description of the disclosed embodiments enables those skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. The present invention is therefore not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211172640.1A CN115550021A (en) | 2022-09-26 | 2022-09-26 | Method and system for accurately replicating network space in big data environment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211172640.1A CN115550021A (en) | 2022-09-26 | 2022-09-26 | Method and system for accurately replicating network space in big data environment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115550021A (en) | 2022-12-30 |
Family
ID=84730073
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211172640.1A Pending CN115550021A (en) | 2022-09-26 | 2022-09-26 | Method and system for accurately replicating network space in big data environment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115550021A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180343272A1 (en) * | 2017-05-26 | 2018-11-29 | Qatar Foundation | Method to identify malicious web domain names thanks to their dynamics |
CN112866023A (en) * | 2021-01-13 | 2021-05-28 | 恒安嘉新(北京)科技股份公司 | Network detection method, model training method, device, equipment and storage medium |
CN113746952A (en) * | 2021-09-14 | 2021-12-03 | 京东科技信息技术有限公司 | DGA domain name detection method, device, electronic equipment and computer storage medium |
CN114513355A (en) * | 2022-02-14 | 2022-05-17 | 平安科技(深圳)有限公司 | Malicious domain name detection method, device, equipment and storage medium |
CN114978770A (en) * | 2022-07-25 | 2022-08-30 | 睿至科技集团有限公司 | Internet of things security risk early warning management and control method and system based on big data |
Non-Patent Citations (2)
Title |
---|
王文通;胡宁;刘波;刘欣;李树栋;: "DNS安全防护技术研究综述" * |
王林汝;吴琳;蔡冰;: "基于静态及动态特征的恶意域名检测技术研究" * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20221230 |