CN106899586A - Machine-learning-based DNS server software fingerprint identification system and method - Google Patents

Machine-learning-based DNS server software fingerprint identification system and method

Publication number
CN106899586A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710093763.9A
Other languages
Chinese (zh)
Inventor
邹福泰
周江林
裴蓓
潘理
李建华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Third Research Institute of the Ministry of Public Security
Original Assignee
Shanghai Jiaotong University
Third Research Institute of the Ministry of Public Security
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University and Third Research Institute of the Ministry of Public Security
Priority to CN201710093763.9A
Publication of CN106899586A
Current legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/20 Software design
    • G06F8/22 Procedural
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/12 Fingerprints or palmprints
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0861 Network architectures or network communication protocols for network security for authentication of entities using biometrical features, e.g. fingerprint, retina-scan

Abstract

The invention discloses a machine-learning-based DNS server software fingerprint identification system and method. The system consists of an input module, a feature extraction module (query/response → feature extraction), a decision tree classification module trained locally in advance, and an output module. The user supplies the domain name of the target DNS server to be examined as input; a background program obtains feature records from the request/response packets of the target DNS server, the pre-trained prediction model classifies them, and the system finally outputs the software version information of the target DNS server. By using machine learning to detect and identify DNS server software version information, the system and method help improve the security of DNS servers.

Description

Machine-learning-based DNS server software fingerprint identification system and method
Technical field
The present invention relates to the field of domain name system security management, and in particular to a machine-learning-based system and method for identifying DNS server software version information.
Background
The Domain Name System (DNS) is a hierarchical, distributed naming system and one of the core elements of the Internet. Its main job is to map human-memorable domain names (such as www.baidu.com) to the IP addresses used for Internet routing. The vast majority of Internet applications depend on the DNS: if it fails and cannot provide normal name resolution, a large share of Internet functions fail with it, causing inconvenience and incalculable losses to users. The security of the DNS is therefore critical and has attracted the attention and research of many experts, scholars, and operators.
Unfortunately, the DNS has always been subject to a large number of attacks, most of which stem from implementation errors, DNS protocol vulnerabilities, forged DNS query requests, and similar causes. According to the "China Domain Name Service Security Status and Situation Analysis Report (2014)" issued by the China Internet Network Information Center (CNNIC) and the national domain name security alliance, top-level domain servers in China generally run Linux or Unix operating systems, which together account for more than 98%. As for DNS service software, ISC BIND (Berkeley Internet Name Domain) remains the first choice for most top-level domain servers, with a share of 81.8%. A considerable portion of these BIND servers still run older versions, and 32.6% of them have version responses enabled, which creates a security risk for name servers. Therefore, where network security policy permits, the bugs or weaknesses of specific DNS resolution software should be accurately identified and controlled.
Network scanning and remote application detection are commonly used to gather information about a target, and fingerprinting is the technique that makes this concrete. Most existing methods for DNS server software fingerprinting are active: they send requests and judge from the features of the response packets. Some researchers have also proposed a passive fingerprinting technique, in which DNS server software version features are extracted manually through traffic analysis. These methods suffer from partial failure, heavy manual workload, or slow updating of feature information.
Therefore, those skilled in the art are working to develop a machine-learning-based DNS server software fingerprint identification system and method that uses machine learning to detect and identify DNS server software version information and thereby improves DNS server security.
Summary of the invention
In view of the shortcomings of existing DNS server software fingerprinting methods, the present invention proposes a machine-learning-based system and method for detecting and identifying DNS server software version information. The developers train a decision tree module locally in advance to serve as the core classification module. The user's target DNS server domain name is taken as input, the query/response message feature records for that domain name are extracted programmatically and fed to the decision tree module as input features, and the classification made by the decision tree module yields the version information of the target DNS server.
To remedy the above technical shortcomings, the machine-learning-based DNS server software fingerprint identification system of the present invention comprises an input module, a feature extraction module, a decision tree classification module, and an output module, connected in that order. The feature extraction module is configured to extract features from the queries sent to a specific DNS server and from its responses; the decision tree classification module is configured to be generated by local training and to produce the software fingerprint of the DNS server and the corresponding version information.
The present invention also provides a machine-learning-based DNS server software fingerprint identification method, characterized in that it comprises the following steps:
Step 1: build a DNS server software version information data set;
Step 2: convert the data set from step 1 into a feature training set by feature extraction;
Step 3: obtain a decision tree classification module from the training set of step 2;
Step 4: integrate the decision tree classification module obtained in step 3 into the identification system, which outputs the DNS server software fingerprint and software version information for the target domain name entered by the user.
Further, in step 1, building the DNS server software version information data set comprises:
(1.1) installing different versions of DNS server software on local virtual machines;
(1.2) sending DNS query packets for a domain name;
(1.3) capturing the DNS traffic with tcpdump/tshark to obtain pcap traffic files; these pcap files are the DNS server software version information data set.
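The capture in step (1.3) can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the patent: the helper name `build_capture_command`, the interface name, and the output file name are hypothetical. The `tcpdump` options `-i` (interface) and `-w` (write to a pcap file) and the filter expression `port 53` are standard.

```python
import shlex


def build_capture_command(interface, pcap_path):
    """Return a tcpdump argument list that writes DNS traffic to a pcap file.

    Both arguments are supplied by the caller; nothing here is taken from
    the patent text beyond the use of tcpdump itself.
    """
    # "port 53" is a capture filter that keeps DNS traffic (UDP and TCP).
    return ["tcpdump", "-i", interface, "-w", pcap_path, "port", "53"]


if __name__ == "__main__":
    # Hypothetical interface and file name, one capture per installed version.
    command = build_capture_command("eth0", "server-version-a.pcap")
    print(shlex.join(command))
```

The argument list can be passed to `subprocess.Popen` (capturing normally requires elevated privileges); naming one pcap file per installed software version keeps the class label attached to each capture.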
Further, in step 2, converting the data set into a feature training set comprises:
(2.1) taking the software version information data set obtained in step 1 as input and parsing the pcap traffic files with Python to obtain each query/response record;
(2.2) extracting feature fields from the content of each record and aggregating them into a feature training set for later use.
Further, in step 3, obtaining the decision tree classification module comprises:
(3.1) taking the feature training set obtained in step 2 as input and running a decision tree classification machine learning algorithm to obtain the decision tree classification module.
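Step (3.1) does not name a specific decision tree algorithm. As a minimal sketch, assuming an ID3-style tree that splits on the feature with the lowest weighted entropy, training could look like the following. The feature names, the three toy records, and the version labels are all hypothetical and only serve to make the example runnable.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def build_tree(rows, labels, features):
    """Build an ID3-style tree as nested dicts; a leaf is a class label."""
    if len(set(labels)) == 1:
        return labels[0]
    if not features:
        return Counter(labels).most_common(1)[0][0]

    def split_entropy(feature):
        # Weighted entropy of the labels after splitting on this feature.
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[feature], []).append(label)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())

    best = min(features, key=split_entropy)
    remaining = [f for f in features if f != best]
    node = {"feature": best, "children": {}}
    for value in set(row[best] for row in rows):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        node["children"][value] = build_tree(
            [r for r, _ in pairs], [l for _, l in pairs], remaining
        )
    return node


def classify(tree, row):
    """Walk from the root to a leaf; returns None for an unseen feature value."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["feature"]])
    return tree


# Hypothetical training data: one row per captured server version.
FEATURES = ["A_reply_code", "A_authoritative", "TXT_reply_code"]
TRAINING_ROWS = [
    {"A_reply_code": 0, "A_authoritative": 1, "TXT_reply_code": 0},
    {"A_reply_code": 0, "A_authoritative": 1, "TXT_reply_code": 5},
    {"A_reply_code": 0, "A_authoritative": 0, "TXT_reply_code": 4},
]
TRAINING_LABELS = ["server-x 1.0", "server-x 2.0", "server-y 1.0"]

if __name__ == "__main__":
    tree = build_tree(TRAINING_ROWS, TRAINING_LABELS, FEATURES)
    print(classify(tree, TRAINING_ROWS[1]))
```

In practice a library implementation such as scikit-learn's `DecisionTreeClassifier` could replace the hand-written tree; the pure-Python version is used here only to keep the sketch free of dependencies.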
Further, step 4 comprises:
(4.1) connecting the decision tree classification module with the input module, the query/response module, and the output module to form the complete software fingerprint identification system;
(4.2) outputting the DNS server software fingerprint and software version information for the target domain name entered by the user.
Further, step (2.1) comprises:
(2.1.1) writing a pcap file parser in Python that reads the captured binary pcap files and parses them into hexadecimal character files;
(2.1.2) separating out the application-layer data according to the pcap file format;
(2.1.3) extracting, for each message record, the values of fixed fields as features according to the header format of the DNS message.
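Steps (2.1.1) and (2.1.2) can be sketched as below. The sketch assumes the classic pcap file layout (24-byte global header, 16-byte per-record header) and Ethernet frames carrying IPv4/UDP; the function name `dns_payloads` is hypothetical, and other link types, IPv6, and DNS over TCP are deliberately ignored.

```python
import struct


def dns_payloads(pcap_bytes):
    """Yield the UDP payload (the DNS message) of each IPv4/UDP frame."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")

    offset = 24  # skip the global header
    while offset + 16 <= len(pcap_bytes):
        # Record header: ts_sec, ts_usec, captured length, original length.
        _sec, _usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", pcap_bytes, offset
        )
        frame = pcap_bytes[offset + 16 : offset + 16 + incl_len]
        offset += 16 + incl_len

        # Assumption: Ethernet II framing (14-byte header, EtherType 0x0800).
        if len(frame) < 14 + 20 + 8 or frame[12:14] != b"\x08\x00":
            continue
        ip_header_len = (frame[14] & 0x0F) * 4
        if frame[14 + 9] != 17:  # IPv4 protocol field: 17 = UDP
            continue
        udp_start = 14 + ip_header_len
        yield frame[udp_start + 8 :]  # skip the 8-byte UDP header


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        for payload in dns_payloads(handle.read()):
            print(payload.hex())  # hexadecimal character form, as in (2.1.1)
```

Calling `bytes.hex()` on each payload gives the hexadecimal character representation mentioned in step (2.1.1); the raw bytes are what a header parser would consume in step (2.1.3).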
Further, in step (2.1.3), the fixed fields include Opcode, Authoritative, Recursion available, and Reply code.
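The four named fields all live in the 16-bit flags word of the 12-byte DNS message header, so their extraction can be sketched with `struct`. The function name `header_features` and the dictionary keys are hypothetical; the bit positions follow the standard DNS header layout (QR, Opcode, AA, TC, RD, RA, Z, RCODE).

```python
import struct


def header_features(dns_message):
    """Extract the four header fields named in the text from a raw DNS message."""
    # Header: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (network byte order).
    _msg_id, flags, _qd, _an, _ns, _ar = struct.unpack("!HHHHHH", dns_message[:12])
    return {
        "opcode": (flags >> 11) & 0xF,            # 4-bit Opcode
        "authoritative": (flags >> 10) & 0x1,     # AA bit
        "recursion_available": (flags >> 7) & 0x1,  # RA bit
        "reply_code": flags & 0xF,                # 4-bit RCODE
    }


if __name__ == "__main__":
    # Flags 0x8403: response, authoritative, reply code 3.
    sample = struct.pack("!HHHHHH", 0x1234, 0x8403, 1, 0, 0, 0)
    print(header_features(sample))
```

Only the fields that the text names explicitly are extracted here; any further header fields would be added to the dictionary in the same way.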
Further, for requests for the same domain name, the features of the response packets to the different query types are concatenated horizontally; one valid training set record is obtained only once the response features for all query types have been concatenated.
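The horizontal concatenation can be sketched as follows, assuming each response has already been reduced to a dictionary of header fields. The names `QUERY_TYPES`, `FIELDS`, and `splice_features` are hypothetical, and the four query types are just the ones the description gives as examples.

```python
# Fixed column order: every record lists the same fields for the same
# query types, so position i always means the same (type, field) pair.
QUERY_TYPES = ["A", "AAAA", "NS", "PTR"]
FIELDS = ["opcode", "authoritative", "recursion_available", "reply_code"]


def splice_features(per_type_features):
    """Concatenate per-query-type features into one flat record.

    Returns None while any query type is still missing, mirroring the rule
    that a record is only valid once all types have been concatenated.
    """
    if any(qtype not in per_type_features for qtype in QUERY_TYPES):
        return None
    record = []
    for qtype in QUERY_TYPES:
        record.extend(per_type_features[qtype][field] for field in FIELDS)
    return record


if __name__ == "__main__":
    responses = {
        qtype: {"opcode": 0, "authoritative": 1,
                "recursion_available": 0, "reply_code": 0}
        for qtype in QUERY_TYPES
    }
    print(splice_features(responses))
```

Keeping the column order fixed is what lets records from different servers be compared feature by feature during training.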
Technical content of the invention: a machine-learning-based method for DNS server software fingerprint identification, mainly comprising the following steps:
1. Building the classifier module: for different versions of DNS server software, query/response traffic is captured. A pcap file parser written in Python reads the captured binary pcap files and parses them into hexadecimal character files, and the application-layer (i.e. DNS protocol layer) data is separated out according to the pcap file format. Then, following the header format of the DNS message, the values of 11 fixed fields in total, including Opcode, Authoritative, Recursion available, and Reply code, are extracted from each message record as features. For requests for the same domain name, the response packet features of the different query types are concatenated horizontally, so that one valid training set record is obtained once the features for all types have been concatenated. The responses of other DNS resolution software versions and other domain names are processed in the same way, yielding a complete training data set. A decision tree classification algorithm is run on this training set to obtain the decision tree classifier module.
2. Programmatic feature extraction: for the target DNS domain name entered by the user, following the preliminary procedure used to build the classifier model in step 1, DNS query requests of a series of types are issued programmatically, the related traffic is captured, and the program automatically extracts the features into a record that forms the sample input to the decision tree classifier module.
3. Machine learning: the features of the query/response messages for the user's target DNS domain name, extracted programmatically as in step 2, are fed into the pre-built classifier module. Based on the feature patterns learned in advance, the classifier assigns the input to one of the existing classes and outputs the result to the user interface.
Brief description of the drawings
Fig. 1 is a flow chart of the machine-learning-based DNS server software fingerprint identification method of the present invention;
Fig. 2 is a partial structure diagram of the decision tree classifier module of the present invention.
Detailed description of the embodiments
Referring to Fig. 1, the machine-learning-based DNS server software fingerprint identification method comprises the following steps:
1. Reading in the target DNS server domain name: the user enters the target DNS server domain name into the system through the input interface.
2. Extracting features for the target DNS server domain name:
2.1 The system program automatically sends the target DNS server domain name query request messages whose Type covers up to dozens of kinds, such as A, AAAA, NS, and PTR.
2.2 While the system program sends the query packets, a traffic capture program running in the background captures the query request/response packets of step 2.1 and saves them as a pcap file.
2.3 For the pcap file obtained in step 2.2, a purpose-written program follows the DNS message format and extracts from each message record the values of 11 fixed fields in total, including Opcode, Authoritative, Recursion available, and Reply code, as features. For requests for the same domain name, the response packet features of the different query types are concatenated horizontally; one valid feature record is obtained once the features for all types have been concatenated.
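The query messages of step 2.1 can be sketched as below. The message layout (12-byte header, length-prefixed labels, QTYPE, QCLASS) is the standard DNS wire format; the function names, the fixed query ID, the timeout, and the choice of setting the recursion-desired flag are assumptions made for illustration.

```python
import socket
import struct

# Standard DNS type codes for the four example types named in step 2.1.
QTYPE_CODES = {"A": 1, "NS": 2, "PTR": 12, "AAAA": 28}


def build_query(domain, qtype, query_id=0x1234):
    """Build a single-question DNS query message as raw bytes."""
    # Flags 0x0100: standard query with the recursion-desired bit set.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", QTYPE_CODES[qtype], 1)  # class IN


def send_query(server_ip, domain, qtype, timeout=2.0):
    """Send one query over UDP port 53 and return the raw response bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(domain, qtype), (server_ip, 53))
        response, _addr = sock.recvfrom(4096)
        return response


if __name__ == "__main__":
    for name in QTYPE_CODES:
        print(name, build_query("example.com", name).hex())
```

Iterating `send_query` over all keys of `QTYPE_CODES` while a capture runs in the background corresponds to steps 2.1 and 2.2; `socket.timeout` from an unresponsive server would need to be handled by the caller.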
3. Machine-learning-based classification: with the decision tree classifier structured as shown in Fig. 2, the sample feature data obtained in step 2 is fed into the pre-built decision tree classifier module. The sample enters at the root of the decision tree; at each node, a test on one attribute of the sample selects a branch, and the sample descends in this way until it reaches a leaf node. Each leaf node is one DNS server software version, which gives the class of the input sample.
4. The result obtained in step 3 is formatted by the output module and output to the user interface, completing the software fingerprint identification of the target DNS server.
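The root-to-leaf descent of step 3 and the formatted output of step 4 can be sketched as follows. The tree below is a hypothetical stand-in, not the structure shown in Fig. 2, and the feature names and version labels are invented for illustration.

```python
# Hypothetical tree: inner nodes test one feature, leaves are version labels.
EXAMPLE_TREE = {
    "feature": "TXT_reply_code",
    "children": {
        0: {
            "feature": "A_authoritative",
            "children": {1: "software-X v2", 0: "software-X v1"},
        },
        4: "software-Y v1",
    },
}


def classify(tree, sample):
    """Descend from the root to a leaf, recording each attribute test."""
    path = []
    node = tree
    while isinstance(node, dict):
        value = sample[node["feature"]]
        path.append((node["feature"], value))
        # An attribute value never seen in training has no branch to follow.
        node = node["children"].get(value, "unknown")
    return node, path


def format_result(domain, label, path):
    """Render the classification as a line of text for the user interface."""
    tests = ", ".join(f"{name}={value}" for name, value in path)
    return f"{domain}: {label} (matched on {tests})"


if __name__ == "__main__":
    sample = {"TXT_reply_code": 0, "A_authoritative": 1}
    label, path = classify(EXAMPLE_TREE, sample)
    print(format_result("example.com", label, path))
```

Returning the path alongside the label is a design choice of this sketch: it lets the output show which attribute tests led to the reported version.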
The preferred embodiments of the present invention have been described in detail above. It should be understood that a person of ordinary skill in the art can make many modifications and variations according to the concept of the present invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain through logical analysis, reasoning, or limited experimentation on the basis of the prior art under the concept of the present invention shall fall within the scope of protection defined by the claims.

Claims (9)

1. A machine-learning-based DNS server software fingerprint identification system, characterized in that it comprises an input module, a feature extraction module, a decision tree classification module, and an output module, wherein the input module, feature extraction module, decision tree classification module, and output module are connected in sequence; the feature extraction module is configured to extract features from the queries sent to a specific DNS server and from its responses; and the decision tree classification module is configured to be generated by local training and to produce the software fingerprint of the DNS server and the corresponding version information.
2. A machine-learning-based DNS server software fingerprint identification method, characterized in that it comprises the following steps:
Step 1: build a DNS server software version information data set;
Step 2: convert the data set from step 1 into a feature training set by feature extraction;
Step 3: obtain a decision tree classification module from the training set of step 2;
Step 4: integrate the decision tree classification module obtained in step 3 into the identification system, which outputs the DNS server software fingerprint and software version information for the target domain name entered by the user.
3. The machine-learning-based DNS server software fingerprint identification method according to claim 2, characterized in that, in step 1, building the DNS server software version information data set comprises:
(1.1) installing different versions of DNS server software on local virtual machines;
(1.2) sending DNS query packets for a domain name;
(1.3) capturing the DNS traffic with tcpdump/tshark to obtain pcap traffic files; these pcap files are the DNS server software version information data set.
4. The machine-learning-based DNS server software fingerprint identification method according to claim 3, characterized in that, in step 2, converting the data set into a feature training set comprises:
(2.1) taking the software version information data set obtained in step 1 as input and parsing the pcap traffic files with Python to obtain each query/response record;
(2.2) extracting feature fields from the content of each record and aggregating them into a feature training set for later use.
5. The machine-learning-based DNS server software fingerprint identification method according to claim 4, characterized in that, in step 3, obtaining the decision tree classification module comprises:
(3.1) taking the feature training set obtained in step 2 as input and running a decision tree classification machine learning algorithm to obtain the decision tree classification module.
6. The machine-learning-based DNS server software fingerprint identification method according to claim 5, characterized in that step 4 comprises:
(4.1) connecting the decision tree classification module with the input module, the query/response module, and the output module to form the complete software fingerprint identification system;
(4.2) outputting the DNS server software fingerprint and software version information for the target domain name entered by the user.
7. The machine-learning-based DNS server software fingerprint identification method according to claim 4, characterized in that step (2.1) comprises:
(2.1.1) writing a pcap file parser in Python that reads the captured binary pcap files and parses them into hexadecimal character files;
(2.1.2) separating out the application-layer data according to the pcap file format;
(2.1.3) extracting, for each message record, the values of fixed fields as features according to the header format of the DNS message.
8. The machine-learning-based DNS server software fingerprint identification method according to claim 7, characterized in that, in step (2.1.3), the fixed fields include Opcode, Authoritative, Recursion available, and Reply code.
9. The machine-learning-based DNS server software fingerprint identification method according to claim 7, characterized in that, for requests for the same domain name, the features of the response packets to the different query types are concatenated horizontally, and one valid training set record is obtained only once the response features for all query types have been concatenated.
CN201710093763.9A 2017-02-21 2017-02-21 A kind of dns server software fingerprinting identifying system and method based on machine learning Pending CN106899586A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710093763.9A CN106899586A (en) 2017-02-21 2017-02-21 A kind of dns server software fingerprinting identifying system and method based on machine learning

Publications (1)

Publication Number Publication Date
CN106899586A true CN106899586A (en) 2017-06-27

Family

ID=59184204

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710093763.9A Pending CN106899586A (en) 2017-02-21 2017-02-21 A kind of dns server software fingerprinting identifying system and method based on machine learning

Country Status (1)

Country Link
CN (1) CN106899586A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080189326A1 (en) * 2007-02-01 2008-08-07 Microsoft Corporation Dynamic Software Fingerprinting
CN102214213A (en) * 2011-05-31 2011-10-12 中国科学院计算技术研究所 Method and system for classifying data by adopting decision tree
US20150047051A1 (en) * 2013-08-06 2015-02-12 Sap Ag Managing Access to Secured Content
CN105915555A (en) * 2016-06-29 2016-08-31 北京奇虎科技有限公司 Method and system for detecting network anomalous behavior
CN106372513A (en) * 2016-08-25 2017-02-01 北京知道未来信息技术有限公司 Software fingerprint database-based software identification method and apparatus

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107995226A (en) * 2017-12-27 2018-05-04 山东华软金盾软件股份有限公司 A kind of device-fingerprint recognition methods based on passive flux
WO2020150880A1 (en) * 2019-01-22 2020-07-30 道里云信息技术(北京)有限公司 Publicly verifiable compressed fingerprints and an application in securing domain name systems
CN110166522A (en) * 2019-04-01 2019-08-23 腾讯科技(深圳)有限公司 Server identification method, device, readable storage medium storing program for executing and computer equipment
CN110166522B (en) * 2019-04-01 2021-08-24 腾讯科技(深圳)有限公司 Server identification method and device, readable storage medium and computer equipment
CN110198309A (en) * 2019-05-14 2019-09-03 北京墨云科技有限公司 A kind of Web server recognition methods, device, terminal and storage medium
CN110602041A (en) * 2019-08-05 2019-12-20 中国人民解放军战略支援部队信息工程大学 White list-based Internet of things equipment identification method and device and network architecture
CN112788159A (en) * 2020-12-31 2021-05-11 山西三友和智慧信息技术股份有限公司 Webpage fingerprint tracking method based on DNS traffic and KNN algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170627
