CN110912910A - DNS network data filtering method and device - Google Patents

Info

Publication number
CN110912910A
Authority
CN
China
Prior art keywords
data
dns
classifier
weight ratio
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911197902.8A
Other languages
Chinese (zh)
Inventor
鄂新华
张思洋
潘恬
刘江
杨帆
黄韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-11-29
Filing date
2019-11-29
Publication date
2020-03-24
Application filed by Beijing University of Technology
Priority to CN201911197902.8A
Publication of CN110912910A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/02 Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227 Filtering policies
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • H04L63/101 Access control lists [ACL]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/259 Fusion by voting

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a method and a device for filtering DNS network data. The method comprises the following steps: (1) collecting network data in a DNS server through a measurement collector; (2) extracting feature values of the DNS data; (3) processing the data by labeling each collected record according to the selected feature values to obtain its feature vector and form the corresponding feature matrix; (4) classifying the data with a plurality of classifiers simultaneously; (5) voting according to the weight ratios of the classifiers to determine the final classification; and (6) filtering malicious network activity according to the classification result.

Description

DNS network data filtering method and device
Technical Field
The invention relates to the technical field of network communication, in particular to a DNS network data filtering method and device.
Background
The DNS (Domain Name System) is a network system that translates domain names into their corresponding IP addresses. A DNS server stores a table of domain names and their corresponding IP addresses and uses it to resolve the domain name carried in a message. After a domain name has been registered and hosting services have been purchased, the domain name must be resolved to the purchased host before the website content can be viewed. At present, DNS networks suffer from domain name resolution security problems.
Disclosure of Invention
The invention aims to provide a DNS network data filtering device and a DNS network data filtering method that address the security problem of the domain name resolution system. The system classifies DNS network data and filters out malicious network data, thereby improving the security of the domain name resolution system.
A DNS network data filtering method comprises the following steps:
collecting network data in a DNS server through a measurement collector;
extracting a characteristic value of DNS data, calculating characteristic influence weight, and configuring parameters;
processing the data according to the characteristic value;
performing data classification by a plurality of classifiers simultaneously;
voting according to the weight ratio of the classifier to determine final classification;
and filtering the malicious network activities according to the classification result.
Preferably, the network data includes: a timestamp, DNS client IP, client port number, DNS server IP, DNS message header ID, resource record type, request URL, request type, response IP, TTL and hop count.
Preferably, constructing the plurality of classifiers includes:
during training of the plurality of classifiers, sampling is adopted as follows: p samples are drawn from the training set with replacement, k features are drawn from the feature set without replacement, and a plurality of different classifiers are generated according to the calculated feature value influence weights.
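The sampling scheme described here (p samples with replacement, k features without replacement) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the uniform sampling, the fixed seed, and the feature names are all assumptions, and the source's feature value influence weights are not modeled.

```python
import random

def draw_training_subsets(training_set, feature_names, p, k, n_classifiers, seed=0):
    """Return one (samples, features) pair per classifier to be trained."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    subsets = []
    for _ in range(n_classifiers):
        # p samples drawn WITH replacement: the same record may be picked twice.
        samples = rng.choices(training_set, k=p)
        # k features drawn WITHOUT replacement: each feature is picked at most once.
        features = rng.sample(feature_names, k)
        subsets.append((samples, features))
    return subsets

# Hypothetical inputs: ten record ids and four feature names.
training_set = list(range(10))
feature_names = ["sld_length", "digit_count", "letter_count", "query_time"]
subsets = draw_training_subsets(training_set, feature_names, p=6, k=2, n_classifiers=3)
```

Because every classifier sees a different sample and feature subset, the resulting classifiers differ from one another, which is what makes the later weighted vote informative.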
Preferably, the processing the data according to the feature value includes:
and processing each piece of collected data according to the selected characteristic value to obtain a related characteristic vector and form a corresponding characteristic matrix.
Preferably, the classifiers classify the data simultaneously; the weight ratio of each classifier is obtained by computing, through an algorithm, the posterior probability of that classifier's classification result, and the weight ratios are assigned on the principle that correctness takes priority while fairness is taken into account.
Preferably, data that the vote identifies as malicious is intercepted and added to a blacklist.
A DNS network data filtering device comprises a data acquisition module, a data processing module, a parameter configuration module, a classification module and a decision module; wherein,
the data processing module compares data against a preset blacklist and processes the data according to the selected feature values; the parameter configuration module extracts feature values, calculates the influence weights of the feature values and configures parameters; the classification module consists of a plurality of parallel classifiers and classifies the data; the decision module calculates the weight ratio of each classifier and votes on the classifiers' results according to the weight ratios to produce the final result.
Preferably, the preset blacklist consists of malicious network activity data that the system has acquired in advance together with data identified as malicious while the filtering system runs. Malicious network activity data identified by the filtering system can be added to the blacklist.
Preferably, the decision module obtains the weight ratio of each classifier by computing, through an algorithm, the posterior probability of that classifier's classification result, and assigns the weight ratios on the principle that correctness takes priority while fairness is taken into account.
Preferably, the selection of feature values comprises extracting static features and dynamic features of the data, wherein the static features include the length of the second-level domain name, the structure of Chinese characters in the domain name, the number of digits in the domain name, the number of letters in the domain name and the like; the dynamic features include the query time of each querier.
The invention provides a DNS network data filtering system which can classify DNS network data and filter malicious network data, thereby improving the security of the domain name resolution system.
Drawings
FIG. 1 shows a flow diagram of a DNS data filtering method in accordance with an embodiment of the invention;
FIG. 2 illustrates a block diagram of a DNS data filtering device in accordance with an embodiment of the present invention;
FIG. 3 shows an organizational flow chart of a DNS data filtering method according to an embodiment of the present invention.
Detailed Description
The following is a detailed description of embodiments of the invention, illustrated in the accompanying drawings in which like or similar reference numerals refer to the same or similar components or components having the same or similar functions throughout the several views. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of illustrating the present invention and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As shown in fig. 1, a DNS data filtering method according to an embodiment of the present invention includes the following steps:
Step 101: collect network data in the DNS server through a measurement collector.
Step 102: extract feature values of the DNS data, calculate the feature influence weights, and configure parameters.
Step 103: process the data according to the feature values.
Step 104: classify the data with a plurality of classifiers simultaneously.
Step 105: vote according to the weight ratios of the classifiers to determine the final classification.
Step 106: filter malicious network activity according to the classification result.
In step 101, the network data collected in the DNS network is as follows:
The network data mainly comes from DNS request data cached in the DNS server.
The network data includes, but is not limited to, a timestamp, a DNS client IP, a client port number, a DNS server IP, a DNS packet header ID, a resource record type, a request URL, a request type, a reply IP, a TTL, a hop count, and the like.
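One way to hold a collected record with the fields listed above is a small typed container. The sketch below is illustrative only: the class name, field names, field types, and the example values (which use documentation-reserved addresses and domains) are assumptions, since the source names the fields but not their representation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DnsRecord:
    """One collected DNS request/response record (field types are assumed)."""
    timestamp: float      # seconds since the epoch
    client_ip: str        # DNS client IP
    client_port: int      # client port number
    server_ip: str        # DNS server IP
    header_id: int        # DNS packet header ID
    record_type: str      # resource record type, e.g. "A"
    request_url: str      # requested name
    request_type: str     # request type
    reply_ip: str         # reply IP
    ttl: int              # time to live, in seconds
    hop_count: int        # hop count

# Hypothetical example record.
example = DnsRecord(
    timestamp=1575000000.0, client_ip="192.0.2.10", client_port=53124,
    server_ip="192.0.2.53", header_id=0x1A2B, record_type="A",
    request_url="www.example.com", request_type="query",
    reply_ip="198.51.100.7", ttl=300, hop_count=12,
)
```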
In step 102, the selection of feature values is as follows:
The selection of feature values includes, but is not limited to, extracting static features and dynamic features of the data. The static features include, but are not limited to, the length of the second-level domain name, the structure of Chinese characters in the domain name, the number of digits in the domain name, the number of letters in the domain name and the like; the dynamic features include, but are not limited to, the query time of each querier.
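Three of the static features named above can be computed directly from the domain string. The following is a rough sketch under stated assumptions: the function and key names are hypothetical, the second-level domain is taken naively as the label immediately left of the last label (so multi-part suffixes are not handled), and only ASCII digits and letters are counted. The Chinese-character structure feature and the dynamic features are not covered.

```python
import string

def static_features(domain):
    """Compute a few static features of a domain name (illustrative only)."""
    labels = domain.strip(".").lower().split(".")
    # Assumption: the second-level domain is the label just left of the last label.
    sld = labels[-2] if len(labels) >= 2 else labels[0]
    return {
        "sld_length": len(sld),
        "digit_count": sum(ch in string.digits for ch in domain),
        "letter_count": sum(ch in string.ascii_letters for ch in domain),
    }

features = static_features("www.a1b2c3.com")
```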
In step 103, processing the data according to the feature value includes:
and processing each piece of collected data according to the selected characteristic value to obtain a related characteristic vector and form a corresponding characteristic matrix.
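Forming the feature matrix from per-record feature vectors might look like the sketch below. The function name, the dictionary input format, and the column names are assumptions made for illustration; the source only states that each record yields a feature vector and that the vectors form a matrix.

```python
def build_feature_matrix(feature_dicts, feature_names):
    """Turn per-record feature dicts into a matrix with one row per record."""
    matrix = []
    for features in feature_dicts:
        # One feature vector per record, with columns in a fixed order.
        matrix.append([float(features[name]) for name in feature_names])
    return matrix

# Hypothetical extracted features for two records.
column_order = ["sld_length", "digit_count", "letter_count"]
extracted = [
    {"sld_length": 7, "digit_count": 0, "letter_count": 13},
    {"sld_length": 6, "digit_count": 3, "letter_count": 9},
]
matrix = build_feature_matrix(extracted, column_order)
```

Fixing the column order matters because every classifier must see the same feature in the same position.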
In step 104, the plurality of classifiers is obtained as follows:
a plurality of different classifiers are generated according to the calculated feature value influence weights.
In step 105, the decision module operates as follows:
the weight ratio of each classifier is obtained by computing, through an algorithm, the posterior probability of that classifier's classification result, and the weight ratios are assigned on the principle that correctness takes priority while fairness is taken into account.
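The source does not name the algorithm used to turn posterior probabilities into weight ratios, so the following is only one plausible reading. It assumes a labeled validation set, estimates each classifier's probability of being correct as the posterior mean under a uniform prior (Laplace smoothing), and blends the accuracy-proportional weight with an equal share controlled by a hypothetical `fairness` parameter. All names and the blending rule are assumptions.

```python
def classifier_weights(validation_predictions, true_labels, fairness=0.2):
    """Weight each classifier by its smoothed accuracy, blended with an equal share."""
    n = len(validation_predictions)
    estimates = []
    for preds in validation_predictions:
        correct = sum(p == t for p, t in zip(preds, true_labels))
        # Posterior mean of P(correct) under a uniform prior (Laplace smoothing).
        estimates.append((correct + 1) / (len(true_labels) + 2))
    total = sum(estimates)
    # Correctness dominates; the uniform term keeps every classifier's vote non-zero.
    return [(1 - fairness) * e / total + fairness / n for e in estimates]

def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the most weight."""
    scores = {}
    for label, weight in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)

# Hypothetical validation results for three classifiers.
truth = ["malicious", "benign", "malicious", "benign"]
validation = [
    ["malicious", "benign", "malicious", "benign"],   # 4 of 4 correct
    ["malicious", "benign", "malicious", "malicious"],  # 3 of 4 correct
    ["benign", "benign", "benign", "malicious"],      # 1 of 4 correct
]
weights = classifier_weights(validation, truth)
decision = weighted_vote(["malicious", "benign", "benign"], weights)
```

With these inputs the two weaker classifiers together outweigh the strongest one, so the vote follows them; raising `fairness` strengthens that effect, and setting it to 0 makes the weights purely accuracy-proportional.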
Fig. 2 is a structural diagram of a DNS data filtering device according to an embodiment of the present invention. The device is suited to network data analysis in a domain name resolution system. First, a measurement collector (the DNS server data collection module in the figure) collects data on a server of the domain name resolution system. The parameter configuration module extracts the feature values of the data, calculates the influence weights of the feature values, and configures the related parameters. After the data processing module obtains the DNS network data, it first compares the data against the blacklist preset in the system and filters out any DNS network data that matches a blacklist entry; it then processes the remaining data according to the feature values and passes the resulting feature vectors to the classification module. The classification module generates a plurality of different classifiers according to the calculated feature value influence weights; all classifiers classify the data simultaneously and pass their classification results to the decision module. The decision module computes, through an algorithm, the posterior probability of each classifier's classification result to obtain the weight ratio of each classifier, with the weight ratios assigned on the principle that correctness takes priority while fairness is taken into account. The results of the classifiers are then voted on according to these weights to produce the final result, and the malicious network data in the classification result is filtered out and added to the preset blacklist.
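The data flow through the modules (blacklist comparison, parallel classification, weighted vote, blacklist update) can be sketched end to end as below. The classifiers here are toy rules invented for the example, the weights are arbitrary, and the majority-of-weight threshold is an assumption; none of these come from the source.

```python
def filter_dns_records(domains, blacklist, classifiers, weights):
    """Drop blacklisted or vote-malicious domains; return the ones that pass."""
    passed = []
    for domain in domains:
        if domain in blacklist:
            continue  # filtered by the preset blacklist comparison
        votes = [classify(domain) for classify in classifiers]
        malicious_weight = sum(w for v, w in zip(votes, weights) if v == "malicious")
        if malicious_weight > sum(weights) / 2:
            blacklist.add(domain)  # newly detected entries extend the blacklist
        else:
            passed.append(domain)
    return passed

# Toy classifiers (hypothetical rules, for illustration only).
def long_name(domain):
    return "malicious" if len(domain) > 20 else "benign"

def many_digits(domain):
    return "malicious" if sum(ch.isdigit() for ch in domain) >= 4 else "benign"

def has_hyphen(domain):
    return "malicious" if "-" in domain else "benign"

blacklist = {"bad.example.org"}
domains = ["example.com", "x9f3k2q8z7.example.net", "bad.example.org", "example.com"]
passed = filter_dns_records(
    domains, blacklist, [long_name, many_digits, has_hyphen], [0.5, 0.3, 0.2]
)
```

Because the blacklist is updated in place, a domain voted malicious once is rejected by the cheap set lookup on every later appearance without being classified again.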
Fig. 3 is an organization flowchart of a DNS data filtering method according to an embodiment of the present invention.
First, a measurement collector collects data on a server of the domain name resolution system. Feature values are extracted from the collected data, the influence weights of the feature values are calculated, and the data is processed according to the extracted feature values to obtain the corresponding feature vectors and form the corresponding feature matrix. The parameters are then configured according to the influence weights of the feature values. A plurality of different classifiers are generated according to the calculated feature value influence weights, the data is input into the classifiers, and each classifier classifies the data. The posterior probability of each classifier's classification result is computed through an algorithm to obtain the weight ratio of each classifier, with the weight ratios assigned on the principle that correctness takes priority while fairness is taken into account. The results of the classifiers are voted on according to these weights to produce the final result, and the malicious network data in the classification result is filtered out and added to the preset blacklist.
With the technical scheme provided by the invention, the system can classify DNS network data and filter malicious network data, thereby improving the security of the domain name resolution system.
Those skilled in the art will appreciate that the present invention may be directed to an apparatus for performing one or more of the operations described in the present application. The apparatus may be specially designed and constructed for the required purposes, or it may comprise any known apparatus in a general purpose computer selectively activated or reconfigured by a program stored in the general purpose computer.
It will be understood by those within the art that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the methods specified in the block or blocks of the block diagrams and/or flowchart block or blocks.
Those of skill in the art will appreciate that various operations, methods, steps in the processes, acts, or solutions discussed in the present application may be alternated, modified, combined, or deleted. Further, various operations, methods, steps in the flows, which have been discussed in the present application, may be interchanged, modified, rearranged, decomposed, combined, or eliminated. Further, steps, measures, schemes in the various operations, methods, procedures disclosed in the prior art and the present invention can also be alternated, changed, rearranged, decomposed, combined, or deleted.
The foregoing is only a partial embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A DNS network data filtering method is characterized by comprising the following steps:
collecting network data in a DNS server through a measurement collector;
extracting a characteristic value of DNS data, calculating characteristic influence weight, and configuring parameters;
processing the data according to the characteristic value;
performing data classification by a plurality of classifiers simultaneously;
voting according to the weight ratio of the classifier to determine final classification;
and filtering the malicious network activities according to the classification result.
2. The method of claim 1, wherein the network data comprises: a timestamp, DNS client IP, client port number, DNS server IP, DNS message header ID, resource record type, request URL, request type, response IP, TTL and hop count.
3. The method of claim 1, wherein constructing a plurality of classifiers comprises:
during training of the plurality of classifiers, sampling is adopted as follows: p samples are drawn from the training set with replacement, k features are drawn from the feature set without replacement, and a plurality of different classifiers are generated according to the calculated feature value influence weights.
4. The method of claim 1, wherein processing data according to the feature values comprises:
and processing each piece of collected data according to the selected characteristic value to obtain a related characteristic vector and form a corresponding characteristic matrix.
5. The method of claim 1, wherein the classifiers classify the data simultaneously, the weight ratio of each classifier is obtained by computing, through an algorithm, the posterior probability of that classifier's classification result, and the weight ratios are assigned on the principle that correctness takes priority while fairness is taken into account.
6. The method of claim 1, wherein data that the vote identifies as malicious is intercepted and added to a blacklist.
7. A DNS network data filtering device comprises a data acquisition module, a data processing module, a parameter configuration module, a classification module and a decision module; wherein,
the data processing module compares data against a preset blacklist and processes the data according to the selected feature values; the parameter configuration module extracts feature values, calculates the influence weights of the feature values and configures parameters; the classification module consists of a plurality of parallel classifiers and classifies the data; the decision module calculates the weight ratio of each classifier and votes on the classifiers' results according to the weight ratios to produce the final result.
8. The DNS network data filtering apparatus according to claim 7, wherein:
the preset blacklist consists of malicious network activity data that the system has acquired in advance together with data acquired while the filtering system runs; malicious network activity data identified by the filtering system can be added to the blacklist.
9. The DNS network data filtering apparatus according to claim 7, wherein:
the decision module obtains the weight ratio of each classifier by computing, through an algorithm, the posterior probability of that classifier's classification result, and assigns the weight ratios on the principle that correctness takes priority while fairness is taken into account.
10. The DNS network data filtering apparatus according to claim 7, wherein:
the selection of feature values comprises extracting static features and dynamic features of the data, wherein the static features include the length of the second-level domain name, the structure of Chinese characters in the domain name, the number of digits in the domain name, the number of letters in the domain name and the like; the dynamic features include the query time of each querier.
CN201911197902.8A 2019-11-29 2019-11-29 DNS network data filtering method and device Pending CN110912910A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911197902.8A CN110912910A (en) 2019-11-29 2019-11-29 DNS network data filtering method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911197902.8A CN110912910A (en) 2019-11-29 2019-11-29 DNS network data filtering method and device

Publications (1)

Publication Number Publication Date
CN110912910A (en) 2020-03-24

Family

ID=69820401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911197902.8A Pending CN110912910A (en) 2019-11-29 2019-11-29 DNS network data filtering method and device

Country Status (1)

Country Link
CN (1) CN110912910A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103001825A (en) * 2012-11-15 2013-03-27 中国科学院计算机网络信息中心 Method and system for detecting DNS (domain name system) traffic abnormality
CN105184316A (en) * 2015-08-28 2015-12-23 国网智能电网研究院 Support vector machine power grid business classification method based on feature weight learning
US20160294859A1 (en) * 2015-03-30 2016-10-06 Electronics And Telecommunications Research Institute Apparatus and method for detecting malicious domain cluster
CN107786575A (en) * 2017-11-11 2018-03-09 北京信息科技大学 A kind of adaptive malice domain name detection method based on DNS flows
CN108777674A (en) * 2018-04-24 2018-11-09 东南大学 A kind of detection method for phishing site based on multi-feature fusion
CN108965245A (en) * 2018-05-31 2018-12-07 国家计算机网络与信息安全管理中心 Detection method for phishing site and system based on the more disaggregated models of adaptive isomery
CN110266647A (en) * 2019-05-22 2019-09-20 北京金睛云华科技有限公司 It is a kind of to order and control communication check method and system
CN110417810A (en) * 2019-08-20 2019-11-05 西安电子科技大学 The malice for the enhancing model that logic-based returns encrypts flow rate testing methods

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200324