Summary of the Invention
This application provides a DNS traffic analysis method that aims to reduce the complexity of DNS traffic analysis and to give a clearer picture of DNS traffic conditions.
The DNS traffic analysis method provided by this application includes:
A. collecting packet information from the network in real time;
B. performing DNS traffic preprocessing on the collected packet information and extracting the required DNS traffic information from it;
C. performing DNS analysis on the extracted DNS traffic information to obtain a DNS domain name-domain name relation set and a DNS domain name-IP relation set;
D. performing a merge operation on the DNS domain name-domain name relation set and the DNS domain name-IP relation set to obtain a DNS domain name-IP relation resource table.
Preferably, step C includes:
for each piece of DNS traffic information, obtaining the query domain name and the returned domain names it contains, and forming one DNS domain name-domain name relation pair from the query domain name and each returned domain name; the DNS domain name-domain name relation pairs of all pieces of DNS traffic information form the DNS domain name-domain name relation set;
for each piece of DNS traffic information, obtaining the query domain name and the server response IP addresses it contains, and forming one DNS domain name-IP relation pair from the query domain name and each server response IP address; the DNS domain name-IP relation pairs of all pieces of DNS traffic information form the DNS domain name-IP relation set.
Preferably, step D includes:
D1. performing a merge operation on the DNS domain name-domain name relation set to obtain a DNS domain name-domain name relation dictionary;
D2. performing a merge operation on the DNS domain name-IP relation set to obtain a DNS domain name-IP relation dictionary;
D3. integrating the DNS domain name-domain name relation dictionary and the DNS domain name-IP relation dictionary to obtain the DNS domain name-IP relation resource table.
Preferably, step D1 includes:
D11. inputting the 1st DNS domain name-domain name relation pair in the DNS domain name-domain name relation set, denoted (A1, B1);
if A1 != B1, generating the domain name set C1 = {A1, B1};
otherwise, if A1 == B1, generating the domain name set C1 = {A1};
D12. inputting the i-th (i >= 2) DNS domain name-domain name relation pair in the DNS domain name-domain name relation set, denoted (Ai, Bi);
if Ai != Bi, comparing Ai and Bi respectively with the elements of the existing domain name sets:
a) if an existing domain name set Cm contains Ai, a set Cn contains Bi, and m == n, performing step D13;
b) if an existing domain name set Cm contains Ai, a set Cn contains Bi, and m != n, merging Cm and Cn by adding the elements of Cn to Cm and deleting Cn;
c) if an existing domain name set Cm contains Ai and Bi is not contained in any existing domain name set, adding Bi to Cm; alternatively, if an existing domain name set Cm contains Bi and Ai is not contained in any existing domain name set, adding Ai to Cm;
d) if neither Ai nor Bi is contained in any existing domain name set, generating the domain name set Ci = {Ai, Bi};
if Ai == Bi, comparing Ai with the elements of the existing domain name sets:
a) if Ai is not contained in any existing domain name set, generating the domain name set Ci = {Ai};
b) if Ai is contained in a domain name set Cm, performing step D13;
D13. repeating step D12 until all DNS domain name-domain name relation pairs in the DNS domain name-domain name relation set have been processed; all the resulting domain name sets form the DNS domain name-domain name relation dictionary.
Preferably, step D2 includes:
D21. inputting the 1st DNS domain name-IP relation pair in the DNS domain name-IP relation set and generating a new DNS domain name-IP set from it;
D22. inputting the i-th (i >= 2) DNS domain name-IP relation pair in the DNS domain name-IP relation set, and denoting the query domain name in the i-th DNS domain name-IP relation pair as Ai;
comparing Ai with the elements of the existing DNS domain name-IP sets:
a) if an existing DNS domain name-IP set Cm contains Ai, adding the server response IP address in the i-th DNS domain name-IP relation pair to Cm;
b) if Ai is not contained in any existing DNS domain name-IP set, generating a new DNS domain name-IP set from the i-th DNS domain name-IP relation pair;
D23. repeating step D22 until all DNS domain name-IP relation pairs in the DNS domain name-IP relation set have been processed; all the resulting DNS domain name-IP sets form the DNS domain name-IP relation dictionary.
Preferably, step D3 includes:
D31. reading in the entire DNS domain name-IP relation dictionary and generating an empty DNS domain name-IP relation resource table;
D32. reading in the DNS domain name-domain name relation dictionary row by row and, for each row read in:
if none of the domain names in the row is present in the DNS domain name-IP relation dictionary, outputting the domain names to the DNS domain name-IP relation resource table in a predetermined format;
if at least one domain name in the row appears in the DNS domain name-IP relation dictionary, first outputting the domain names to the DNS domain name-IP relation resource table in the predetermined format, and then outputting all the corresponding server response IP addresses to the DNS domain name-IP relation resource table;
D33. repeating step D32 until every row in the DNS domain name-domain name relation dictionary has been processed, thereby obtaining the DNS domain name-IP relation resource table.
As can be seen from the above technical solution, the DNS traffic analysis method provided by this application collects packet information from the network in real time, performs DNS traffic preprocessing on the collected packet information, and extracts the required DNS traffic information from it. It then performs DNS analysis on the extracted DNS traffic information to obtain a DNS domain name-domain name relation set and a DNS domain name-IP relation set. Finally, it performs a merge operation on the two relation sets. This reduces the number of distinct domain names in the DNS traffic, so that DNS traffic analysis is more targeted and the current overall DNS traffic conditions can be better understood.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments.
This application mainly uses a domain name merging algorithm to reduce the number of distinct domain names in DNS traffic, thereby reducing the complexity of DNS traffic analysis.
Compared with traditional traffic analysis methods, this application focuses on and analyzes the deeper connections among three important pieces of information in DNS data: the query domain name, the returned domain name, and the server response IP address. To a certain extent, fairly complex mapping relations exist among these three, such as one-to-one, one-to-many, and many-to-one. For example, in practice a domain name often has several aliases, and one server response IP address may correspond to several domain names. This application therefore uses each query record to gather related domain names into one domain name set whose members point to the same ISP (Internet Service Provider), for example xxx.cn, xxx.com.cn, and xxx.com. In correlation analysis it is necessary to group the domain names or IPs that point to the same ISP into one exclusive set. Analyzing massive DNS traffic with the technical solution of this application yields the relation sets among related DNS domain names and the corresponding domain name-IP relation sets, and these relation sets give a better understanding of DNS traffic conditions.
Fig. 1 is a schematic diagram of the DNS traffic collection and analysis process of this application. The main steps of the process shown in Fig. 1 are as follows:
First, a packet capture device collects packet information from the carrier network in real time and stores it.
Then, DNS traffic preprocessing is performed on the collected packet information to extract the required DNS traffic information. For example, the DNS traffic fields typically obtained by preprocessing the packet information are shown in Table 1.
Field name | Example | Annotation
qsec | 1343527269 | Query time (seconds)
qusec | 785887 | Query time (microseconds)
intvlsec | 0 | Query-response interval (seconds)
intvlusec | 3839 | Query-response interval (microseconds)
clientIP | 0a124aea | Client IP
serverIP | 70040cc8 | Server IP
transID | 50857 | Uniquely identifies a query-response pair between client and server
opCode | 0 | Query/response type (0: standard; 1: inverse; 2: server status request)
isAA | 0 | Whether the response comes from an authoritative server
isTC | 0 | Whether the response is truncated
isRD | 1 | Whether a recursive query is requested
isRA | 1 | Whether recursive queries are available
rCode | 0 | Error status of the response (0: no error; 3: no such domain name; etc.)
queryNum | 1 | Number of queries
queryType | 1 | Query type (1: A, IPv4; 28: AAAA, IPv6)
queryName | mobilelogin.sj.91.com | Query content
resNum | 1 | Number of responses
resType | 1 | Response type (5: alias; 1: IP; etc.)
resName | mobilelogin.sj.91.com | The data to be resolved (before resolution; identical to the query content)
resData | 79cff2ef | The content obtained by resolution (after resolution)
resTTL | 18 | TTL of the response data
Table 1
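The example values for clientIP, serverIP, and resData in Table 1 look like IPv4 addresses written as eight hexadecimal digits. The following Python sketch shows one way such fields might be decoded, assuming big-endian (network) byte order; that encoding is an assumption, since the text does not state it, and the helper names `hex_to_ipv4` and `query_timestamp` are hypothetical.

```python
import ipaddress
from datetime import datetime, timezone


def hex_to_ipv4(hex_str: str) -> str:
    """Decode an 8-digit hex string into dotted-decimal IPv4.

    Assumption: the field is a 32-bit address in big-endian order.
    """
    return str(ipaddress.IPv4Address(int(hex_str, 16)))


def query_timestamp(qsec: int, qusec: int) -> datetime:
    """Combine the qsec and qusec fields into one UTC timestamp."""
    base = datetime.fromtimestamp(qsec, tz=timezone.utc)
    return base.replace(microsecond=qusec)


if __name__ == "__main__":
    # Example values taken from Table 1.
    print(hex_to_ipv4("0a124aea"))  # 10.18.74.234
    print(hex_to_ipv4("79cff2ef"))  # 121.207.242.239
    print(query_timestamp(1343527269, 785887).isoformat())
```

Under this reading, the client address falls in the private 10.0.0.0/8 range, which is at least consistent with a subscriber on a carrier network.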
Because the volume of DNS traffic is very large, this application stores the massive DNS traffic files on HDFS (Hadoop Distributed File System), the distributed file system of a Hadoop storage cluster, to prepare the data for the subsequent DNS analysis operations.
Finally, DNS analysis is performed on the DNS traffic to obtain the DNS domain name-domain name relation set and the DNS domain name-IP relation set, and ultimately the DNS domain name-IP relation resource table. The DNS traffic analysis method of this application is described in detail below.
Fig. 2 is a sequence diagram of the DNS traffic analysis method of this application. Referring to Fig. 2, the method mainly includes the following steps:
Action1: an MR (MapReduce) operation that filters the query domain names and returned domain names in the DNS source data (i.e., the DNS traffic) on HDFS to obtain DNS domain name-domain name relation pairs.
In this operation, if a query domain name in a DNS traffic record has multiple returned domain names, the record is output as multiple domain name-domain name relation pairs. For example, if a DNS traffic record contains the query domain name www.taobao.com with three returned domain names, 11.taobao.com, 22.taobao.com, and 33.taobao.com, the DNS domain name-domain name relation pairs that are output are (www.taobao.com, 11.taobao.com), (www.taobao.com, 22.taobao.com), and (www.taobao.com, 33.taobao.com).
Action2: an MR operation that filters the query domain names and server response IP addresses in the DNS source data on HDFS to obtain DNS domain name-IP relation pairs.
As with Action1, if a query domain name in a DNS traffic record has multiple server response IP addresses, Action2 outputs the record as multiple domain name-IP relation pairs.
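The pair extraction performed by Action1 and Action2 can be sketched in plain Python as a map-style function. This is only an illustration, not Hadoop code: the record layout (the keys `query_domain`, `returned_domains`, and `response_ips`) and the function name `extract_pairs` are hypothetical and are not specified in the text.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def extract_pairs(record: Dict) -> Tuple[List[Pair], List[Pair]]:
    """Split one DNS traffic record into relation pairs.

    Returns (domain-domain pairs, domain-IP pairs). Each returned
    domain name or response IP yields its own pair with the query name.
    """
    query = record["query_domain"]
    domain_pairs = [(query, d) for d in record["returned_domains"]]
    ip_pairs = [(query, ip) for ip in record["response_ips"]]
    return domain_pairs, ip_pairs


if __name__ == "__main__":
    # Hypothetical record modeled on the www.taobao.com example.
    record = {
        "query_domain": "www.taobao.com",
        "returned_domains": ["11.taobao.com", "22.taobao.com", "33.taobao.com"],
        "response_ips": ["192.168.0.1", "192.168.0.2"],
    }
    for pair in extract_pairs(record)[0]:
        print(pair)
```

In a real MapReduce job the same logic would sit in the map function, with the pairs emitted as key-value output rather than returned as lists.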
The operations in Action1 and Action2 can be implemented with Hadoop's MapReduce distributed programming framework, which processes massive DNS traffic quickly and in parallel and thus greatly improves data-processing capacity.
Action3: a Hadoop shell command operation that downloads the DNS domain name-domain name relation pairs from HDFS to a local working directory.
Action4: a Hadoop shell command operation that downloads the DNS domain name-IP relation pairs from HDFS to a local working directory.
Action5: a merge operation on the local DNS domain name-domain name relation pairs (see Fig. 3 and its description for details); the result, the "DNS domain name-domain name relation dictionary", is stored in the specified local working directory.
Action6: a merge operation on the local DNS domain name-IP relation pairs; the result, the "DNS domain name-IP relation dictionary", is stored in the specified local working directory.
Action7: integrating the local DNS domain name-domain name relation dictionary and DNS domain name-IP relation dictionary to obtain the DNS domain name-IP relation resource table, which is stored in the specified local directory.
Fig. 3 is a flow diagram of the domain name merge operation that this application performs on DNS domain name-domain name relation pairs. Referring to Fig. 3, the input, processing, and output of the domain name merge operation are first described as follows:
Input: N DNS domain name-domain name relation pairs (each including a query domain name and a returned domain name) + a DNS domain name-domain name relation dictionary.
Processing notes:
1) For an operation that generates the DNS domain name-domain name relation dictionary, the input DNS domain name-domain name relation dictionary is Null.
2) For an operation that updates the DNS domain name-domain name relation dictionary, the input DNS domain name-domain name relation dictionary is the one obtained the previous time.
Output: a DNS domain name-domain name relation dictionary.
The domain name merge algorithm shown in Fig. 3 proceeds as follows:
1) Input the 1st DNS domain name-domain name relation pair, denoted (A1, B1), where A1 is the query domain name and B1 is the returned domain name.
If the query domain name differs from the returned domain name, i.e. A1 != B1, merge the two domain names into a new domain name set C1 = {A1, B1};
otherwise, i.e. A1 == B1, form the new domain name set C1 = {A1} from the single domain name.
2) Input the i-th (i >= 2) domain name pair (Ai, Bi).
If Ai != Bi, compare each of them with the elements of the existing domain name sets:
a) if an existing domain name set Cm contains Ai, a set Cn contains Bi, and m == n, do nothing;
b) if an existing domain name set Cm contains Ai, a set Cn contains Bi, and m != n, merge the two existing domain name sets Cm and Cn by adding the elements of Cn to Cm and deleting Cn;
c) if an existing domain name set Cm contains Ai (or Bi) and Bi (or Ai) is not contained in any existing domain name set, add Bi (or Ai) to Cm;
d) if neither Ai nor Bi is contained in any existing domain name set, form a new domain name set {Ai, Bi}.
If Ai == Bi, compare Ai with the elements of the existing domain name sets:
a) if Ai is not contained in any existing domain name set, form a new domain name set {Ai};
b) if Ai is contained in a domain name set Cm, proceed to the next step.
3) Repeat step 2) until all domain name pairs have been processed, then output the merged domain name sets.
Together, all the domain name sets form the domain name dictionary.
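The merge procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the dictionary is represented as a list of disjoint Python sets, and the function name `merge_domain_pairs` and its optional `dictionary` argument (None for generation, a previous result for update) are hypothetical.

```python
from typing import Iterable, List, Optional, Set, Tuple


def merge_domain_pairs(
    pairs: Iterable[Tuple[str, str]],
    dictionary: Optional[List[Set[str]]] = None,
) -> List[Set[str]]:
    """Merge (query, returned) domain pairs into disjoint domain sets."""
    # Copy the input dictionary so the caller's data is not modified.
    sets = [set(s) for s in (dictionary or [])]
    for a, b in pairs:
        set_a = next((s for s in sets if a in s), None)
        set_b = next((s for s in sets if b in s), None)
        if set_a is None and set_b is None:
            sets.append({a, b})  # new set; {a} when a == b
        elif set_a is None:
            set_b.add(a)         # only b is known: add a to its set
        elif set_b is None:
            set_a.add(b)         # only a is known: add b to its set
        elif set_a is not set_b:
            set_a |= set_b       # both known, different sets: merge
            sets = [s for s in sets if s is not set_b]
        # both already in the same set: nothing to do
    return sets


if __name__ == "__main__":
    pairs = [("a.com", "b.com"), ("c.com", "d.com"),
             ("b.com", "c.com"), ("e.com", "e.com")]
    for domain_set in merge_domain_pairs(pairs):
        print(sorted(domain_set))
```

Each lookup here scans the existing sets linearly, which mirrors the description directly. For very large inputs, a union-find (disjoint-set) structure would give the same grouping with far fewer comparisons; that substitution is a design option and is not part of the described method.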
Action6 above performs a merge operation on the DNS domain name-IP relation pairs. Its input, processing, and output are described as follows:
Input: N DNS domain name-IP relation pairs (each including a query domain name and a server response IP address) + a DNS domain name-IP relation dictionary.
Processing notes:
1) For an operation that generates the DNS domain name-IP relation dictionary, the input DNS domain name-IP relation dictionary is Null.
2) For an operation that updates the DNS domain name-IP relation dictionary, the input DNS domain name-IP relation dictionary is the one obtained the previous time.
Output: a DNS domain name-IP relation dictionary.
The DNS domain name-IP relation dictionary is generated (or updated) as follows:
1) Starting from the first DNS domain name-IP relation pair, input the i-th DNS domain name-IP relation pair, for example (a.com, 192.168.0.1, 192.168.0.2);
a) if a.com is present in the DNS domain name-IP relation dictionary, add 192.168.0.1 and 192.168.0.2 to the corresponding DNS domain name-IP set;
b) if a.com is not present in the DNS domain name-IP relation dictionary, add this record to the DNS domain name-IP relation dictionary as a new DNS domain name-IP set;
2) once all DNS domain name-IP relation pairs have been processed, output the generated (or updated) DNS domain name-IP relation dictionary.
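A minimal Python sketch of this generate-or-update step is given below. It assumes that the DNS domain name-IP relation dictionary is held as a mapping from each query domain name to a set of IP addresses, and that each input pair carries one IP address; the function name `merge_ip_pairs` is hypothetical.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


def merge_ip_pairs(
    pairs: Iterable[Tuple[str, str]],
    dictionary: Optional[Dict[str, Set[str]]] = None,
) -> Dict[str, Set[str]]:
    """Group server response IPs by query domain name.

    Pass dictionary=None to generate a new dictionary, or a previous
    result to update it. The input dictionary is copied, not modified.
    """
    result = {d: set(ips) for d, ips in (dictionary or {}).items()}
    for domain, ip in pairs:
        # Known domain: add the IP to its set.
        # Unknown domain: start a new domain-IP set.
        result.setdefault(domain, set()).add(ip)
    return result


if __name__ == "__main__":
    pairs = [("a.com", "192.168.0.1"), ("a.com", "192.168.0.2"),
             ("b.com", "192.168.0.3")]
    for domain, ips in merge_ip_pairs(pairs).items():
        print(domain, sorted(ips))
```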
Fig. 4 is a flow diagram of generating (or updating) the DNS domain name-IP relation resource table of this application, i.e., a flow diagram of Action7, the last key step of the DNS traffic analysis method shown in Fig. 2. The input, processing, and output of Action7 are described as follows:
Input: a DNS domain name-domain name relation dictionary + a DNS domain name-IP relation dictionary;
Output: a DNS domain name-IP relation resource table.
The DNS domain name-IP relation resource table shown in Fig. 4 is generated (or updated) as follows:
1) Read in the entire DNS domain name-IP relation dictionary and generate an empty DNS domain name-IP relation resource table;
2) read in the DNS domain name-domain name relation dictionary row by row and, for each row read in:
a) if none of the domain names in the row is present in the DNS domain name-IP relation dictionary, output the domain names to the DNS domain name-IP relation resource table in a certain format;
b) if at least one domain name in the row appears in the DNS domain name-IP relation dictionary, first output the domain names to the DNS domain name-IP relation resource table in a certain format, and then output all the corresponding server response IP addresses to the DNS domain name-IP relation resource table;
3) repeat step 2) until every row in the DNS domain name-domain name relation dictionary has been processed, which yields the final DNS domain name-IP relation resource table.
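The integration step can be sketched in Python as below. The text leaves the output format open ("a certain format"), so this sketch assumes each table row is a tuple of a sorted domain list and a sorted IP list; that layout, the in-memory representations of the two dictionaries, and the function name `build_resource_table` are all hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Row = Tuple[List[str], List[str]]


def build_resource_table(
    domain_sets: Iterable[Set[str]],
    domain_ips: Dict[str, Set[str]],
) -> List[Row]:
    """Join each merged domain set with the IPs of its member domains."""
    table: List[Row] = []
    for domains in domain_sets:
        ips: Set[str] = set()
        for domain in domains:
            # Domains absent from the IP dictionary contribute nothing.
            ips |= domain_ips.get(domain, set())
        # A row with no known IPs still lists its domains.
        table.append((sorted(domains), sorted(ips)))
    return table


if __name__ == "__main__":
    domain_sets = [{"a.com", "b.com"}, {"c.com"}]
    domain_ips = {"a.com": {"192.168.0.1", "192.168.0.2"}}
    for domains, ips in build_resource_table(domain_sets, domain_ips):
        print(domains, ips)
```

Because every domain in a merged set shares the row's IP list, aliases that never appeared with an IP of their own still end up associated with the addresses of their related domains.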
As can be seen from the above technical solution, the DNS traffic analysis method provided by this application collects packet information from the network in real time, performs DNS traffic preprocessing on the collected packet information, and extracts the required DNS traffic information from it. It then performs DNS analysis on the extracted DNS traffic information to obtain a DNS domain name-domain name relation set and a DNS domain name-IP relation set. Finally, it performs a merge operation on the two relation sets. This reduces the number of distinct domain names in the DNS traffic, so that DNS traffic analysis is more targeted and the current overall DNS traffic conditions can be better understood.
The above are merely preferred embodiments of this application and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within its scope of protection.