CN109361575A - A method and system for acquiring and analyzing DNS traffic data - Google Patents

Info

Publication number
CN109361575A
Authority
CN
China
Prior art keywords
dns
module
webpage
data
domain name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811563066.6A
Other languages
Chinese (zh)
Inventor
张兆心
刘晓燕
程亚楠
陆柯羽
杜跃进
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology Weihai
Original Assignee
Harbin Institute of Technology Weihai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology Weihai filed Critical Harbin Institute of Technology Weihai
Priority to CN201811563066.6A priority Critical patent/CN109361575A/en
Publication of CN109361575A publication Critical patent/CN109361575A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876 Network utilisation, e.g. volume of load or congestion level

Abstract

The present invention relates to a method and system for acquiring and analyzing DNS traffic data, which addresses the technical problem that existing methods analyze web page performance and service usage with limited accuracy. The method comprises the following steps: A. acquire the DNS traffic data generated while a web page loads for the first time; B. clean the DNS traffic data acquired in step A, then compute domain name count statistics, DNS resolution time statistics, and resource record classification statistics; C. analyze web page performance from the domain name count statistics and DNS resolution time statistics of step B; D. analyze the geographic location of the servers hosting the page's resources, the distribution of IP operators, and the page's use of CDN services from the resource record classification statistics of step B. The invention also provides a corresponding system. The invention can be widely applied in the field of internet data acquisition and analysis.

Description

A method and system for acquiring and analyzing DNS traffic data
Technical field
The present invention relates to the field of internet data acquisition and analysis, and specifically to a method and system for acquiring and analyzing DNS traffic data.
Background
With the rapid development of the internet and the steady growth of the web user base, web page content has become richer and more varied, and the kinds of elements that make up a page have multiplied. They include not only multimedia resources that users perceive directly, such as images, animations, video, and audio, but also code resource files that improve the browsing experience, such as page frameworks, scripts, and cascading style sheets.
These page elements are stored as individual files on different web resource servers. While a page loads, the browser sends requests to the servers that hold each resource in order to fetch these files. Fetching them requires domain name resolution first, so DNS is necessarily involved in the process and a large amount of DNS traffic is generated.
Current research on DNS traffic relies on measurement nodes deployed in the backbone network. There is no method that takes the DNS traffic generated during page loading as its object of study in order to analyze web page performance, the deployment and distribution of page resource servers, and the page's use of CDN services.
Summary of the invention
To solve the technical problem that existing methods analyze web page performance and service usage with limited accuracy, the present invention provides a convenient and highly accurate method and system for acquiring and analyzing the DNS traffic data generated while a web page loads for the first time.
To this end, the present invention provides a method for acquiring and analyzing DNS traffic data, comprising the following steps:
A. acquire the DNS traffic data generated while a web page loads for the first time;
B. process the DNS traffic data acquired in step A, computing domain name count statistics, DNS resolution time statistics, and resource record classification statistics;
C. analyze web page performance from the domain name count statistics and DNS resolution time statistics of step B;
D. analyze the geographic location of the servers hosting the page's resources, the distribution of IP operators, and the page's use of CDN services from the resource record classification statistics of step B.
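The location and operator analysis of step D can be illustrated with a minimal sketch. It assumes the A record addresses have already been extracted; the prefix table, the location names, and the operator names are hypothetical stand-ins for whatever IP geolocation data source is actually used, and the addresses come from the reserved documentation ranges.

```python
import ipaddress
from collections import Counter

# Hypothetical lookup table; a real deployment would query an IP geolocation source.
PREFIX_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "Beijing", "Operator-A"),
    (ipaddress.ip_network("198.51.100.0/24"), "Shanghai", "Operator-B"),
]


def locate(ip):
    """Return (location, operator) for an IP address, or unknown for both."""
    addr = ipaddress.ip_address(ip)
    for network, location, operator in PREFIX_TABLE:
        if addr in network:
            return location, operator
    return "unknown", "unknown"


def distribution(a_record_ips):
    """Count how many resource servers fall into each location and operator."""
    locations = Counter()
    operators = Counter()
    for ip in a_record_ips:
        location, operator = locate(ip)
        locations[location] += 1
        operators[operator] += 1
    return locations, operators


locations, operators = distribution(
    ["192.0.2.10", "192.0.2.20", "198.51.100.7", "203.0.113.5"]
)
```

Addresses that match no prefix are counted as unknown rather than dropped, so the totals still equal the number of A records examined.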
Preferably, acquiring the DNS traffic data generated while a web page loads for the first time in step A comprises the following steps:
a. clear the DNS cache and prevent the system and browser from using it;
b. obtain URLs as the web pages under study, and take out a page URL as the page under test;
c. monitor the network interface port, simulate a user opening the page URL, and capture the DNS traffic data that passes through the port while the page loads for the first time;
d. extract the DNS traffic data captured on the port in step c and store it in a non-relational database as key-value pairs.
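Step d can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each captured packet's DNS payload is already available as raw bytes with one question per message, it uses a plain dictionary as a stand-in for the non-relational database, and the field names in the record are chosen here for illustration.

```python
import struct


def read_name(message, offset):
    """Decode a domain name at offset; return (name, offset after the name)."""
    labels = []
    next_offset = None
    hops = 0
    while True:
        length = message[offset]
        if length & 0xC0 == 0xC0:  # compression pointer to an earlier name
            pointer = struct.unpack("!H", message[offset:offset + 2])[0] & 0x3FFF
            if next_offset is None:
                next_offset = offset + 2
            offset = pointer
            hops += 1
            if hops > 16:
                raise ValueError("too many compression pointers")
            continue
        offset += 1
        if length == 0:  # root label ends the name
            break
        labels.append(message[offset:offset + length].decode("ascii"))
        offset += length
    if next_offset is None:
        next_offset = offset
    return ".".join(labels), next_offset


def to_record(message, timestamp):
    """Turn the header and first question of a DNS message into key-value pairs."""
    ident, flags, _qdcount, ancount, _nscount, _arcount = struct.unpack(
        "!6H", message[:12]
    )
    qname, offset = read_name(message, 12)
    qtype, _qclass = struct.unpack("!2H", message[offset:offset + 4])
    return {
        "timestamp": timestamp,
        "id": ident,
        "is_response": bool(flags & 0x8000),  # QR bit
        "rcode": flags & 0x000F,
        "qname": qname,
        "qtype": qtype,
        "answer_count": ancount,
    }


# A hand-built query for www.example.com, type A, class IN.
query = (
    struct.pack("!6H", 0x1A2B, 0x0100, 1, 0, 0, 0)
    + b"\x03www\x07example\x03com\x00"
    + struct.pack("!2H", 1, 1)
)
store = {}  # stand-in for the non-relational database
record = to_record(query, 0.0)
store[(record["id"], record["is_response"])] = record
```

Keying the store on the transaction ID and the request/response flag keeps a query and its answer as separate entries that can still be paired later.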
Preferably, processing the DNS traffic data in step B comprises the following steps:
(1) divide the DNS messages into DNS request messages and DNS response messages, and determine the resource record type of each DNS response message;
(2) obtain, for the page loading process, the requested domain names and the number of requests for each, the total number of requested domain names, the responded domain names and the number of responses for each, the total number of responded domain names, the resource domain names that were not successfully answered and their number, and the domain name resolution rate;
(3) obtain the DNS resolution time of each resource domain name in the page during the first load;
(4) count the DNS response messages by A record and CNAME record.
Preferably, for the A record data in step (4), the IP is resolved through an existing data interface to obtain its geographic location and determine its operator; for the CNAME record data in step (4), keyword matching against the aliases provided by CDN service providers determines whether the page uses a CDN service.
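The counting in steps (1), (2), and (4) can be sketched as below. The record layout is hypothetical, and because the text does not define the domain name resolution rate, this sketch assumes it is the fraction of distinct requested names that received at least one successful response with answers.

```python
from collections import Counter


def domain_statistics(records):
    """Compute request/response counts, unresolved names, and record types."""
    requests = Counter(r["qname"] for r in records if not r["is_response"])
    responses = Counter(r["qname"] for r in records if r["is_response"])
    # A name counts as resolved if some response has rcode 0 and carries answers.
    resolved = {
        r["qname"]
        for r in records
        if r["is_response"] and r["rcode"] == 0 and r["answers"]
    }
    unresolved = sorted(set(requests) - resolved)
    record_types = Counter(
        rtype for r in records if r["is_response"] for rtype, _ in r["answers"]
    )
    rate = (len(requests) - len(unresolved)) / len(requests) if requests else 0.0
    return {
        "requests_per_domain": requests,
        "request_total": sum(requests.values()),
        "responses_per_domain": responses,
        "response_total": sum(responses.values()),
        "unresolved": unresolved,
        "resolution_rate": rate,
        "record_types": record_types,
    }


records = [
    {"qname": "www.example.com", "is_response": False, "rcode": 0, "answers": []},
    {"qname": "www.example.com", "is_response": True, "rcode": 0,
     "answers": [("CNAME", "www.example.com.cdn.example.net"), ("A", "192.0.2.10")]},
    {"qname": "img.example.com", "is_response": False, "rcode": 0, "answers": []},
    {"qname": "img.example.com", "is_response": False, "rcode": 0, "answers": []},
    {"qname": "img.example.com", "is_response": True, "rcode": 3, "answers": []},
]
stats = domain_statistics(records)
```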
The invention also provides a system for acquiring and analyzing DNS traffic data, comprising a DNS traffic acquisition module and a DNS traffic analysis module. The DNS traffic acquisition module takes a URL data source for first-time page loads, captures the DNS traffic generated while each page URL in the data source loads, extracts the content of the DNS request and response messages, and stores it in a database;
the DNS traffic analysis module analyzes the DNS traffic from three perspectives: counts, time, and the answer section of the resource records in DNS response messages.
Preferably, the DNS traffic acquisition module comprises a DNS cache handling module, a URL data source collection module, a first-load DNS traffic capture module, and a DNS message content extraction and storage module;
the DNS cache handling module clears and disables the system and browser DNS caches, using the approach appropriate to each operating system and browser; the URL data source collection module obtains page URLs with a web crawler; the first-load DNS traffic capture module captures the DNS traffic data that passes through the port while each URL in the data source loads for the first time; the DNS message content extraction and storage module classifies the captured DNS traffic data and stores it in the database.
Preferably, the DNS traffic analysis module comprises a DNS message content processing module, a domain name count statistics module, a DNS resolution time statistics module, a resource record classification statistics module, an A record IP resolution module, a CNAME record CDN matching module, a page performance evaluation module, a web page resource geographic location and operator statistics module, and a web page CDN service usage statistics module;
the DNS message content processing module cleans the DNS message content stored in the database; the domain name count statistics module counts the resource domain names in the DNS request and response messages and the relationship between those counts; the DNS resolution time statistics module obtains the DNS resolution time of each resource domain name from the DNS response messages; the resource record classification statistics module classifies the cleaned DNS response messages by A record and CNAME record; the A record IP resolution module resolves A record IPs into geographic location information; the CNAME record CDN matching module matches the domain names in the CNAME records of response messages against the aliases provided by CDN service providers; the page performance evaluation module analyzes the page's DNS resolution performance from the output of the domain name count statistics module and the DNS resolution time statistics module; the web page resource geographic location and operator statistics module uses the geographic location information produced by the A record IP resolution module to compute the regional distribution of page resources and the operators they use; the web page CDN service usage statistics module uses the output of the CNAME record CDN matching module to compute the page's use of CDN services.
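The CNAME record CDN matching module can be illustrated with a minimal keyword-matching sketch. The keyword table and provider names are hypothetical; the real list of aliases published by CDN service providers is not given in the text. Plain substring matching is used here because the text speaks of keyword matching; matching on a domain suffix would be a stricter alternative.

```python
# Hypothetical alias keywords mapped to hypothetical provider names.
CDN_KEYWORDS = {
    "cdn-one.example": "CDN One",
    "fastedge.example": "FastEdge",
}


def match_cdn(cname_target):
    """Return the provider whose keyword occurs in the CNAME target, else None."""
    target = cname_target.lower().rstrip(".")
    for keyword, provider in CDN_KEYWORDS.items():
        if keyword in target:
            return provider
    return None


def page_cdn_providers(cname_targets):
    """Return the sorted set of CDN providers seen among a page's CNAME targets."""
    providers = {match_cdn(target) for target in cname_targets}
    providers.discard(None)
    return sorted(providers)


providers = page_cdn_providers(
    ["www.shop.example.CDN-One.example.", "static.shop.example"]
)
```

A page is treated as using a CDN service when the returned list is non-empty.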
The beneficial effects of the invention are as follows. Starting from the DNS traffic generated while a web page loads for the first time, it proposes a new method for acquiring and comprehensively analyzing DNS traffic data. By monitoring the client's network interface port and simulating a user opening the page URL, it obtains the DNS traffic of the first page load more conveniently and accurately. Moreover, by analyzing that DNS traffic data from several perspectives, it yields the page's performance, the geographic distribution of the page's resource servers, and the page's use of CDN services.
Brief description of the drawings
Fig. 1 is a structural schematic diagram of Embodiment 1 of the present invention;
Fig. 2 is a structural schematic diagram of Embodiment 2 of the present invention.
Specific embodiment
The invention can be better understood from the following embodiments. However, those skilled in the art will readily appreciate that the content described in the embodiments only illustrates the invention and should not, and will not, limit the invention as set out in the claims.
Embodiment 1: the DNS traffic acquisition module, which mainly comprises the following steps:
Step 1: clear the DNS cache and prevent the system and browser from using it. For example, Linux does not enable a DNS cache by default, and the Firefox browser's DNS cache is disabled by setting network.dnsCacheExpirationGracePer and network.dnsCacheExpiration to 0 in its configuration.
Step 2: the URL data source collection module collects the URLs of the top 200 websites in the Alexa China ranking as the web pages under study, and the URLs are taken out one by one as the current page under test;
Step 3: the first-load DNS traffic capture module monitors port 53 on the network interface card, simulates a user opening the page URL in a browser, and captures the DNS traffic data that passes through port 53 while the page loads for the first time.
Step 4: the DNS message content extraction and storage module extracts the content of the DNS traffic captured on port 53 and stores it in a non-relational database as key-value pairs.
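For the browser side of Step 1, one possible way to apply such a setting is a Firefox `user.js` preferences file in the browser profile. The sketch below only renders the file content; it is an illustration under that assumption, not a procedure stated in the text, and it includes only the preference whose full name appears above. The helper name `render_user_js` is hypothetical.

```python
def render_user_js(prefs):
    """Render a dict of preferences as Firefox user.js lines."""
    lines = []
    for name, value in prefs.items():
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            rendered = "true" if value else "false"
        elif isinstance(value, int):
            rendered = str(value)
        else:
            escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
            rendered = '"' + escaped + '"'
        lines.append(f'user_pref("{name}", {rendered});')
    return "\n".join(lines) + "\n"


# Setting the expiration to 0 is the value given in Step 1.
content = render_user_js({"network.dnsCacheExpiration": 0})
```

Writing `content` to a file named `user.js` inside the profile directory used for the measurement would apply the preference on the next browser start.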
Embodiment 2: the DNS traffic analysis module, which mainly comprises the following steps:
Step 1: the DNS message content processing module divides the DNS messages into DNS request messages and DNS response messages and determines the resource record type of each DNS response message.
Step 2: the domain name count statistics module obtains, for the current page load: the requested domain names, the number of requests for each, and the total number of requested domain names; the responded domain names, the number of responses for each, and the total number of responded domain names; the resource domain names that were not successfully answered and their number; and the domain name resolution rate.
Step 3: the DNS resolution time statistics module obtains the DNS resolution time of each resource domain name in the page during the first load.
Step 4: the resource record classification statistics module counts the DNS response messages output by the DNS message content processing module by A record and CNAME record.
Step 5: the A record IP resolution module resolves each IP through an existing data interface to obtain its geographic location and determine its operator.
Step 6: the CNAME record CDN matching module applies keyword matching between the domain names in the CNAME records and the aliases provided by CDN service providers to determine whether the page uses a CDN service.
Step 7: the page performance evaluation module uses the data obtained by the domain name count statistics module and the DNS resolution time statistics module to analyze page performance from the perspective of DNS resolution.
Step 8: from the data obtained in Step 5, the web page resource geographic location and operator statistics module computes the geographic distribution of the page's resource servers and the distribution of IP operators.
Step 9: from the data obtained in Step 6, the web page CDN service usage statistics module computes the page's use of CDN services.
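The resolution time measurement of Step 3, and its use in Step 7, can be sketched as below. The text does not say how the time is computed, so this sketch assumes it is the interval between the first request for a name and the matching response, paired by DNS transaction ID and query name; the record fields and the mean as a summary figure are likewise assumptions.

```python
def resolution_times(records):
    """Return {domain: milliseconds from first request to its matching response}."""
    pending = {}
    times = {}
    for r in sorted(records, key=lambda rec: rec["timestamp"]):
        key = (r["id"], r["qname"])
        if not r["is_response"]:
            pending.setdefault(key, r["timestamp"])  # keep the first transmission
        elif key in pending:
            elapsed_ms = (r["timestamp"] - pending.pop(key)) * 1000.0
            times.setdefault(r["qname"], elapsed_ms)
    return times


def mean_resolution_time(times):
    """Average resolution time in milliseconds, or None when nothing resolved."""
    return sum(times.values()) / len(times) if times else None


records = [
    {"id": 1, "qname": "www.example.com", "is_response": False, "timestamp": 1.0},
    {"id": 2, "qname": "img.example.com", "is_response": False, "timestamp": 1.5},
    {"id": 1, "qname": "www.example.com", "is_response": True, "timestamp": 1.25},
    {"id": 2, "qname": "img.example.com", "is_response": True, "timestamp": 1.5625},
    {"id": 3, "qname": "lost.example.com", "is_response": False, "timestamp": 2.0},
]
times = resolution_times(records)
average = mean_resolution_time(times)
```

Requests that never receive a response, such as the last record here, simply stay out of the result, which keeps them from distorting the average.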
The above are merely specific embodiments of the invention and do not limit the scope of its implementation. Substitutions of equivalent components, and equivalent changes and modifications made within the scope of protection of this patent, all remain within the scope covered by the claims of the invention.

Claims (7)

1. A method for acquiring and analyzing DNS traffic data, characterized by comprising the following steps:
A. acquiring the DNS traffic data generated while a web page loads for the first time;
B. processing the DNS traffic data acquired in step A, and computing domain name count statistics, DNS resolution time statistics, and resource record classification statistics;
C. analyzing web page performance from the domain name count statistics and DNS resolution time statistics of step B;
D. analyzing the geographic location of the servers hosting the page's resources, the distribution of IP operators, and the page's use of CDN services from the resource record classification statistics of step B.
2. The method for acquiring and analyzing DNS traffic data according to claim 1, characterized in that acquiring, in step A, the DNS traffic data generated while a web page loads for the first time comprises the following steps:
a. clearing the DNS cache and preventing the system and browser from using it;
b. obtaining URLs as the web pages under study, and taking out a page URL as the page under test;
c. monitoring the network interface port, simulating a user opening the page URL, and capturing the DNS traffic data that passes through the port while the page loads for the first time;
d. extracting the DNS traffic data captured on the port in step c and storing it in a non-relational database as key-value pairs.
3. The method for acquiring and analyzing DNS traffic data according to claim 1, characterized in that processing the DNS traffic data in step B comprises the following steps:
(1) dividing the DNS messages into DNS request messages and DNS response messages, and determining the resource record type of each DNS response message;
(2) obtaining, for the page loading process, the requested domain names and the number of requests for each, the total number of requested domain names, the responded domain names and the number of responses for each, the total number of responded domain names, the resource domain names that were not successfully answered and their number, and the domain name resolution rate;
(3) obtaining the DNS resolution time of each resource domain name in the page during the first load;
(4) counting the DNS response messages by A record and CNAME record.
4. The method for acquiring and analyzing DNS traffic data according to claim 3, characterized in that, for the A record data in step (4), the IP is resolved through an existing data interface to obtain its geographic location and determine its operator;
and, for the CNAME record data in step (4), keyword matching against the aliases provided by CDN service providers determines whether the page uses a CDN service.
5. A system for acquiring and analyzing DNS traffic data, characterized by comprising a DNS traffic acquisition module and a DNS traffic analysis module, wherein the DNS traffic acquisition module takes a URL data source for first-time page loads, captures the DNS traffic generated while each page URL in the data source loads, extracts the content of the DNS request and response messages, and stores it in a database;
and the DNS traffic analysis module analyzes the DNS traffic from three perspectives: counts, time, and the answer section of the resource records in DNS response messages.
6. The system for acquiring and analyzing DNS traffic data according to claim 5, characterized in that the DNS traffic acquisition module comprises a DNS cache handling module, a URL data source collection module, a first-load DNS traffic capture module, and a DNS message content extraction and storage module;
the DNS cache handling module clears and disables the system and browser DNS caches, using the approach appropriate to each operating system and browser; the URL data source collection module obtains page URLs with a web crawler; the first-load DNS traffic capture module captures the DNS traffic data that passes through the port while each URL in the data source loads for the first time; the DNS message content extraction and storage module classifies the captured DNS traffic data and stores it in the database.
7. The system for acquiring and analyzing DNS traffic data according to claim 5, characterized in that the DNS traffic analysis module comprises a DNS message content processing module, a domain name count statistics module, a DNS resolution time statistics module, a resource record classification statistics module, an A record IP resolution module, a CNAME record CDN matching module, a page performance evaluation module, a web page resource geographic location and operator statistics module, and a web page CDN service usage statistics module;
the DNS message content processing module cleans the DNS message content stored in the database; the domain name count statistics module counts the resource domain names in the DNS request and response messages and the relationship between those counts; the DNS resolution time statistics module obtains the DNS resolution time of each resource domain name from the DNS response messages; the resource record classification statistics module classifies the cleaned DNS response messages by A record and CNAME record; the A record IP resolution module resolves A record IPs into geographic location information; the CNAME record CDN matching module matches the domain names in the CNAME records of response messages against the aliases provided by CDN service providers; the page performance evaluation module analyzes the page's DNS resolution performance from the output of the domain name count statistics module and the DNS resolution time statistics module; the web page resource geographic location and operator statistics module uses the geographic location information produced by the A record IP resolution module to compute the regional distribution of page resources and the operators they use; the web page CDN service usage statistics module uses the output of the CNAME record CDN matching module to compute the page's use of CDN services.
CN201811563066.6A 2018-12-20 2018-12-20 A method and system for acquiring and analyzing DNS traffic data Pending CN109361575A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811563066.6A CN109361575A (en) 2018-12-20 2018-12-20 A method and system for acquiring and analyzing DNS traffic data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811563066.6A CN109361575A (en) 2018-12-20 2018-12-20 A method and system for acquiring and analyzing DNS traffic data

Publications (1)

Publication Number Publication Date
CN109361575A true CN109361575A (en) 2019-02-19

Family

ID=65329302

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811563066.6A Pending CN109361575A (en) 2018-12-20 2018-12-20 A method and system for acquiring and analyzing DNS traffic data

Country Status (1)

Country Link
CN (1) CN109361575A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110290188A (en) * 2019-06-13 2019-09-27 四川大学 A kind of HTTPS stream service online identification method suitable for large-scale network environment
CN111541793A (en) * 2020-04-03 2020-08-14 北京市天元网络技术股份有限公司 Content distribution network scheduling process analysis method and device and electronic equipment
CN112949768A (en) * 2021-04-07 2021-06-11 苏州瑞立思科技有限公司 Traffic classification method based on LSTM
CN113065078A (en) * 2021-03-16 2021-07-02 赛尔新技术(北京)有限公司 Statistical analysis method for simulating user behavior to dial and test multistage domain names of WEB sites

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102164048A (en) * 2011-04-06 2011-08-24 上海美琦浦悦通讯科技有限公司 Data stream optimization device and method for realizing multi-ISP (internet service provider) access in local area network
CN104038363A (en) * 2013-10-24 2014-09-10 南京汇吉递特网络科技有限公司 Method for acquiring and counting CCDN provider information
CN104038471A (en) * 2013-03-08 2014-09-10 中国移动通信集团浙江有限公司 Method for managing IDC resources in internet and service provider network
CN104202418A (en) * 2014-09-17 2014-12-10 北京瑞汛世纪科技有限公司 Method and system for recommending commercial content distribution network for content provider
CN106452940A (en) * 2016-08-22 2017-02-22 中国联合网络通信有限公司重庆市分公司 Method and device for identifying Internet business flow ownership
CN107071084A (en) * 2017-04-01 2017-08-18 北京神州绿盟信息安全科技股份有限公司 A kind of DNS evaluation method and device
CN107786575A (en) * 2017-11-11 2018-03-09 北京信息科技大学 A kind of adaptive malice domain name detection method based on DNS flows

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
田世奇: ""DNS流量采集系统的实现与流量分析"", 《中国优秀硕士论文全文数据库》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110290188A (en) * 2019-06-13 2019-09-27 四川大学 A kind of HTTPS stream service online identification method suitable for large-scale network environment
CN110290188B (en) * 2019-06-13 2020-06-02 四川大学 HTTPS (hypertext transfer protocol secure) stream service online identification method suitable for large-scale network environment
CN111541793A (en) * 2020-04-03 2020-08-14 北京市天元网络技术股份有限公司 Content distribution network scheduling process analysis method and device and electronic equipment
CN111541793B (en) * 2020-04-03 2021-10-22 北京市天元网络技术股份有限公司 Content distribution network scheduling process analysis method and device and electronic equipment
CN113065078A (en) * 2021-03-16 2021-07-02 赛尔新技术(北京)有限公司 Statistical analysis method for simulating user behavior to dial and test multistage domain names of WEB sites
CN113065078B (en) * 2021-03-16 2022-11-11 赛尔新技术(北京)有限公司 Statistical analysis method for simulating user behavior to dial and test multistage domain names of WEB sites
CN112949768A (en) * 2021-04-07 2021-06-11 苏州瑞立思科技有限公司 Traffic classification method based on LSTM

Similar Documents

Publication Publication Date Title
CN109361575A (en) A method and system for acquiring and analyzing DNS traffic data
US10762549B2 (en) Analysis and collection system for user interest data and method therefor
Butkiewicz et al. Understanding website complexity: measurements, metrics, and implications
CN105490854B (en) Real-time logs collection method, system and application server cluster
CN103218431B (en) A kind ofly can identify the system that info web gathers automatically
US8935390B2 (en) Method and system for efficient and exhaustive URL categorization
US20120317151A1 (en) Model-Based Method for Managing Information Derived From Network Traffic
WO2016101464A1 (en) Quality of experience estimation method, device, terminal and server
US8818927B2 (en) Method for generating rules and parameters for assessing relevance of information derived from internet traffic
CN109275045B (en) DFI-based mobile terminal encrypted video advertisement traffic identification method
CN113407886A (en) Network crime platform identification method, system, device and computer storage medium
CN105159992A (en) Method and device for detecting page contents and network behaviors of application program
CN107370830B (en) Trade information supplying system based on big data and method
Kepner et al. Hypersparse neural network analysis of large-scale internet traffic
CN110011860A (en) Android application and identification method based on network traffic analysis
Chitraa et al. An efficient path completion technique for web log mining
CN110225009A (en) It is a kind of that user's detection method is acted on behalf of based on communication behavior portrait
CN103684856A (en) Video website infrastructure measurement and analysis method
CN108650145A (en) Phone number characteristic automatic extraction method under a kind of home broadband WiFi
CN105989019B (en) A kind of method and device for cleaning data
CN104539452B (en) A kind of method that statistics Web applications access regional characteristic
CN116401479A (en) Website content behavior identification method and system based on encrypted traffic bidirectional burst sequence
Goel et al. Preprocessing web logs: A critical phase in web usage mining
CN106789411B (en) Method and device for acquiring active IP data in machine room
CN111611483B (en) Object portrait construction method, device and equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190219