CN104065532A - Unrecorded website search method and system based on multi-channel data access method - Google Patents


Info

Publication number
CN104065532A
Authority
CN
China
Prior art keywords
domain name
module
record
data
name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410299875.6A
Other languages
Chinese (zh)
Other versions
CN104065532B (en)
Inventor
王勇
朱春鸽
周润林
丁国栋
杨书童
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Original Assignee
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Computer Network and Information Security Management Center
Priority to CN201410299875.6A
Publication of CN104065532A
Application granted
Publication of CN104065532B
Legal status: Active
Anticipated expiration

Abstract

The invention provides an unrecorded website search method and system based on a multi-channel data access method. The method comprises the following steps: domain names are acquired through multi-channel data access, and unrecorded domain names are screened out to form a domain name seed library; DNS resolution is performed on the unrecorded domain names to obtain the corresponding IP addresses; the IP addresses are located to obtain an unrecorded domain name library; and unrecorded website information is obtained through activity verification. With the method and system provided by the invention, multi-channel data access helps ensure that the unrecorded website information finally obtained is accurate and comprehensive; this result has been verified in an unrecorded website discovery and multi-language website recognition system. The invention uses a polling mechanism in which all modules run simultaneously and continuously, so the unrecorded website information finally obtained is always up to date.

Description

An unrecorded website search method and system based on a multi-channel data access method
Technical field
The present invention relates to unrecorded website discovery services, and specifically to an unrecorded website search method and system based on a multi-channel data access method.
Background art
The main task of the ICP/IP address/domain name information filing management system is to collect information about domestic ICP websites and IP addresses, to standardize the management of ICPs and IP addresses, to provide a means of rapid location for network security management and information security monitoring of the national Internet, and to provide a basis for decisions by the relevant functional departments.
The unrecorded website discovery subsystem is a subsystem of the ICP/IP address/domain name information filing management system.
The main tasks of the unrecorded website discovery subsystem are to check the validity of the certificates of ICP websites that have been recorded, and to automatically discover, locate, and count new ICP websites that have not been recorded. It is the basic data provider for the ICP/IP address/domain name information filing management system and the data foundation for the smooth rectification of ICP websites.
Data for unrecorded website discovery can be acquired in several ways, such as active crawler discovery and domain name log files. A single data access method cannot guarantee that the data is comprehensive and accurate, and every access method has its own advantages and drawbacks.
The active crawler discovery method starts from a seed library and crawls outward to more domain names. Its advantage is that it exploits the interlinked structure of the Internet to capture many domain names with limited resources. Its weakness is that it cannot guarantee comprehensive or timely coverage: if a domain name sits on an isolated "island" that no crawled page links to, it will never be found.
The domain name log method obtains its data from the logs of mainstream domestic name servers. Its advantages are clear: the new domain names it yields are highly timely, and it solves the problem of isolated domain names. Its weakness is that all major domain name resolution services must be covered and each domain name must have been visited by someone; if name server coverage is incomplete or a domain name has not been visited, a large number of domain names will be missed.
Therefore, unrecorded website discovery data should be acquired through multiple data access channels that complement each other, thereby improving the overall accuracy of the filing work.
Summary of the invention
To overcome the above deficiencies of the prior art, the present invention provides an unrecorded website search method and system based on a multi-channel data access method. The Internet is searched automatically through multi-channel data access to find the independent domain names and IP addresses of websites hosted domestically; each domain name is checked against the filing system to determine whether it has been recorded; and unrecorded website information is pushed to the website's direct access service provider, thereby raising the ICP website filing rate.
To achieve the above object, the present invention adopts the following technical solution:
In one aspect, the present invention provides an unrecorded website search method based on a multi-channel data access method, characterized in that the method comprises the following steps:
A. Obtain domain names through multi-channel data access, and screen out the unrecorded domain names to form a domain name seed library;
B. Perform DNS resolution on the unrecorded domain names to obtain the corresponding IP addresses;
C. Locate the IP addresses to obtain the unrecorded domain name library;
D. Obtain the unrecorded website information through activity verification.
Preferably, in step A, the multi-channel data access method comprises a web crawler access method and a domain name log access method. Obtaining domain names by the web crawler access method comprises the following steps:
A-1-1. Select on the order of 100,000 accessible domestic websites as seed domain names;
A-1-2. Use a web crawler to extract domain names from the crawled web pages;
A-1-3. Compare the crawled domain names with the existing domain name seed library and de-duplicate them;
A-1-4. Add the de-duplicated domain names to the domain name seed library and enter the next round;
Obtaining domain names by the domain name log access method comprises the following steps:
A-2-1. Obtain the original domain name log files from mainstream domestic name servers and aggregate them; the mainstream name servers comprise the name servers of each province and the international gateway name servers;
A-2-2. Format the original domain name log files, and find the top-level domain name corresponding to each record in the log files;
A-2-3. Compare the top-level domain names from step A-2-2 with the existing domain name seed library and de-duplicate them;
A-2-4. Add the de-duplicated domain names to the domain name seed library and enter the next round.
Preferably, step B comprises the following steps:
B-1. Extract the formatted website top-level domain names from the domain name seed library;
B-2. Resolve the top-level domain names obtained in step B-1 through name servers; the number of name servers is greater than 1;
B-3. Take the intersection of the IP addresses returned by the different name servers to obtain the IP addresses corresponding to each domain name.
Preferably, step C comprises the following steps:
C-1. Load the IP address record information table into memory;
C-2. Compare an IP address with the IP address record information table to locate it, obtaining the operator, province, and direct access provider corresponding to that IP address, and exclude the domain names of IP addresses that cannot be located;
C-3. Repeat step C-2 until all IP addresses have been processed, obtaining the unrecorded domain name library.
Preferably, step D comprises the following steps:
D-1. Generate a task file containing the domain name information;
D-2. Use multiple threads to fetch the web page corresponding to each domain name;
D-3. Judge the activity of each website from the fetch result: if the status code of the returned HTTP message is 200 and a normal web page can be downloaded, the website is judged to be active; otherwise it is inactive.
Preferably, the method comprises screening the unrecorded website information obtained in step D; the screening comprises:
(1) Block page verification: if a domain name is redirected by the ISP to an interception page, the domain name is removed from the unrecorded website information;
(2) IP intelligent correction: if the domain name resolution has been automatically corrected to the IP address of an ISP node, the domain name is removed from the unrecorded website information;
(3) Domain name de-duplication: the unrecorded website information is compared with the recorded website information, and any domain name that appears in both is removed from the unrecorded website information.
In another aspect, the present invention provides an unrecorded website search system based on a multi-channel data access method, characterized in that the system comprises: a data access module, a DNS resolution module, an IP location module, an activity verification module, and an unrecorded website information generation module;
The data access module obtains domain names and screens out the unrecorded domain names to form a domain name seed library;
The DNS resolution module performs DNS resolution on the unrecorded domain names to obtain the corresponding IP addresses;
The IP location module locates the IP addresses to obtain the unrecorded domain name library;
The activity verification module verifies the activity of the websites;
The unrecorded website information generation module produces the unrecorded website information.
Preferably, the data access module comprises a web crawler access module and a domain name log access module. The web crawler access module comprises a data download module, a data analysis module, and a data de-duplication module. The data download module downloads data from WEB servers; the data analysis module analyzes the external links contained in the page source; the data de-duplication module removes from the crawled domain names those already present in the seed library.
Preferably, the domain name log access module comprises a data formatting module and a data de-duplication module; the data formatting module formats the original domain name log files.
Preferably, the system comprises a screening module that screens the unrecorded website information; the screening comprises block page verification, IP intelligent correction, and domain name de-duplication.
Compared with the prior art, the beneficial effects of the present invention are:
(1) Multi-channel data access helps ensure that the unrecorded website information finally obtained is accurate and comprehensive; this result has been verified in an unrecorded website discovery and multi-language website recognition system;
(2) During DNS resolution, the same domain name is resolved by multiple name servers and the intersection of the results is taken, which improves the validity and correctness of the resolution result;
(3) During activity verification, judging by both the HTTP status code and the downloaded web page data improves the accuracy of the activity judgment; in addition, the added activity verification module removes some of the candidate unrecorded websites, making the unrecorded website information finally obtained more accurate and effective;
(4) The data de-duplication module used by the present invention de-duplicates the domain names obtained through multi-channel data access before they enter the domain name seed library, without relying on de-duplication in the database, which can greatly reduce the load on the database;
(5) The present invention uses a polling mechanism in which all modules run simultaneously and continuously, which keeps the unrecorded website information finally obtained up to date.
Brief description of the drawings
Fig. 1 is a structure diagram of the unrecorded website search system based on a multi-channel data access method according to the present invention;
Fig. 2 is a flow chart of active domain name discovery by crawler in the method of the present invention;
Fig. 3 is a flow chart of obtaining domain names from domain name logs in the method of the present invention;
Fig. 4 is a flow chart of DNS resolution in the method of the present invention;
Fig. 5 is a flow chart of IP location in the method of the present invention;
Fig. 6 is a flow chart of activity verification in the method of the present invention.
Embodiment
The application scenarios of the unrecorded website discovery method and system based on a multi-channel data access method proposed in this patent include, but are not limited to, the following:
Unrecorded website discovery;
Website information statistics (e.g., website operator, branch center, direct access provider, website activity, website language category).
The present invention is described in detail below with reference to the accompanying drawings and specific examples.
Fig. 1 is the structure diagram of the system of the present invention. The system comprises: a web crawler access module, a domain name log access module, a DNS resolution module, an IP location module, an activity verification module, a screening module, and an unrecorded website information generation module.
The method of the present invention mainly comprises the following steps:
1. Obtain domain names through multi-channel data access, and perform a preliminary screening against the recorded domain name information provided by the filing system to form the unrecorded domain name seed library;
2. Perform DNS resolution on the domain names from step 1 to obtain the corresponding IP addresses;
3. Locate the IP addresses from step 2 to obtain a preliminary library of domestic unrecorded domain names (domain names whose IP addresses cannot be located are excluded);
4. Perform activity detection on the domain names from step 3 to obtain the final unrecorded website information.
The specific technical solution is as follows:
I. The multi-channel data access modules (the web crawler access module and the domain name log access module) discover new domain names and form the unrecorded domain name seed library.
Obtaining domain names by web crawler
The basic workflow of the web crawler is shown in Fig. 2:
1. First select on the order of 100,000 accessible domestic websites as seed domain names (URLs);
2. Put these URLs into the queue of URLs to be crawled;
3. Take a URL from the queue of URLs to be crawled, resolve it through DNS to obtain the host IP address, download the page corresponding to the URL, and store it in the downloaded page library. In addition, put the URL into the queue of crawled URLs.
4. Analyze the pages of the URLs in the crawled queue, extract the other link URLs they contain, and put those URLs into the queue of URLs to be crawled, thereby entering the next round.
In the unrecorded website discovery method and system based on a multi-channel data access method, active domain names are first selected from the recorded domain names as seed URLs, and the web crawler extracts new URLs from the crawled pages. Next, the crawled URLs are compared with the existing domain name seed library and de-duplicated to obtain the new URLs. Finally, the new URLs are added to the domain name seed library and the next round begins.
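The crawl loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `fetch_page` is a hypothetical downloader supplied by the caller, links are extracted with a simple regular expression rather than a full HTML parser, and host names stand in for the seed library entries.

```python
import re
from collections import deque
from urllib.parse import urlparse

# Simplified link pattern (an assumption): absolute http/https links in href attributes.
HREF_RE = re.compile(r'href=["\'](https?://[^"\'#\s]+)', re.IGNORECASE)


def extract_hosts(html):
    """Return the set of lower-cased host names found in absolute links."""
    hosts = set()
    for url in HREF_RE.findall(html):
        host = urlparse(url).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def crawl(seed_hosts, fetch_page, max_pages=1000):
    """Breadth-first crawl that returns every host discovered, seeds included.

    fetch_page(host) -> str is a hypothetical downloader; it should return
    the page HTML, or an empty string when the download fails.
    """
    to_crawl = deque(seed_hosts)   # queue of URLs to be crawled
    known = set(seed_hosts)        # stands in for the domain name seed library
    crawled = 0
    while to_crawl and crawled < max_pages:
        host = to_crawl.popleft()
        html = fetch_page(host)
        crawled += 1
        # De-duplicate against the library, then queue the new hosts for the next round.
        for new_host in extract_hosts(html) - known:
            known.add(new_host)
            to_crawl.append(new_host)
    return known
```

The `max_pages` cap is an added safeguard for the sketch; the source describes a continuous polling process instead.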
This step involves three core modules: data download, data analysis, and data de-duplication.
Data download takes the active domain names among the recorded domain names as its basis and downloads data from WEB servers on the Internet over the HTTP protocol to obtain web page data. Its main purpose is to provide the data for analyzing page content.
It comprises the following steps:
1) Connect to the WEB server;
2) Send an HTTP request to the WEB server;
3) Receive the result returned by the WEB server;
4) Analyze the header of the HTTP response;
5) If the response indicates success, receive the returned data content.
Data analysis takes the web page data obtained by data download as its basis and analyzes the external links contained in the page source.
Data de-duplication is based on a HASH algorithm: a character string is converted into a number of DWORD type, and strings are distinguished by the difference in their numeric values. For the external links obtained by data analysis, the HASH algorithm generates a feature value for each domain name; combined with the de-duplication record file, domain names already present in the seed library are removed and new domain names are retained.
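A small sketch of hash-based de-duplication follows. The source does not name the HASH algorithm, so CRC32 from Python's standard `zlib` module is used here purely as an assumed stand-in that yields a DWORD-sized (unsigned 32-bit) value; the in-memory set `seen_hashes` plays the role of the de-duplication record file.

```python
import zlib


def domain_hash(domain):
    """Map a domain name to an unsigned 32-bit integer (a DWORD-sized value).

    CRC32 is an assumption; the source only says "a HASH algorithm".
    """
    normalized = domain.strip().lower().encode("utf-8")
    return zlib.crc32(normalized) & 0xFFFFFFFF


def filter_new_domains(candidates, seen_hashes):
    """Return the candidates not seen before, recording their hashes in seen_hashes."""
    fresh = []
    for domain in candidates:
        h = domain_hash(domain)
        if h not in seen_hashes:
            seen_hashes.add(h)
            fresh.append(domain)
    return fresh
```

Because this check happens in memory before anything is written to the seed library, the database is not asked to enforce uniqueness. One trade-off worth noting: with a 32-bit value, two different domain names can occasionally share a hash, in which case the later one would be dropped as a false duplicate.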
Obtaining domain names from domain name logs
The basic workflow is shown in Fig. 3:
1. Obtain the original domain name log files from mainstream domestic name servers (mainly the name servers of each province and the international gateway name servers) and aggregate them;
2. Format the domain name log files obtained in the previous step, and find the top-level domain name corresponding to each record in the log files;
3. Compare the domain names obtained in the previous step with the existing domain name seed library and de-duplicate them to obtain the new domain names;
4. Add the domain names obtained in the previous step to the domain name seed library and enter the next round.
This step involves two core modules: data formatting and data de-duplication.
Data formatting means that, following international domain name standards, each record containing domain name information obtained from the domain name log files is formatted into the standard top-level domain name form, e.g., music.baidu.com → baidu.com.
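The formatting step (music.baidu.com → baidu.com) can be illustrated with the simplified sketch below. The source does not say how suffixes are determined, so the tiny `MULTI_LABEL_SUFFIXES` set is a hypothetical placeholder; a production system would more likely consult a complete suffix list.

```python
# Hypothetical, deliberately incomplete set of two-label suffixes (an assumption).
MULTI_LABEL_SUFFIXES = {"com.cn", "net.cn", "org.cn", "gov.cn", "edu.cn"}


def to_registered_domain(host):
    """Reduce a host name from a log record to its registered domain.

    Returns None when the input has no registrable part
    (a single label, or a bare suffix such as "com.cn").
    """
    labels = host.strip().rstrip(".").lower().split(".")
    if len(labels) < 2 or "" in labels:
        return None
    last_two = ".".join(labels[-2:])
    if last_two in MULTI_LABEL_SUFFIXES:
        # Need one more label in front of a two-label suffix.
        return ".".join(labels[-3:]) if len(labels) >= 3 else None
    return last_two
```

For example, `to_registered_domain("music.baidu.com")` gives `"baidu.com"`, matching the example in the text.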
Data de-duplication follows the same principle and procedure as described under "Obtaining domain names by web crawler".
II. The DNS resolution module performs DNS resolution on the domain names to obtain the corresponding IP addresses
The main steps are:
1. Extract the formatted website top-level domain names from the domain name seed library;
2. Resolve the domain names obtained in the previous step through multiple name servers;
3. Take the intersection of the IP addresses returned by the different name servers to obtain the IP addresses corresponding to each domain name.
The basic workflow is shown in Fig. 4:
1) Connect to the DNS server;
2) Send a request to the DNS server;
3) Receive the result returned by the DNS server;
4) Analyze the results resolved by the DNS servers and take the intersection of the results from the different DNS servers;
5) If the intersection is non-empty, obtain the list of IP addresses corresponding to the domain name.
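The intersection step can be sketched as below. To keep the example self-contained, each name server is represented by a hypothetical callable that takes a domain name and returns IP address strings; how those callables actually query a specific DNS server is left out and is not taken from the source.

```python
def resolve_by_intersection(domain, resolvers):
    """Resolve a domain through several name servers and keep only the agreed IPs.

    resolvers is a list of hypothetical callables: domain -> iterable of IP strings.
    Returns a sorted list of IPs present in every answer, or [] if there is none.
    """
    if len(resolvers) < 2:
        raise ValueError("the method requires more than one name server")
    common = None
    for resolve in resolvers:
        try:
            ips = set(resolve(domain))
        except OSError:
            ips = set()          # treat a failed lookup as an empty answer
        common = ips if common is None else common & ips
        if not common:
            return []            # no intersection: the domain yields no IP list
    return sorted(common)
```

One design consequence to keep in mind: a domain whose name servers legitimately return different address sets (for example, load-balanced sites) may end up with an empty intersection and be dropped, so the choice of resolvers matters.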
III. The IP location module locates the IP addresses of the domain names that have completed DNS resolution, forming the library of domestic unrecorded domain names
The basic workflow is shown in Fig. 5:
1. Load the IP address record information table into memory;
2. Take an IP address and compare it with the IP address record information table to locate it; from the matching record in the table, obtain the operator, province, and direct access provider corresponding to that IP address. Exclude the domain names of IP addresses that cannot be located;
3. Repeat step 2 until all IP addresses have been processed, obtaining the preliminary library of domestic unrecorded domain names.
Note: the IP address record information table contains the operator, province, and direct access provider information corresponding to each IP address.
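A rough sketch of the in-memory lookup follows. The layout of the IP address record information table is not given in the source, so this example assumes non-overlapping IPv4 CIDR blocks, each carrying an operator, a province, and a direct access provider; the class and field names are hypothetical.

```python
import bisect
import ipaddress
from typing import NamedTuple


class IpRecord(NamedTuple):
    start: int
    end: int
    operator: str
    province: str
    access_provider: str


class IpTable:
    """In-memory IP record table; assumes non-overlapping IPv4 ranges."""

    def __init__(self, rows):
        # rows: iterable of (cidr, operator, province, access_provider)
        records = []
        for cidr, operator, province, provider in rows:
            net = ipaddress.ip_network(cidr)
            records.append(IpRecord(int(net.network_address),
                                    int(net.broadcast_address),
                                    operator, province, provider))
        self._records = sorted(records)
        self._starts = [r.start for r in self._records]

    def locate(self, ip):
        """Return the matching IpRecord, or None if the IP cannot be located."""
        value = int(ipaddress.ip_address(ip))
        i = bisect.bisect_right(self._starts, value) - 1
        if i >= 0 and value <= self._records[i].end:
            return self._records[i]
        return None


def locate_domains(domain_ips, table):
    """Keep only domains with at least one locatable IP; others are excluded."""
    located = {}
    for domain, ips in domain_ips.items():
        hits = [rec for rec in (table.locate(ip) for ip in ips) if rec is not None]
        if hits:
            located[domain] = hits[0]
    return located
```

Loading the table once and using binary search keeps each lookup logarithmic in the table size, which suits the repeated per-IP comparison in step 2.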
IV. The activity verification module performs activity detection on the domestic unrecorded domain names, forming the final unrecorded website information
The basic workflow is shown in Fig. 6:
1. Generate a task file containing the domain name information;
2. Use multiple threads to fetch the web page corresponding to each domain name;
3. Judge the activity of each website from the fetch result: if the status code of the returned HTTP message is 200 and a normal web page can be downloaded, the website is judged to be active; otherwise it is inactive.
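The activity check can be sketched with the Python standard library as follows. What counts as a "normal web page" is not defined in the source, so the `min_body_bytes` threshold is an assumption, as are the function names; the fetch function is injectable so the judgment logic can be exercised without network access.

```python
import http.client
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_fetch(domain, timeout=10):
    """Fetch http://<domain>/ and return (status_code, body); (0, b"") on failure."""
    try:
        with urllib.request.urlopen(f"http://{domain}/", timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, b""
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        return 0, b""


def is_active(status, body, min_body_bytes=1):
    """Active only if the status code is 200 AND a page body was downloaded.

    min_body_bytes is an assumed stand-in for the source's "normal web page".
    """
    return status == 200 and len(body) >= min_body_bytes


def check_activity(domains, fetch=http_fetch, workers=8):
    """Fetch all domains with a thread pool and map each one to True/False."""
    domains = list(domains)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fetch, domains))   # results keep input order
    return {d: is_active(status, body) for d, (status, body) in zip(domains, results)}
```

Requiring both conditions mirrors the dual check in the text: a 200 status with an empty body, or an error status with some body, is treated as inactive.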
Note: after website activity detection, the following screening module is added in order to obtain more accurate unrecorded website information:
A. Block page verification: if a domain name is redirected by the ISP (Internet service provider) to an interception page, the domain name is removed from the unrecorded website information;
B. IP intelligent correction: under certain conditions (e.g., poor user network conditions, browser cache errors, excessive traffic to the website server, or browser incompatibility), domain name resolution is automatically corrected to the IP address of an ISP node; in that case the domain name is removed from the unrecorded website information;
C. Domain name de-duplication: the unrecorded website information is compared with the latest recorded website information, and any domain name that appears in both is removed from the unrecorded website information.
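The three screening rules can be expressed as a simple filter, sketched below. The source does not describe how an interception page or an ISP node IP is recognized, so `is_block_page` and `isp_node_ips` are hypothetical inputs supplied by the caller, and the candidate data layout is assumed for illustration.

```python
def screen_unrecorded(candidates, recorded, is_block_page, isp_node_ips):
    """Apply the three screening rules and return the domains that survive.

    candidates:    dict mapping domain -> (page_html, resolved_ips)  (assumed layout)
    recorded:      set of domain names already recorded in the filing system
    is_block_page: hypothetical predicate, page_html -> bool
    isp_node_ips:  set of IP strings known to belong to ISP nodes (assumed input)
    """
    kept = []
    for domain, (html, ips) in candidates.items():
        if is_block_page(html):
            continue                      # A. redirected to an interception page
        if set(ips) & isp_node_ips:
            continue                      # B. resolution corrected to an ISP node IP
        if domain in recorded:
            continue                      # C. already present in the recorded set
        kept.append(domain)
    return kept
```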
Finally, it should be noted that the above embodiment is intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the above embodiment, those of ordinary skill in the art should understand that the specific embodiments of the present invention may still be modified or replaced with equivalents, and any modification or equivalent replacement that does not depart from the spirit and scope of the present invention shall fall within the scope of the claims of the present invention.

Claims (10)

1. An unrecorded website search method based on a multi-channel data access method, characterized in that the method comprises the following steps:
A. Obtain domain names through multi-channel data access, and screen out the unrecorded domain names to form a domain name seed library;
B. Perform DNS resolution on the unrecorded domain names to obtain the corresponding IP addresses;
C. Locate the IP addresses to obtain the unrecorded domain name library;
D. Obtain the unrecorded website information through activity verification.
2. the method for claim 1, is characterized in that, in steps A, described multichannel data access way comprises web crawlers access way and domain name daily record access way; Obtaining domain name by web crawlers access way comprises the steps:
A-1-1. choose the addressable website of domestic 100,000 magnitudes as kind of a subdomain name;
A-1-2. by web crawlers, in the webpage grabbing, extract domain name;
A-1-3. the domain name grabbing and existing domain name seed bank are compared, duplicate removal;
A-1-4. the domain name after duplicate removal is added to domain name seed bank, enters next round circulation;
Obtaining domain name by domain name daily record access way comprises the steps:
A-2-1. obtain original domain name journal file and gather from domestic main flow name server; Described main flow name server comprises each province name server and international export name server;
A-2-2. format described original domain name journal file, find out in domain name journal file every and record corresponding TLD;
A-2-3. the described TLD in steps A-2-2 and existing domain name seed bank are compared, duplicate removal;
A-2-4. the domain name after duplicate removal is added to domain name seed bank, enters next round circulation.
3. the method for claim 1, is characterized in that, step B comprises the steps:
B-1. from domain name seed bank, extract formatted website TLD;
B-2. by name server, the TLD obtaining in step B-1 is done to domain name mapping; The number of domain name server is greater than 1;
B-3. common factor is got in the IP address different name servers being obtained, and obtains IP address corresponding to domain name.
4. the method for claim 1, is characterized in that, step C comprises the steps:
C-1. IP address record information table is loaded into internal memory;
C-2. an IP address and IP address record information table information are compared, located, obtain operator corresponding to this IP address, province and direct connector's information, and get rid of the corresponding domain name of IP that cannot locate;
C-3. repeating step C-2, until finish the domain name of not putting on record described in obtaining storehouse.
5. the method for claim 1, is characterized in that, step D comprises the steps:
D-1. generate the assignment file that comprises domain-name information;
D-2. utilize multithreading to gather webpage corresponding to domain name;
D-3. judge the activity of website according to collection result; Described judge comprise: if the conditional code of the HTTP message returning be 200 and can download to normal webpage, judge that website is movable; Otherwise be inactive.
6. the method for claim 1, is characterized in that, described method comprises the not recorded website information drawing in investigation step D; Described investigation comprises:
(1) blocking-up page checking; If certain domain name is jumped to the interception page by ISP, this domain name is never rejected in recorded website information;
(2) IP intelligent correction; If domain name mapping is corrected as the IP of ISP node automatically, this domain name is never rejected in recorded website information;
(3) domain name duplicate removal; Compare with recorded website information, will in recorded website information and recorded website information not, occur that the domain name of occuring simultaneously never rejects in recorded website information.
7. An unrecorded website search system based on a multi-channel data access method, characterized in that the system comprises: a data access module, a DNS resolution module, an IP location module, an activity verification module, and an unrecorded website information generation module;
The data access module obtains domain names and screens out the unrecorded domain names to form a domain name seed library;
The DNS resolution module performs DNS resolution on the unrecorded domain names to obtain the corresponding IP addresses;
The IP location module locates the IP addresses to obtain the unrecorded domain name library;
The activity verification module verifies the activity of the websites;
The unrecorded website information generation module produces the unrecorded website information.
8. The system of claim 7, characterized in that the data access module comprises a web crawler access module and a domain name log access module; the web crawler access module comprises a data download module, a data analysis module, and a data de-duplication module; the data download module downloads data from WEB servers; the data analysis module analyzes the external links contained in the page source; the data de-duplication module removes from the crawled domain names those already present in the seed library.
9. The system of claim 7, characterized in that the domain name log access module comprises a data formatting module and a data de-duplication module; the data formatting module formats the original domain name log files.
10. The system of claim 7, characterized in that the system comprises a screening module that screens the unrecorded website information; the screening comprises block page verification, IP intelligent correction, and domain name de-duplication.
CN201410299875.6A 2014-06-26 2014-06-26 A kind of non-recorded website search method and system based on multichannel data access way Active CN104065532B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410299875.6A CN104065532B (en) 2014-06-26 2014-06-26 A kind of non-recorded website search method and system based on multichannel data access way

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410299875.6A CN104065532B (en) 2014-06-26 2014-06-26 A kind of non-recorded website search method and system based on multichannel data access way

Publications (2)

Publication Number Publication Date
CN104065532A true CN104065532A (en) 2014-09-24
CN104065532B CN104065532B (en) 2018-08-14

Family

ID=51553073

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410299875.6A Active CN104065532B (en) 2014-06-26 2014-06-26 A kind of non-recorded website search method and system based on multichannel data access way

Country Status (1)

Country Link
CN (1) CN104065532B (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105528351A (en) * 2014-09-29 2016-04-27 中国电信股份有限公司 Method and system for removing duplicate content during process of acquiring Internet information by mobile terminal
CN105763633A (en) * 2016-04-14 2016-07-13 上海牙木通讯技术有限公司 Association method of domain name and website visiting behavior
CN106302850A (en) * 2016-08-04 2017-01-04 北京迅达云成科技有限公司 A kind of authority's DNS method for optimizing configuration and device
CN106657374A (en) * 2017-01-04 2017-05-10 贵州力创科技发展有限公司 Internet traffic and flow direction big data intelligent analysis and decision-making method and system
CN107404495A (en) * 2017-09-01 2017-11-28 北京亚鸿世纪科技发展有限公司 A kind of device based on IP address portrait
CN108111547A (en) * 2018-03-06 2018-06-01 深圳互联先锋科技有限公司 A kind of domain name health monitor method and system
CN108259630A (en) * 2016-12-29 2018-07-06 中国电信股份有限公司 Non- recorded website detection method, platform and system
CN109040333A (en) * 2018-07-10 2018-12-18 厦门秦淮科技有限公司 A kind of domain name is put on record management system
CN109190074A (en) * 2018-08-02 2019-01-11 北京北信源信息安全技术有限公司 WEB application automatic discovering method and system based on terminal internet behavior data
CN109241483A (en) * 2018-08-31 2019-01-18 中国科学院计算技术研究所 Website discovery method and system based on domain name recommendation
CN109474587A (en) * 2018-11-01 2019-03-15 北京亚鸿世纪科技发展有限公司 Method for monitoring, analyzing and locating HTTP hijacking based on an information security system
CN109547440A (en) * 2018-11-27 2019-03-29 深圳互联先锋科技有限公司 Website monitoring method, device, electronic equipment and readable storage medium
CN110677514A (en) * 2019-10-21 2020-01-10 怀来斯达铭数据有限公司 IP filing information management method and device
CN111614797A (en) * 2020-06-02 2020-09-01 中国信息通信研究院 Method and system for detecting IP address missing coverage
CN116055180A (en) * 2023-01-28 2023-05-02 北京亿赛通科技发展有限责任公司 Internet resource record information inquiry verification method and device based on gateway

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140053061A1 (en) * 2012-08-16 2014-02-20 Realnetworks, Inc. System for clipping webpages
CN103701769A (en) * 2013-11-07 2014-04-02 江南大学 Method and system for detecting hazardous network source
CN103856437A (en) * 2012-11-28 2014-06-11 深圳市金蝶中间件有限公司 Site security detection method and system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140053061A1 (en) * 2012-08-16 2014-02-20 Realnetworks, Inc. System for clipping webpages
CN103856437A (en) * 2012-11-28 2014-06-11 深圳市金蝶中间件有限公司 Site security detection method and system
CN103701769A (en) * 2013-11-07 2014-04-02 江南大学 Method and system for detecting hazardous network source

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
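Yang Shibiao (杨世标) et al.: "Combining DNS data mining with search engine technology to improve network security" (DNS数据挖掘与搜索引擎技术相结合提升网络安全), Telecommunications Technology (《电信技术》) *
Xie Longde (谢龙德): "Design and implementation of an ICP/IP unrecorded-information discovery system based on passive DNS monitoring" (基于DNS被动监测的ICP/IP未备案信息发现系统设计与实现), Wanfang Database (《万方数据库》) *
Guo Yingpeng (郭英鹏) et al.: "Research on website record-filing technology" (网站备案技术研究), Guangdong Communication Technology (《广东通信技术》) *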

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105528351A (en) * 2014-09-29 2016-04-27 中国电信股份有限公司 Method and system for removing duplicate content during process of acquiring Internet information by mobile terminal
WO2017177590A1 (en) * 2016-04-14 2017-10-19 上海牙木通讯技术有限公司 Method for associating domain name with website access behavior
RU2709647C9 (en) * 2016-04-14 2020-04-02 Шанхай Яму Коммюникейшн Текнолоджи Ко., Лтд Method of associating a domain name with a characteristic of visiting a website
RU2709647C1 (en) * 2016-04-14 2019-12-19 Шанхай Яму Коммюникейшн Текнолоджи Ко., Лтд Method of associating a domain name with a characteristic of visiting a website
CN105763633B (en) * 2016-04-14 2019-05-21 上海牙木通讯技术有限公司 Association method of domain name and website visiting behavior
GB2567749A (en) * 2016-04-14 2019-04-24 Shanghai Yamu Communication Tech Co Ltd Method for associating domain name with website access behavior
CN105763633A (en) * 2016-04-14 2016-07-13 上海牙木通讯技术有限公司 Association method of domain name and website visiting behavior
CN106302850A (en) * 2016-08-04 2017-01-04 北京迅达云成科技有限公司 Authoritative DNS configuration optimization method and device
CN108259630A (en) * 2016-12-29 2018-07-06 中国电信股份有限公司 Non-recorded website detection method, platform and system
CN108259630B (en) * 2016-12-29 2021-01-12 中国电信股份有限公司 Detection method, platform and system for unregistered website
CN106657374A (en) * 2017-01-04 2017-05-10 贵州力创科技发展有限公司 Internet traffic and flow direction big data intelligent analysis and decision-making method and system
CN107404495A (en) * 2017-09-01 2017-11-28 北京亚鸿世纪科技发展有限公司 Device based on IP address profiling
CN108111547B (en) * 2018-03-06 2021-03-19 深圳互联先锋科技有限公司 Domain name health monitoring method and system
CN108111547A (en) * 2018-03-06 2018-06-01 深圳互联先锋科技有限公司 Domain name health monitoring method and system
CN109040333A (en) * 2018-07-10 2018-12-18 厦门秦淮科技有限公司 Domain name filing management system
CN109040333B (en) * 2018-07-10 2021-12-07 北京秦淮数据有限公司 Domain name filing management system
CN109190074A (en) * 2018-08-02 2019-01-11 北京北信源信息安全技术有限公司 WEB application automatic discovering method and system based on terminal internet behavior data
CN109241483A (en) * 2018-08-31 2019-01-18 中国科学院计算技术研究所 Website discovery method and system based on domain name recommendation
CN109474587A (en) * 2018-11-01 2019-03-15 北京亚鸿世纪科技发展有限公司 Method for monitoring, analyzing and locating HTTP hijacking based on an information security system
CN109547440A (en) * 2018-11-27 2019-03-29 深圳互联先锋科技有限公司 Website monitoring method, device, electronic equipment and readable storage medium
CN110677514A (en) * 2019-10-21 2020-01-10 怀来斯达铭数据有限公司 IP filing information management method and device
CN111614797A (en) * 2020-06-02 2020-09-01 中国信息通信研究院 Method and system for detecting IP address missing coverage
CN116055180A (en) * 2023-01-28 2023-05-02 北京亿赛通科技发展有限责任公司 Internet resource record information inquiry verification method and device based on gateway
CN116055180B (en) * 2023-01-28 2023-06-16 北京亿赛通科技发展有限责任公司 Internet resource record information inquiry verification method and device based on gateway

Also Published As

Publication number Publication date
CN104065532B (en) 2018-08-14

Similar Documents

Publication Publication Date Title
CN104065532A (en) Unrecorded website search method and system based on multi-channel data access method
CN105763664A (en) Search method and system of unrecorded websites
CN106354765B (en) Log analysis system and method based on distributed acquisition
CN101924757B (en) Method and system for reviewing Botnet
US20170041321A1 (en) Method and system for providing root domain name resolution service
CN107579874B (en) Method and device for detecting data collection missing report of flow collection equipment
CN106095979A (en) URL merging treatment method and apparatus
CN110474994A (en) Domain name analytic method, device, electronic equipment and storage medium
CN103824069A (en) Intrusion detection method based on multi-host-log correlation
CN103888490A (en) Automatic WEB client man-machine identification method
EP2692119B1 (en) Non-existent domain names traffic analysis
CN105025025A (en) Cloud-platform-based domain name active detecting method and system
CN102541884B (en) Method and device for database optimization
CN106708700A (en) Operation and maintenance monitoring method and device applied to server side
CN107249049A (en) Method and apparatus for screening domain name data collected from the network
US11956261B2 (en) Detection method for malicious domain name in domain name system and detection device
CN113328990B (en) Internet route hijacking detection method based on multiple filtering and electronic equipment
CN115333966A (en) Nginx log analysis method, system and equipment based on topology
CN108199878B (en) Personal identification information identification system and method in high-performance IP network
CN106067879A (en) The detection method of information and device
KR101293954B1 (en) Apparatus and method for detecting roundabout access
CN110769076B (en) DNS (Domain name System) testing method and system
CN105487936A (en) Information system security evaluation method for classified protection under cloud environment
CN113766046B (en) Iterative traffic tracking method, DNS server and computer readable storage medium
CN102546683A (en) Host computer domain name collecting method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant