CN103440454B - An active honeypot detection method based on search engine keywords

Publication number
CN103440454B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310332730.7A
Other languages
Chinese (zh)
Other versions
CN103440454A (en)
Inventor
邹福泰 (Zou Futai)
白巍 (Bai Wei)
王佳慧 (Wang Jiahui)
潘道欣 (Pan Daoxin)
易平 (Yi Ping)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date
2013-08-01
Filing date
2013-08-01
Publication date
2016-04-06
Application filed by Shanghai Jiaotong University
Priority to CN201310332730.7A
Publication of CN103440454A
Application granted
Publication of CN103440454B
Legal status: Expired - Fee Related

Abstract

The invention discloses an active honeypot detection method based on search engine keywords. First, a database of known malicious search engine keywords is used to construct the corresponding honeypot webpages automatically: for malicious search engine keywords that target URL paths, the honeypot webpage is constructed with the address rewriting technique of the Apache HTTP Server engine; for malicious search engine keywords that target webpage content, the keyword is submitted to a search engine again and the returned result pages are used as honeypot webpages. Second, the honeypot webpages are indexed by the search engine. Finally, a data mining algorithm extracts new malicious search engine keywords from the malicious access records of the honeypot webpages and merges them into the malicious search engine keyword database, from which new honeypot webpages are constructed. The invention greatly improves the detection efficiency of honeypots and compensates for the passivity of traditional honeypots; it also updates the honeypot webpages dynamically, so as to obtain up-to-date information on the vulnerabilities that hackers are attacking.

Description

An active honeypot detection method based on search engine keywords
Technical field
The present invention relates to an active honeypot detection method, and in particular to an active honeypot detection method based on search engine keywords.
Background
Hacker attacks usually build on vulnerabilities discovered in a system or network, and new attack methods are continually developed for new vulnerabilities. To test new vulnerabilities and attack methods, hackers often use search engines to find websites on the Internet that may contain a given vulnerability and then attack them. Some hackers also write automated scanning and intrusion tools for a specific vulnerability and use search engines to scan and compromise, on a large scale, all websites on the Internet that may contain it. In recent years, attacks that rely on search engines have become an important means of hacking.
A honeypot is a decoy system that contains vulnerabilities. It is designed specifically to attract and lure hackers: by simulating one or more vulnerable hosts, it offers hackers an easy target. Because a honeypot provides no genuinely valuable service to the outside world, every attempt to access it is regarded as suspicious. Another purpose of a honeypot is to delay attacks on real targets by making hackers waste time on the honeypot.
Honeypots are divided into real-system honeypots and simulated-system honeypots. A real-system honeypot runs a real system with vulnerabilities that can actually be exploited; such vulnerabilities are the most dangerous, but the intrusion information recorded is also the most authentic. A simulated-system honeypot is likewise built on a real system, but it uses the emulation capabilities of certain tools to fake vulnerabilities that the system does not actually have. An intruder who exploits such a vulnerability merely moves around inside a program framework. Such a honeypot prevents the intruder from doing damage to the greatest possible extent and can also simulate non-existent vulnerabilities to confuse hackers.
If honeypots matching hackers' search keywords can be simulated and deployed on the Internet, indexed by well-known search engines, and presented to hackers with the help of search engine optimization so as to lure attacks, then hacker attacks can be attracted actively and the detection effectiveness of honeypots improves greatly. The honeypots can also be updated continually according to the new hacker search keywords that appear every day, which keeps their content in step with hackers' attack methods.
Therefore, those skilled in the art are working to develop an active honeypot detection method based on search engine keywords that compensates for the passive waiting of traditional honeypots, lures hacker attacks more actively, and continually updates the honeypot content to keep it in step with the latest hacking techniques.
Summary of the invention
In view of the above defects of the prior art, the technical problem to be solved by the present invention is to provide an active honeypot detection method based on search engine keywords.
To achieve the above object, the invention provides an active honeypot detection method based on search engine keywords, characterized in that it comprises the following steps:
Step (101): automatically construct honeypot webpages from a database of malicious search engine keywords that have been collected and identified;
Step (102): have the constructed honeypot webpages indexed by a search engine by means of search engine ranking optimization, and raise the ranking of the honeypot webpages in the search engine so as to actively attract visits from hackers;
Step (103): use a data mining algorithm to extract new malicious search engine keywords from the access records of the honeypot webpages, merge the extracted new keywords into the malicious search engine keyword database, and return to step (101).
Further, in step (101), the malicious search engine keyword database comprises malicious search engine keywords that target URL paths and malicious search engine keywords that target webpage content.
Further, for the malicious search engine keywords that target URL paths, the honeypot webpages are constructed with an address rewriting technique.
Further, the address rewriting technique uses the Apache HTTP Server engine.
Further, for the malicious search engine keywords that target webpage content, the keywords are searched again on a search engine, and the webpages found are processed and then used as the honeypot webpages.
Further, the search engine ranking optimization comprises registering a high-reputation domain name, adding links, and optimizing webpage content.
Further, step (103) also comprises distinguishing normal accesses from malicious attacks in the webpage access records.
Further, the access records of the honeypot webpages are divided into search engine crawlers, normal accesses, and malicious attacks, wherein the search engine crawlers are likewise classified as malicious attacks.
Further, in step (103), all accesses whose HTTP response code is not 200 are also treated as malicious attacks.
Further, the data mining algorithm extracts the new malicious search engine keywords from HTTP Referer information.
In a preferred embodiment of the present invention, honeypot webpages are first constructed from an identified database of malicious search engine keywords that have recently been in wide circulation, using two methods: for malicious search engine keywords that target URL paths, the Apache HTTP Server engine is used to construct honeypot webpages by address rewriting; for malicious search engine keywords that target webpage content, the keywords are queried again on a search engine and the returned webpages are processed and used as the corresponding honeypot webpages. Second, search engine ranking optimization is used to have these simulated honeypot webpages indexed by the search engine and to raise their ranking, so that hackers are actively attracted. Finally, a data mining algorithm separates the records of the various malicious attacks from normal accesses in the traffic, so that the targets and steps of hackers' latest attacks can be analyzed and new malicious search engine keywords extracted. The new keywords are merged into the malicious search engine keyword database so that it is updated dynamically; the honeypot webpages are then updated dynamically from the updated database, and the cycle repeats.
By actively attracting hacker attacks, the active honeypot detection method based on search engine keywords of the present invention greatly improves the detection efficiency of honeypots and compensates for the passivity of traditional honeypots. The method also updates the honeypot webpages dynamically to obtain up-to-date information on the vulnerabilities that hackers are attacking, mines malicious attack behavior from the traffic data, and analyzes the characteristics of hackers' latest attacks.
The concept, specific structure, and technical effects of the present invention are further described below with reference to the accompanying drawings, so that the objects, features, and effects of the invention can be fully understood.
Brief description of the drawings
Fig. 1 is a flowchart of the active honeypot detection method based on search engine keywords of the present invention;
Fig. 2 is an architecture diagram of the honeypot system of the active honeypot detection method based on search engine keywords of the present invention.
Detailed description of the embodiments
An embodiment of the invention is described in detail below with reference to the accompanying drawings. The embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation and specific operating procedure are given, but the scope of protection of the invention is not limited to the following embodiment.
In this embodiment, as shown in Fig. 1, the active honeypot detection method based on search engine keywords of the present invention comprises the following steps:
Step 101: automatically construct honeypot webpages from the database of malicious search engine keywords that have been collected and identified, as shown in Fig. 2:
Classify the keywords in the collected and identified malicious search engine keyword database: the malicious search engine keywords are divided into those that target the path of a webpage address, or URL (Uniform Resource Locator), and those that target webpage content.
For the malicious search engine keywords that target URL paths, the Apache HTTP Server engine is used to construct honeypot webpages by URL rewriting, that is, address rewriting. URL rewriting is the redirection of addresses: it intercepts an incoming user request and automatically redirects the request to another resource. The way the server processes user requests does not change; only a step that redirects the request is added. In the present invention, URL rewriting redirects requests that match the known malicious search engine keywords for URL paths to the corresponding honeypot webpages.
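The redirection described above can be sketched in Python as follows. This is only an illustrative stand-in for a server-side rewrite rule, not the Apache configuration used in the embodiment; the example keywords, the `/honeypot/` directory layout, and the function name `rewrite_path` are all hypothetical.

```python
import re

# Hypothetical URL-path keywords taken from a malicious keyword database.
URL_PATH_KEYWORDS = ["phpmyadmin/index.php", "wp-login.php"]

# Hypothetical directory that holds the generated honeypot pages.
HONEYPOT_PREFIX = "/honeypot/"


def rewrite_path(request_path: str) -> str:
    """Return the honeypot page for a path containing a malicious keyword.

    Paths that match no keyword are returned unchanged, so normal request
    handling is unaffected.
    """
    lowered = request_path.lower()
    for keyword in URL_PATH_KEYWORDS:
        if keyword in lowered:
            # Turn the keyword into a file name, e.g. "wp-login.php" -> "wp-login_php.html".
            page = re.sub(r"[^a-z0-9-]+", "_", keyword) + ".html"
            return HONEYPOT_PREFIX + page
    return request_path
```

A request for `/blog/wp-login.php` would be served the page `/honeypot/wp-login_php.html`, while a request for `/index.html` passes through unchanged.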
For the malicious search engine keywords that target webpage content, the keywords are queried again on a search engine, and the webpage content of the search results is stored on a local Apache web server as the corresponding honeypot webpages.
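The storage half of this step might look like the following minimal sketch. It assumes the result page has already been fetched as an HTML string (the search query itself is omitted), and the function name `store_honeypot_page`, the `honeypot` subdirectory, and the hashed file naming are hypothetical choices rather than details from the embodiment.

```python
import hashlib
from pathlib import Path


def store_honeypot_page(keyword: str, html: str, web_root: Path) -> Path:
    """Save one search-result page as the honeypot page for a keyword."""
    # Hash the keyword so that any keyword maps to a safe, stable file name.
    name = hashlib.sha256(keyword.encode("utf-8")).hexdigest()[:16] + ".html"
    target = web_root / "honeypot" / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return target
```

Because the file name depends only on the keyword, storing a page for the same keyword again overwrites the earlier copy, which suits a honeypot whose content is refreshed on every cycle.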
Step 102: have the constructed honeypot webpages indexed by a search engine by means of search engine ranking optimization, and raise their ranking in the search engine so as to actively attract visits from hackers. Search engine optimization techniques such as registering a high-reputation domain name, adding links, and optimizing webpage content are used to get the honeypot webpages indexed by the search engine and to raise their ranking further, so that they attract hackers more effectively.
Step 103: extract malicious search engine keywords from the webpage access records with the following algorithm:
First, distinguish normal accesses from malicious attacks in the webpage access records. Once the honeypot webpages have been indexed by the search engine, hackers can find them by searching for malicious search engine keywords and then attack them. All accesses to the honeypot webpages are recorded, and hackers' attacks are mined from these records. Because the links to the honeypot webpages within the website are all hidden links, ordinary users cannot see them, whereas hackers' attack tools can find them. As shown in Fig. 2, apart from normal accesses, the access records of the honeypot contain two further kinds of access, namely attacks 201 and malicious searches 203; in addition, crawlers 202 from well-known search engines are identified by the user agent (User-Agent) specific to each search engine and by source IP address. Accordingly, every access to a honeypot webpage other than a normal access is identified as a malicious attack. Furthermore, because attackers may try to access sensitive system resource paths, and because accesses with HTTP response code 200 are regarded as normal, all accesses whose HTTP response code is not 200 are classified as malicious attacks.
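The classification rules of this first step can be sketched as below. The text does not define precisely what counts as a normal access, so the sketch assumes that a normal access is a request with response code 200 to a page that is not a honeypot page; the crawler user-agent fragments, the IP prefix, and the names `Access`, `classify`, and `is_malicious` are likewise hypothetical.

```python
from dataclasses import dataclass

# Hypothetical user-agent fragments and IP prefixes of well-known crawlers.
CRAWLER_AGENTS = ("googlebot", "bingbot", "baiduspider")
CRAWLER_IP_PREFIXES = ("66.249.",)


@dataclass
class Access:
    source_ip: str
    user_agent: str
    status: int        # HTTP response code
    is_honeypot: bool  # True if the requested page is a honeypot page


def classify(access: Access) -> str:
    """Label one access record as 'crawler', 'attack', or 'normal'."""
    agent = access.user_agent.lower()
    # A crawler must match on both the user agent and the source IP address.
    if any(a in agent for a in CRAWLER_AGENTS) and access.source_ip.startswith(CRAWLER_IP_PREFIXES):
        return "crawler"
    # Non-200 responses suggest probing of sensitive paths; honeypot pages are
    # reachable only through hidden links, so ordinary users should not request them.
    if access.status != 200 or access.is_honeypot:
        return "attack"
    return "normal"


def is_malicious(access: Access) -> bool:
    """Everything except normal access counts as malicious, crawlers included."""
    return classify(access) != "normal"
```

Keeping `classify` separate from `is_malicious` preserves the three-way split of the access records while still treating crawler visits as malicious, as the method specifies.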
Second, for each record identified as a malicious attack, record its attack source and use the recorded attack sources as a database. A data mining algorithm builds a data mining model from this attack source database, performs classification analysis, and extracts new malicious search engine keywords. The HTTP Referer, that is, the HTTP source address, is a field of the HTTP header that indicates the page from which the current webpage was linked; it takes the form of a URL. From the HTTP Referer, the current webpage can tell where a visitor came from, so the HTTP Referer information can be used to extract the new malicious search engine keywords that a hacker may have used to reach the honeypot webpage.
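Extracting a search phrase from a Referer URL can be illustrated as follows. The data mining algorithm itself is not specified in the text, so a simple frequency threshold stands in for it here; the query parameter names in `QUERY_PARAMS`, the threshold `min_count`, and both function names are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, Optional, Set
from urllib.parse import parse_qs, urlsplit

# Query parameters that commonly carry the search terms (an assumption;
# each search engine may use a different parameter name).
QUERY_PARAMS = ("q", "wd", "query")


def keyword_from_referer(referer: str) -> Optional[str]:
    """Return the search phrase embedded in a Referer URL, if any."""
    query = parse_qs(urlsplit(referer).query)
    for param in QUERY_PARAMS:
        values = query.get(param)
        if values:
            return values[0].strip().lower()
    return None


def new_keywords(referers: Iterable[str], known: Set[str], min_count: int = 2) -> Set[str]:
    """Return phrases seen at least min_count times that are not yet known."""
    counts = Counter(k for k in map(keyword_from_referer, referers) if k)
    return {k for k, n in counts.items() if n >= min_count and k not in known}
```

`parse_qs` decodes percent-encoding and `+` signs, so a Referer ending in `?q=inurl%3Aadmin+login` yields the phrase `inurl:admin login`.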
After the new malicious search engine keywords have been extracted, they are added to the malicious keyword database and the method returns to step 101 to construct new honeypot webpages, so that the honeypot webpages are updated dynamically.
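One pass of this feedback loop can be outlined as below. The sketch only shows how the keyword database grows between passes; the three callables are hypothetical placeholders for the page construction, log collection, and keyword extraction described in steps 101 and 103, and step 102 is assumed to happen outside the code.

```python
from typing import Callable, Iterable, Set


def run_cycle(
    keyword_db: Set[str],
    build_pages: Callable[[Set[str]], None],
    collect_malicious_referers: Callable[[], Iterable[str]],
    extract_keywords: Callable[[Iterable[str]], Set[str]],
) -> Set[str]:
    """Run steps 101 to 103 once and return the enlarged keyword database."""
    # Step 101: construct honeypot pages from the current keywords.
    build_pages(keyword_db)
    # Step 102 (getting the pages indexed and ranked) happens outside this code.
    # Step 103: mine the malicious access records for keywords not yet known.
    referers = collect_malicious_referers()
    fresh = extract_keywords(referers) - keyword_db
    # Merge the new keywords; the caller loops back to step 101 with the result.
    return keyword_db | fresh
```

Calling `run_cycle` repeatedly with its own return value reproduces the loop from step 103 back to step 101.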
The preferred embodiments of the present invention have been described in detail above. It should be understood that a person of ordinary skill in the art can make many modifications and variations according to the concept of the invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain on the basis of the prior art through logical analysis, reasoning, or limited experimentation under the concept of this invention shall fall within the scope of protection determined by the claims.

Claims (1)

1. An active honeypot detection method based on search engine keywords, characterized in that it comprises the following steps:
Step (101): automatically construct honeypot webpages from a database of malicious search engine keywords that have been collected and identified;
Step (102): have the constructed honeypot webpages indexed by a search engine by means of search engine ranking optimization, and raise the ranking of the honeypot webpages in the search engine so as to actively attract visits from hackers;
Step (103): use a data mining algorithm to extract new malicious search engine keywords from the access records of the honeypot webpages, merge the extracted new keywords into the malicious search engine keyword database, and return to step (101);
In step (101), the malicious search engine keyword database comprises malicious search engine keywords that target URL paths and malicious search engine keywords that target webpage content;
For the malicious search engine keywords that target URL paths, the honeypot webpages are constructed with an address rewriting technique;
The address rewriting technique uses the Apache HTTP Server engine;
For the malicious search engine keywords that target webpage content, the keywords are searched again on a search engine, and the webpages found are processed and then used as the honeypot webpages;
The search engine ranking optimization comprises registering a high-reputation domain name, adding links, and optimizing webpage content; step (103) also comprises distinguishing normal accesses from malicious attacks in the access records of the honeypot webpages;
The access records of the honeypot webpages are divided into search engine crawlers, normal accesses, and malicious attacks, wherein the search engine crawlers are likewise classified as malicious attacks;
In step (103), all accesses whose HTTP response code is not 200 are also treated as malicious attacks;
The data mining algorithm extracts the new malicious search engine keywords from HTTP Referer information.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310332730.7A CN103440454B (en) 2013-08-01 2013-08-01 A kind of active honeypot detection method based on search engine keywords

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310332730.7A CN103440454B (en) 2013-08-01 2013-08-01 A kind of active honeypot detection method based on search engine keywords

Publications (2)

Publication Number Publication Date
CN103440454A CN103440454A (en) 2013-12-11
CN103440454B true CN103440454B (en) 2016-04-06

Family

ID=49694147

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310332730.7A Expired - Fee Related CN103440454B (en) 2013-08-01 2013-08-01 A kind of active honeypot detection method based on search engine keywords

Country Status (1)

Country Link
CN (1) CN103440454B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C53 Correction of patent for invention or patent application
CB03 Change of inventor or designer information

Inventor after: Zou Futai

Inventor after: Bai Wei

Inventor after: Wang Jiahui

Inventor after: Pan Daoxin

Inventor after: Yi Ping

Inventor before: Zou Futai

Inventor before: Bai Wei

Inventor before: Pan Daoxin

Inventor before: Yi Ping

COR Change of bibliographic data

Free format text: CORRECT: INVENTOR; FROM: ZOU FUTAI BAI WEI PAN DAOXIN YI PING TO: ZOU FUTAI BAI WEI WANG JIAHUI PAN DAOXIN YI PING

C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160406

Termination date: 20180801