CN103685158A - accurate collection method and system based on phishing website propagation - Google Patents


Info

Publication number
CN103685158A
Authority
CN
China
Prior art keywords: website, URL, phishing website, black link, phishing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210324614.6A
Other languages
Chinese (zh)
Inventor
潘建波
彭仁诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Internet Security Software Co Ltd
Shell Internet Beijing Security Technology Co Ltd
Zhuhai Juntian Electronic Technology Co Ltd
Beijing Kingsoft Internet Science and Technology Co Ltd
Original Assignee
Beijing Kingsoft Internet Security Software Co Ltd
Shell Internet Beijing Security Technology Co Ltd
Zhuhai Juntian Electronic Technology Co Ltd
Beijing Kingsoft Internet Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Internet Security Software Co Ltd, Shell Internet Beijing Security Technology Co Ltd, Zhuhai Juntian Electronic Technology Co Ltd, and Beijing Kingsoft Internet Science and Technology Co Ltd
Priority to CN201210324614.6A
Publication of CN103685158A

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the technical field of computer defense, and specifically discloses an accurate collection method and system based on phishing website propagation. The method comprises the following steps: recording, within a preset time range, the URLs of all websites queried at a server side and the number of times each was queried; obtaining the URLs of the websites whose query counts rank within a preset range; detecting whether a black link (a hidden link planted in the page) exists in the webpage corresponding to each URL, and if so, obtaining the URL of the black link; judging whether the website corresponding to the URL of each black link is a phishing website; and if it is, adding its URL to a phishing website database. The system comprises a query recording module, a ranking acquisition module, a black link detection module, a phishing website judgment module, a phishing website acquisition module and a phishing website database, corresponding to the steps of the method. With the method and the system, phishing websites can be collected in a more timely and accurate manner.

Description

Accurate collection method and system based on phishing website propagation
Technical field
The invention belongs to the technical field of computer defense, and specifically relates to an accurate collection method and system based on phishing website propagation.
Background art
Phishing is a form of online fraud in which criminals counterfeit the URL and page content of a genuine website, or exploit vulnerabilities in a genuine website's server software to insert dangerous HTML code into some of its pages. In this way they trick users into revealing private data such as bank or credit card account numbers and passwords, or induce consumers to pay money directly into the fraudster's bank account. This seriously hampers the development of online financial services and e-commerce, harms the public interest, and undermines public confidence in using the Internet.
There are currently two approaches to preventing the harm caused by phishing websites:
1. Detecting phishing websites with a dedicated method or device, for example Chinese patent No. 200910106659, "Detection method and device for phishing websites"; Chinese patent No. 201110172952.8, "Recognition method and device for phishing websites"; and Chinese patent No. 200710072997.1, "Method for guarding against phishing websites based on a gateway or bridge". These schemes all try to derive regularities from the characteristics of phishing websites and then formulate detection rules from them. Such methods or devices may work at first, but as the creators of phishing websites become familiar with the detection rules, the new phishing websites they build easily pass detection. The detection rate of this kind of scheme therefore declines over time, and it gradually loses its protective effect.
2. Building a phishing website database: whenever a phishing website is found it is added to the database, and websites are checked against the database to decide whether they are phishing websites. This approach has a high detection rate and accuracy, but its difficulty lies in collecting phishing websites in a timely manner.
At present there is no satisfactory method for collecting phishing websites. The main workflow is as follows: a user visits a website; the URL of the website is looked up in a local blacklist/whitelist database to decide whether it is a phishing website; a URL that cannot be identified locally is uploaded to a server for identification; a URL that the server also cannot identify is uploaded to a back-end verification system for further identification.
Phishing websites are thus obtained mainly by filtering a massive number of websites and through reports. A large number of new websites appear every day, and each of them may be a phishing website. Judging and filtering every new website one by one is impractical at this scale, so the existing practice is to judge and collect websites either at random or according to predefined rules. The random approach is untargeted and wastes a great deal of effort; collection by predefined rules suffers from the same problem described above, namely that the detection rate for phishing websites declines over time.
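The existing lookup workflow described above can be sketched as a three-tier check. This is only an illustration under stated assumptions: the names `Verdict`, `local_lookup`, `check_url`, `server_lookup` and `verification_queue` are hypothetical and do not appear in the source, and the server is modeled as a plain callable.

```python
from enum import Enum


class Verdict(Enum):
    SAFE = "safe"
    PHISHING = "phishing"
    UNKNOWN = "unknown"


def local_lookup(url, blacklist, whitelist):
    """Tier 1: consult the local blacklist/whitelist database."""
    if url in blacklist:
        return Verdict.PHISHING
    if url in whitelist:
        return Verdict.SAFE
    return Verdict.UNKNOWN


def check_url(url, blacklist, whitelist, server_lookup, verification_queue):
    """Check a URL locally, then at the server, then queue it for verification.

    `server_lookup` is a hypothetical callable standing in for the server-side
    query; `verification_queue` stands in for the back-end verification system.
    """
    verdict = local_lookup(url, blacklist, whitelist)
    if verdict is Verdict.UNKNOWN:
        # Tier 2: the local database could not identify the URL.
        verdict = server_lookup(url)
    if verdict is Verdict.UNKNOWN:
        # Tier 3: neither tier could decide, so hand the URL over for review.
        verification_queue.append(url)
    return verdict
```

Every call that reaches `server_lookup` is a server-side query, which is the event the invention later counts to measure how active a URL is.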
Summary of the invention
To address the above problems, the object of the invention is to provide an accurate collection method and system based on phishing website propagation, so as to collect phishing websites in a more timely and accurate manner.
By studying how phishing websites emerge, the applicant found that, in order to spread a phishing website widely, its creator usually plants links to it, in the form of black links (hidden links), on websites with heavy traffic, so that it propagates further and reaches as many potential victims as possible.
To achieve the above object, and based on this finding, the following technical solution is provided:
An accurate collection method based on phishing website propagation, comprising the following steps:
Recording, within a preset time range, the URLs of all websites queried at the server side and the number of times each was queried;
Obtaining, according to the query counts, the URLs of the websites whose rank falls within a preset range;
Detecting whether a black link exists in the webpage corresponding to each URL and, if so, obtaining the URL of the black link;
Judging whether the website corresponding to the URL of each black link is a phishing website;
If it is a phishing website, adding the URL of the phishing website to the phishing website database.
Further, the preset time is 24 hours.
Further, the preset rank is 1000.
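The recording and ranking steps can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the query log is assumed to be an iterable of `(timestamp, url)` pairs, and the names `count_queries` and `top_ranked_urls` are hypothetical. Only the 24-hour window and the cutoff of 1000 come from the text.

```python
from collections import Counter
from datetime import datetime, timedelta

PRESET_WINDOW = timedelta(hours=24)  # preset time range named in the text
PRESET_RANK = 1000                   # preset rank named in the text


def count_queries(query_log, now, window=PRESET_WINDOW):
    """Count how often each URL was queried in the window ending at `now`.

    `query_log` is an assumed iterable of (timestamp, url) pairs recorded
    by the server that answers phishing lookups.
    """
    start = now - window
    counts = Counter()
    for timestamp, url in query_log:
        if start <= timestamp <= now:
            counts[url] += 1
    return counts


def top_ranked_urls(counts, rank=PRESET_RANK):
    """Return the URLs whose query counts rank within the top `rank`."""
    return [url for url, _ in counts.most_common(rank)]
```

Both constants are parameters so that the window and the cutoff can be adjusted to the server's processing capacity.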
An accurate collection system based on phishing website propagation, comprising:
A query recording module, for recording, within a preset time range, the URLs of all websites queried at the server side and the number of times each was queried;
A ranking acquisition module, for obtaining, according to the query counts, the URLs of the websites whose rank falls within a preset range;
A black link detection module, for detecting whether a black link exists in the webpage corresponding to each URL and, if so, obtaining the URL of the black link;
A phishing website judgment module, for judging whether the website corresponding to the URL of each black link is a phishing website and, if it is, starting the phishing website acquisition module;
A phishing website acquisition module, for adding the URL of the phishing website to the phishing website database;
A phishing website database, for storing the URL data of phishing websites.
Further, the preset time is 24 hours.
Further, the preset rank is 1000.
Based on the way phishing websites propagate, the invention adopts a targeted collection strategy: collection is focused on the URLs of black links found in the pages of active URLs (those queried many times at the server side), which greatly narrows the scope of collection. Because the number of websites within this scope is not very large, each one can be judged quickly and accurately, and any phishing website found is added to the phishing website database, completing the collection.
Compared with existing techniques for collecting phishing websites, the invention is therefore more targeted and can collect newly emerging phishing websites in a more timely and accurate manner, laying a foundation for further improving network security.
Brief description of the drawings
The drawings described here are provided to aid further understanding of the invention and form part of this application; they do not unduly limit the invention. In the drawings:
Fig. 1 is a flow chart of the method of the invention;
Fig. 2 is a block diagram of the system of the invention.
Embodiment
As shown in Fig. 1, this embodiment discloses an accurate collection method based on phishing website propagation, comprising the following steps:
Step 1: within a preset time range, record the URLs of all websites queried at the server side and the number of times each was queried. The server in this step is the existing server used to check whether the website a user is currently visiting is a phishing website. The purpose of this step is to record how active each URL queried at the server side is during a given period. The preset time can be set according to the processing capacity of the actual server, as long as it is a reasonable period over which highly active websites can be identified reliably; this embodiment uses 24 hours;
Step 2: according to the query counts, obtain the URLs of the websites whose rank falls within a preset range. The purpose of this step is to find the most active URLs, which are also the websites on which phishing websites most like to be parasitic, so that collection is targeted. This embodiment selects the URLs of the websites ranked in the top 1000 by query count; more or fewer ranked websites can of course be processed depending on the processing capacity of the actual server;
Step 3: detect whether a black link exists in the webpage corresponding to each URL and, if so, obtain the URL of the black link. Any existing computer-based method can be used, provided it can detect black links; for example, detection can be assisted by the SeoQuake plug-in;
Step 4: judge whether the website corresponding to the URL of each black link is a phishing website. Any existing computer-based means can be used, for example the detection method and device for phishing websites disclosed in Chinese patent No. 200910106559.1;
Step 5: if it is a phishing website, add its URL to the phishing website database, thereby completing the targeted collection of phishing websites.
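The source leaves the black link detection technique of Step 3 open. As one possible illustration, the sketch below uses a simple heuristic that is an assumption of this example, not something the patent specifies: an off-site `<a>` element counts as a black link when it, or an enclosing element, carries an inline `display:none` or `visibility:hidden` style. The names `HiddenLinkParser` and `find_black_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
HIDING_MARKERS = ("display:none", "visibility:hidden")  # assumed heuristic


def _is_hiding(attrs):
    """True if the inline style of an element hides its content."""
    style = (dict(attrs).get("style") or "").replace(" ", "").lower()
    return any(marker in style for marker in HIDING_MARKERS)


class HiddenLinkParser(HTMLParser):
    """Collect off-site links that sit inside visually hidden markup."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.page_host = urlparse(page_url).hostname
        self.stack = []          # (tag, hides_content) for each open element
        self.hidden_links = []

    def handle_starttag(self, tag, attrs):
        hiding = _is_hiding(attrs)
        inside_hidden = any(hides for _, hides in self.stack)
        if tag == "a" and (hiding or inside_hidden):
            href = dict(attrs).get("href")
            if href:
                target = urljoin(self.page_url, href)
                host = urlparse(target).hostname
                if host and host != self.page_host:
                    self.hidden_links.append(target)
        if tag not in VOID_TAGS:
            self.stack.append((tag, hiding))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed elements.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


def find_black_links(page_url, html):
    """Return the URLs of hidden off-site links found in `html`."""
    parser = HiddenLinkParser(page_url)
    parser.feed(html)
    return parser.hidden_links
```

Links hidden through external stylesheets, off-screen positioning or scripts are not caught by this heuristic; a production detector would need to cover those cases as well.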
This embodiment also discloses an accurate collection system based on phishing website propagation, comprising:
A query recording module 1, for recording, within a preset time range, the URLs of all websites queried at the server side and the number of times each was queried, the preset time being 24 hours;
A ranking acquisition module 2, for obtaining, according to the query counts, the URLs of the websites whose rank falls within a preset range, the preset rank being 1000;
A black link detection module 3, for detecting whether a black link exists in the webpage corresponding to each URL and, if so, obtaining the URL of the black link;
A phishing website judgment module 4, for judging whether the website corresponding to the URL of each black link is a phishing website and, if it is, starting the phishing website acquisition module;
A phishing website acquisition module 5, for adding the URL of the phishing website to the phishing website database;
A phishing website database 6, for storing the URL data of phishing websites.
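How the modules hand data to one another can be sketched as below. This is an illustration under assumptions: SQLite stands in for the phishing website database, and the page fetcher, black link detector and phishing classifier are passed in as callables because the source leaves their implementation to existing techniques. The names `PhishingDatabase` and `collect` are hypothetical.

```python
import sqlite3
from collections import Counter


class PhishingDatabase:
    """Phishing website database (module 6), backed here by SQLite."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS phishing_urls (url TEXT PRIMARY KEY)"
        )

    def add(self, url):
        # Module 5: INSERT OR IGNORE keeps re-collected URLs from duplicating.
        with self.conn:
            self.conn.execute(
                "INSERT OR IGNORE INTO phishing_urls (url) VALUES (?)", (url,)
            )

    def contains(self, url):
        row = self.conn.execute(
            "SELECT 1 FROM phishing_urls WHERE url = ?", (url,)
        ).fetchone()
        return row is not None


def collect(query_counts, fetch_page, find_black_links, is_phishing, db, rank=1000):
    """Run modules 2 to 5 over query counts already recorded by module 1.

    `fetch_page`, `find_black_links` and `is_phishing` are assumed callables
    supplied by the caller.
    """
    collected = []
    for url, _ in Counter(query_counts).most_common(rank):  # module 2
        html = fetch_page(url)
        for link in find_black_links(url, html):             # module 3
            if is_phishing(link):                             # module 4
                db.add(link)                                  # module 5
                collected.append(link)
    return collected
```

Only links that the judgment step accepts reach the database, so hidden links that point to ordinary spam rather than phishing pages are discarded.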
With the above method or system, the collection of phishing websites can be focused on a small scope, namely the black links found in highly active URLs, so that newly emerging phishing websites can be collected in a more timely and accurate manner, laying a foundation for further improving network security.
This embodiment can also record the industry to which each website belongs while recording its query count (the industry can be judged from an industry website library, the website name, the domain name suffix, and so on), and then exclude some websites or monitor them more closely depending on the industry. For example, government and education websites usually have heavy traffic but are unlikely to contain black links, whereas some entertainment websites are much more likely to contain them. Taking industry characteristics into account when collecting phishing websites can therefore achieve better results with less effort.
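The industry-based refinement can be sketched as a filter and reorder applied to the ranked URLs. The suffix table, the `known_sites` lookup (standing in for an industry website library) and the function names are hypothetical; only the idea of excluding government and education sites and prioritizing entertainment sites comes from the text.

```python
from urllib.parse import urlparse

# Hypothetical suffix-to-industry table for illustration only.
SUFFIX_INDUSTRY = {
    ".gov.cn": "government",
    ".gov": "government",
    ".edu.cn": "education",
    ".edu": "education",
}
EXCLUDED = {"government", "education"}   # unlikely to carry black links
PRIORITY = {"entertainment"}             # more likely to carry black links


def industry_of(url, known_sites=None):
    """Classify a URL by an industry website library, then by domain suffix."""
    host = urlparse(url).hostname or ""
    if known_sites and host in known_sites:
        return known_sites[host]
    for suffix, industry in SUFFIX_INDUSTRY.items():
        if host.endswith(suffix):
            return industry
    return "unknown"


def order_for_monitoring(urls, known_sites=None, excluded=EXCLUDED, priority=PRIORITY):
    """Drop excluded industries and move priority industries to the front."""
    kept = [u for u in urls if industry_of(u, known_sites) not in excluded]
    # Stable sort: priority industries first, original ranking kept otherwise.
    return sorted(kept, key=lambda u: industry_of(u, known_sites) not in priority)
```

Because the sort is stable, URLs within the same group keep the order given by their query-count ranking.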
The preferred embodiments of the invention have been described in detail above. It should be understood that a person of ordinary skill in the art can make many modifications and variations according to the concept of the invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain on the basis of the prior art through logical analysis, reasoning or limited experimentation according to the concept of the invention shall fall within the scope of protection defined by the claims.

Claims (6)

1. An accurate collection method based on phishing website propagation, characterized by comprising the following steps:
recording, within a preset time range, the URLs of all websites queried at the server side and the number of times each was queried;
obtaining, according to the query counts, the URLs of the websites whose rank falls within a preset range;
detecting whether a black link exists in the webpage corresponding to each URL and, if so, obtaining the URL of the black link;
judging whether the website corresponding to the URL of each black link is a phishing website;
if it is a phishing website, adding the URL of the phishing website to a phishing website database.
2. The collection method according to claim 1, characterized in that:
the preset time is 24 hours.
3. The collection method according to claim 1, characterized in that:
the preset rank is 1000.
4. An accurate collection system based on phishing website propagation, characterized by comprising:
a query recording module, for recording, within a preset time range, the URLs of all websites queried at the server side and the number of times each was queried;
a ranking acquisition module, for obtaining, according to the query counts, the URLs of the websites whose rank falls within a preset range;
a black link detection module, for detecting whether a black link exists in the webpage corresponding to each URL and, if so, obtaining the URL of the black link;
a phishing website judgment module, for judging whether the website corresponding to the URL of each black link is a phishing website and, if it is, starting the phishing website acquisition module;
a phishing website acquisition module, for adding the URL of the phishing website to the phishing website database;
a phishing website database, for storing the URL data of phishing websites.
5. The collection system according to claim 4, characterized in that:
the preset time is 24 hours.
6. The collection system according to claim 4, characterized in that:
the preset rank is 1000.
CN201210324614.6A 2012-09-04 2012-09-04 accurate collection method and system based on phishing website propagation Pending CN103685158A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210324614.6A CN103685158A (en) 2012-09-04 2012-09-04 accurate collection method and system based on phishing website propagation

Publications (1)

Publication Number Publication Date
CN103685158A true CN103685158A (en) 2014-03-26

Family

ID=50321491

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210324614.6A Pending CN103685158A (en) 2012-09-04 2012-09-04 accurate collection method and system based on phishing website propagation

Country Status (1)

Country Link
CN (1) CN103685158A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101534306A (en) * 2009-04-14 2009-09-16 深圳市腾讯计算机系统有限公司 Detecting method and a device for fishing website
CN101547197A (en) * 2009-04-30 2009-09-30 珠海金山软件股份有限公司 A URL washing device and a washing method
US20100064366A1 (en) * 2008-09-11 2010-03-11 Alibaba Group Holding Limited Request processing in a distributed environment
US20100095375A1 (en) * 2008-10-14 2010-04-15 Balachander Krishnamurthy Method for locating fraudulent replicas of web sites
CN102404741A (en) * 2011-11-30 2012-04-04 中国联合网络通信集团有限公司 Method and device for detecting abnormal online of mobile terminal
CN102436563A (en) * 2011-12-30 2012-05-02 奇智软件(北京)有限公司 Method and device for detecting page tampering
CN102446255A (en) * 2011-12-30 2012-05-09 奇智软件(北京)有限公司 Method and device for detecting page tamper
CN102523210A (en) * 2011-12-06 2012-06-27 中国科学院计算机网络信息中心 Phishing website detection method and device
CN102591965A (en) * 2011-12-30 2012-07-18 奇智软件(北京)有限公司 Method and device for detecting black chain
CN102638448A (en) * 2012-02-27 2012-08-15 珠海市君天电子科技有限公司 Method for judging phishing websites based on non-content analysis

Similar Documents

Publication Publication Date Title
Ding Applying weighted PageRank to author citation networks
WO2014036801A1 (en) Method for detecting phishing website without depending on sample
CN102567407B (en) Method and system for collecting forum reply increment
CN103618696B (en) Method and server for processing cookie information
CN104077396A (en) Method and device for detecting phishing website
CN102592067A (en) Webpage recognition method, device and system
CN103905372A (en) Method and device for removing false alarm of phishing website
CN102957664A (en) Method and device for identifying phishing websites
CN111756724A (en) Detection method, device and equipment for phishing website and computer readable storage medium
TWI474199B (en) A method of increasing search engine optimization performance of a social media webpage of an entity
CN106230835B (en) Method based on Nginx log analysis and the IPTABLES anti-malicious access forwarded
CN102891861B (en) Client-based phishing website detection method and device
CN104079531A (en) Hotlinking detection method, system and device
Rogers et al. National Web studies: The case of Iran online
CN103049456B (en) A kind of method and device screening webpage
Le Pochat et al. Evaluating the long-term effects of parameters on the characteristics of the tranco top sites ranking
TWI467399B (en) Automated system and method for analyzing backlinks
CN108270754B (en) Detection method and device for phishing website
CN104394158A (en) Information security filtering method
CN104753758B (en) A kind of information attribute recognition methods and device
CN103177084A (en) Data mining method considering data reliability
WO2015149550A1 (en) Method and apparatus for determining grades of links within website
CN103685157A (en) Method and system for collecting phishing websites based on payment
CN107404497A (en) A kind of method that WebShell is detected in massive logs
CN116861128A (en) Website risk assessment method and device based on simulated access and storable medium

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140326