WO2013152610A1 - Phishing website detection method and device - Google Patents

Phishing website detection method and device

Info

Publication number
WO2013152610A1
Authority
WO
WIPO (PCT)
Prior art keywords
website
detected
phishing
feature value
link
Prior art date
Application number
PCT/CN2012/087762
Inventor
洪博
王利明
肖娅丽
Original Assignee
中国科学院计算机网络信息中心
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院计算机网络信息中心
Publication of WO2013152610A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/51 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems at application loading time, e.g. accepting, rejecting, starting or inhibiting executable software based on integrity or source reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1483 Countermeasures against malicious traffic service impersonation, e.g. phishing, pharming or web spoofing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2119 Authenticating web pages, e.g. with suspicious links

Definitions

  • The present invention relates to information processing technologies, and in particular to a phishing website detection method and device, and belongs to the field of network security technologies.
  • Background
  • Phishing is a form of cybercrime in which spam emails or similar means lure the recipient to a phishing website carefully designed to closely resemble the website of a target organization, and the personal sensitive information that the recipient enters on that website is then harvested. With the spread and development of e-commerce and Internet applications, the losses caused by phishing have become increasingly serious.
  • Current phishing website detection methods mainly rely on blacklist filtering.
  • Blacklist filtering depends on continuously updating a blacklist of all known phishing websites and/or user-reported websites. When a suspicious website is checked, information such as its domain name is looked up in the blacklist to determine whether the suspicious website is a phishing website.
  • Detection of this kind is passive: it usually takes effect only after users have already been harmed by the phishing website, so it lags behind the attack. How to effectively detect phishing websites that are not recorded in a blacklist, that is, how to achieve active detection of phishing websites and thereby avoid or reduce user losses, has therefore become the focus of phishing website detection.
  • Summary of the invention
  • In view of the defects in the prior art, the present invention provides a phishing website detection method and device for implementing active detection of phishing websites.
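  • According to one aspect of the present invention, a phishing website detection method is provided, including:
  • obtaining a website to be detected;
  • if it is detected, according to the domain name of the website to be detected, that a phishing target website of the website to be detected exists, obtaining the in-site page links of the website to be detected; and if it is detected that the in-site page links include a login box link, determining that the website to be detected is a phishing website.
  • Further, determining that the website to be detected is a phishing website if the in-site page links include a login box link specifically includes: if it is detected that the in-site page links include a login box link, obtaining a feature vector of the website to be detected; and performing phishing website detection on the website to be detected according to that feature vector.
  • Further, obtaining the feature vector of the website to be detected specifically includes obtaining a first feature value, a second feature value, a third feature value, and/or a fourth feature value, where:
  • Obtaining the first feature value of the website to be detected specifically includes:
  • obtaining the identity information keywords of the phishing target website; detecting whether the title and/or copyright information of the website to be detected includes the identity information keywords; if so, the first feature value is 1; if not, the first feature value is 0;
  • Obtaining the second feature value of the website to be detected specifically includes: obtaining the off-site page links of the website to be detected and the total number of links of the website to be detected, the total number of links being the sum of the number of off-site page links and the number of in-site page links; and obtaining a first ratio of the number of off-site page links to the total number of links, and using the first ratio as the second feature value;
  • Obtaining the third feature value of the website to be detected specifically includes: obtaining the suspicious page links of the website to be detected; and obtaining a second ratio of the number of suspicious page links to the total number of links, and using the second ratio as the third feature value;
  • Obtaining the fourth feature value of the website to be detected specifically includes: obtaining the registration duration of the website to be detected; if the registration duration is not greater than a preset duration, the fourth feature value is 1; if not, the fourth feature value is 0.
  • Further, obtaining the suspicious page links of the website to be detected specifically includes: if it is detected that an off-site page link and/or an in-site page link includes the domain name keyword of the phishing target website, or that an off-site page link and/or an in-site page link is a uniform resource locator in the form of an internet protocol address, determining that the off-site page link and/or the in-site page link is a suspicious page link of the website to be detected.
  • Further, performing phishing website detection on the website to be detected according to its feature vector specifically includes:
  • assigning a corresponding weight to the first feature value, the second feature value, the third feature value, and/or the fourth feature value, and obtaining the accumulated value of the products of the first feature value, the second feature value, the third feature value, and/or the fourth feature value and their corresponding weights; and if the accumulated value is greater than a preset threshold, determining that the website to be detected is a phishing website.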
  • According to another aspect of the present invention, a phishing website detection device is also provided, including: a first processing module, configured to obtain a website to be detected;
  • a second processing module, configured to obtain the in-site page links of the website to be detected if it is detected, according to the domain name of the website to be detected, that a phishing target website of the website to be detected exists; and a third processing module, configured to determine that the website to be detected is a phishing website if it is detected that the in-site page links include a login box link.
  • Further, the device also includes:
  • a fourth processing module, configured to obtain a feature vector of the website to be detected if it is detected that the in-site page links include a login box link; and
  • a fifth processing module, configured to perform phishing website detection on the website to be detected according to the feature vector of the website to be detected.
  • Further, the fourth processing module includes a first processing unit, a second processing unit, a third processing unit, and/or a fourth processing unit, where: the first processing unit is configured to obtain the identity information keywords of the phishing target website, and to detect whether the title and/or copyright information of the website to be detected includes the identity information keywords; if so, the first feature value is 1; if not, the first feature value is 0;
  • the second processing unit is configured to obtain the off-site page links of the website to be detected and the total number of links of the website to be detected, the total number of links being the sum of the number of off-site page links and the number of in-site page links, and to obtain a first ratio of the number of off-site page links to the total number of links and use the first ratio as the second feature value;
  • the third processing unit is configured to obtain the suspicious page links of the website to be detected, and to obtain a second ratio of the number of suspicious page links to the total number of links and use the second ratio as the third feature value;
  • the fourth processing unit is configured to obtain the registration duration of the website to be detected; if the registration duration is not greater than a preset duration, the fourth feature value is 1; if not, the fourth feature value is 0.
  • Further, the third processing unit is further configured to determine that an off-site page link and/or an in-site page link is a suspicious page link of the website to be detected if it is detected that the link includes the domain name keyword of the phishing target website, or that the link is a uniform resource locator in the form of an internet protocol address.
  • Further, the fifth processing module includes: a fifth processing unit, configured to assign a corresponding weight to the first feature value, the second feature value, the third feature value, and/or the fourth feature value, and to obtain the accumulated value of the products of those feature values and their corresponding weights; and
  • a sixth processing unit, configured to determine that the website to be detected is a phishing website if the accumulated value is greater than a preset threshold.
  • According to the phishing website detection method and device provided by the present invention, whether the website to be detected has a phishing tendency is first judged from its domain name, and whether it is a phishing website is then determined from whether its in-site page links include a login box. Phishing websites not recorded in a blacklist can therefore be detected from the website's own characteristics, which achieves active detection of phishing websites.
  • Brief description of the drawings
  • FIG. 1 is a schematic flowchart of a phishing website detection method according to an embodiment of the present invention.
  • FIG. 2 is a schematic structural diagram of a phishing website detection device according to an embodiment of the present invention.
  • Detailed description
  • The phishing website detection method according to an embodiment of the present invention is executed, for example, by a phishing website detection device deployed in a network.
  • As shown in FIG. 1, the method includes the following steps:
  • Step S101: obtain a website to be detected;
  • Step S102: if it is detected, according to the domain name of the website to be detected, that a phishing target website of the website to be detected exists, obtain the in-site page links of the website to be detected;
  • Step S103: if it is detected that the in-site page links include a login box link, determine that the website to be detected is a phishing website.
  • Specifically, after the domain name of the website to be detected is obtained, the method first checks whether the website to be detected has a phishing target website, in order to judge whether it may be mounting a phishing attack on some known legitimate website.
  • The existence of a phishing target website can be detected in several ways. For example, the domain name of the website to be detected is compared for similarity with the domain names of well-known websites that are frequently targeted by phishing, and the similarity value is used to judge whether the domain name of the website to be detected is a counterfeit of a well-known website's domain name.
  • If so, that well-known website is regarded as the phishing target website of the website to be detected; that is, the website to be detected may be phishing against that well-known website.
  • Alternatively, it can be checked whether the domain name of the website to be detected includes the domain name keyword of a well-known website; if so, that well-known website is regarded as the phishing target website of the website to be detected.
  • The well-known websites used in this detection are, for example, websites stored in a protected domain name feature library, which includes, for example, websites known to have suffered phishing attacks and websites with high click volumes.
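  • The two checks described above (domain name similarity and domain keyword containment) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the contents of `PROTECTED_DOMAINS`, the use of `difflib.SequenceMatcher` as the similarity measure, the 0.8 threshold, and the function name are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical protected domain name feature library:
# domain keyword -> domain name of the well-known website.
PROTECTED_DOMAINS = {
    "taobao": "www.taobao.com",
    "qq": "www.qq.com",
}

def find_phishing_target(domain, protected=PROTECTED_DOMAINS, threshold=0.8):
    """Return the well-known domain that `domain` appears to imitate, or None."""
    domain = domain.lower()
    for keyword, known in protected.items():
        if domain == known:
            continue  # the well-known website itself is not a counterfeit
        # Check 1: similarity between the two domain names (assumed measure).
        similarity = SequenceMatcher(None, domain, known).ratio()
        # Check 2: the domain contains the well-known website's domain keyword.
        if similarity >= threshold or keyword in domain:
            return known
    return None
```

  • A return value of `None` corresponds to the case where no phishing target website exists, so the website to be detected would not be examined further.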
  • If no phishing target website is found, the website to be detected shows no tendency toward phishing and is determined not to be a phishing website. If a phishing target website is found, the whole website to be detected is traversed, all of its in-site page links are extracted without repetition, and the subsequent detection is performed.
  • Based on the extracted in-site page links, it is detected whether the website to be detected includes a login box that requires the user to enter private information.
  • Specifically, all in-site page links of the website to be detected are traversed, and each is checked for a login box. One way to do so is to check whether the page corresponding to an in-site page link contains a <form>...</form> form element. If not, the in-site page link is determined not to include a login box. If so, the values in the form element are further checked for words such as "账号" (account), "密码" (password), and "登陆" (login); if any are present, the in-site page link is determined to include a login box.
  • Because a phishing website necessarily contains a login box that requires the user to enter private information, the outcome of this check over all in-site page links is interpreted as follows: if the in-site page links do not include a login box link, the website to be detected can be determined not to be a phishing website; if they do include a login box link, the website to be detected can be determined to be a phishing website, or a suspected phishing website that requires further detection.
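  • The login box check can be sketched with Python's standard `html.parser` module. This is only an illustration: the class and function names are hypothetical, the English keywords are an added assumption alongside the three Chinese keywords named in the text, and treating any single keyword as sufficient is one reading of "if any are present".

```python
from html.parser import HTMLParser

# Keywords from the description, plus English equivalents added as an assumption.
LOGIN_KEYWORDS = ("账号", "密码", "登陆", "account", "password", "login")

class LoginFormFinder(HTMLParser):
    """Collects the attribute values and text found inside <form> elements."""

    def __init__(self):
        super().__init__()
        self.form_depth = 0   # > 0 while the parser is inside a <form>
        self.form_text = []   # attribute values and text seen inside forms

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.form_depth += 1
        if self.form_depth:
            self.form_text.extend(value for _, value in attrs if value)

    def handle_endtag(self, tag):
        if tag == "form" and self.form_depth:
            self.form_depth -= 1

    def handle_data(self, data):
        if self.form_depth:
            self.form_text.append(data)

def has_login_box(html):
    """Return True if a form on the page mentions any login keyword."""
    finder = LoginFormFinder()
    finder.feed(html)
    text = " ".join(finder.form_text).lower()
    return any(keyword in text for keyword in LOGIN_KEYWORDS)
```

  • A page without any form element yields an empty `form_text`, so it is reported as having no login box regardless of what the rest of the page says.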
  • According to the phishing website detection method of the above embodiment, whether the website to be detected has a phishing tendency is first judged from its domain name, and when it does, whether it is a phishing website is further determined from whether its in-site page links include a login box. Phishing websites not recorded in a blacklist can therefore be detected from the website's own characteristics, which achieves active detection of phishing websites.
  • With this method, a phishing website can be detected actively before users are harmed by it, which effectively avoids or reduces user losses.
  • Further, in the method of the above embodiment, determining that the website to be detected is a phishing website if the in-site page links include a login box link specifically includes: if it is detected that the in-site page links include a login box link, obtaining a feature vector of the website to be detected; and performing phishing website detection on the website to be detected according to that feature vector.
  • The feature vector of the website to be detected may include one or more feature values, each representing a different feature or piece of information about the website to be detected.
  • Detection based on the feature vector therefore makes it possible, after a login box link has been found among the in-site page links, to draw on further features or information about the website to be detected, which improves the accuracy of phishing website detection.
  • Further, in the method of the above embodiment, the feature vector of the website to be detected includes the first feature value, the second feature value, the third feature value, and/or the fourth feature value.
  • Accordingly, obtaining the feature vector of the website to be detected specifically includes obtaining the first feature value, the second feature value, the third feature value, and/or the fourth feature value. The feature vector is written, for example, as Vector{V1, V2, V3, V4}.
  • More specifically, obtaining the first feature value V1 of the website to be detected specifically includes: obtaining the identity information keywords of the phishing target website; detecting whether the title and/or copyright information of the website to be detected includes those keywords; if so, the first feature value V1 is 1; if not, the first feature value V1 is 0.
  • The keywords that identify the phishing target website are obtained, for example, from the text content of parts such as its "title" or "copyright".
  • For example, the identity information keywords of the Tencent website include "腾讯", "Tencent", and "qq".
  • After the identity information keywords of the phishing target website are obtained, the text content of the "title" and "copyright" of the website to be detected is traversed to detect whether it includes those keywords. If it does, V1 = 1, indicating that the identity of the website to be detected matches that of the phishing target website; if it does not, V1 = 0, indicating that the identities do not match.
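  • A minimal sketch of the first feature value follows. The function name is hypothetical, and so is the way copyright text is located: the description does not say how the copyright notice is found, so the sketch assumes it is any text run containing "copyright", "©", or "版权".

```python
import re

def first_feature_value(page_html, identity_keywords):
    """Return 1 if the page title or copyright text names the target, else 0."""
    title = re.search(r"<title[^>]*>(.*?)</title>", page_html, re.I | re.S)
    # Assumption: copyright notices are recognised by these three markers.
    copyright_runs = re.findall(r"[^<>]*(?:copyright|©|版权)[^<>]*", page_html, re.I)
    text = " ".join([title.group(1) if title else ""] + copyright_runs).lower()
    return int(any(keyword.lower() in text for keyword in identity_keywords))
```

  • Body text outside the title and copyright notice is deliberately ignored, so a page that merely mentions the target website elsewhere does not receive V1 = 1.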
  • Obtaining the second feature value V2 of the website to be detected specifically includes: obtaining the off-site page links of the website to be detected and the total number of links of the website to be detected, the total number of links being the sum of the number of off-site page links and the number of in-site page links; and obtaining a first ratio of the number of off-site page links to the total number of links, and using the first ratio as the second feature value V2.
  • Obtaining the third feature value V3 of the website to be detected specifically includes: obtaining the suspicious page links of the website to be detected; and obtaining a second ratio of the number of suspicious page links to the total number of links, and using the second ratio as the third feature value V3.
  • Whether a given link of the website to be detected is a suspicious page link is determined, for example, as follows: if it is detected that an off-site page link and/or an in-site page link includes the domain name keyword of the phishing target website, or that an off-site page link and/or an in-site page link is a uniform resource locator (URL) in the form of an internet protocol (IP) address, the off-site page link and/or the in-site page link is determined to be a suspicious page link of the website to be detected.
  • Specifically, all off-site page links and in-site page links of the website to be detected are examined. For each link, it is judged whether its URL includes the domain name keyword of the phishing target website (for example, the domain name keyword of the Taobao website "www.taobao.com" is "taobao") and whether its URL is in IP form, that is, whether it is written in a format such as "210.46.102.141". If the URL includes the domain name keyword of the phishing target website and/or is in IP form, the link is determined to be a suspicious page link of the website to be detected.
  • Conversely, if the URL of the link does not include the domain name keyword of the phishing target website and is not in IP form, the link is determined to be a normal page link of the website to be detected.
  • This judging method has two benefits. On the one hand, it detects links that point to the phishing target website and suspicious links that use the domain name keyword of the phishing target website. On the other hand, because reputable websites do not usually use an IP address as a URL, it also detects low-reputation suspicious links whose URL is in IP form.
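  • The second and third feature values can be sketched together, since both are ratios over the same link set. The sketch assumes that all links are absolute URLs and that a link is in-site exactly when its host name equals the domain of the website to be detected; the function names and the IPv4-only pattern are likewise assumptions.

```python
import re
from urllib.parse import urlparse

# Dotted-quad host such as "210.46.102.141" (IPv4 only, an assumption).
IP_HOST = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def is_suspicious(url, target_keyword):
    """Suspicious if the URL contains the target's domain keyword or uses an IP host."""
    host = urlparse(url).hostname or ""
    return target_keyword in url.lower() or bool(IP_HOST.match(host))

def link_feature_values(site_domain, links, target_keyword):
    """Return (V2, V3): the off-site link ratio and the suspicious link ratio."""
    if not links:
        return 0.0, 0.0  # no links at all: both ratios are taken as zero
    off_site = [u for u in links if (urlparse(u).hostname or "") != site_domain]
    suspicious = [u for u in links if is_suspicious(u, target_keyword)]
    return len(off_site) / len(links), len(suspicious) / len(links)
```

  • Both ratios share the same denominator, the total number of in-site and off-site links, as the description specifies.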
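  • Obtaining the fourth feature value V4 of the website to be detected specifically includes: obtaining the registration duration of the website to be detected; if the registration duration is not greater than a preset duration, the fourth feature value V4 is 1; if not, the fourth feature value V4 is 0.
  • For example, the "WHOIS" database is queried to check whether the domain name of the website to be detected has been registered for more than one year. According to statistics, more than 95% of phishing domain names have been registered for less than one year, so checking the registration time can reduce false positives. If the registration time is less than or equal to one year, V4 = 1, indicating that the site is suspected of being a phishing website; if it is greater than one year, V4 = 0.
  • In addition, a decision model is generated in advance, for example from samples of normal websites and phishing websites. The feature vector obtained by the above process is the input of the decision model, and the decision model produces, from the feature values in the feature vector, a determination of whether the website to be detected is a phishing website.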
  • Further, in the method of the above embodiment, performing phishing website detection on the website to be detected according to its feature vector specifically includes:
  • assigning a corresponding weight to the first feature value V1, the second feature value V2, the third feature value V3, and/or the fourth feature value V4, and obtaining the accumulated value of the products of those feature values and their corresponding weights; and
  • if the accumulated value is greater than a preset threshold, determining that the website to be detected is a phishing website.
  • Specifically, for example, the first feature value V1 is assigned a first weight value a1, the second feature value V2 a second weight value a2, the third feature value V3 a third weight value a3, and the fourth feature value V4 a fourth weight value a4. The accumulated value of the feature vector is then a1·V1 + a2·V2 + a3·V3 + a4·V4.
  • The accumulated value is compared with a preset threshold. If it is greater than the preset threshold, the website to be detected is determined to be a phishing website; if it is less than or equal to the preset threshold, the website to be detected is determined not to be a phishing website.
  • The weight values a1, a2, a3, and a4 are, for example, all greater than 0 and less than or equal to 1. The weight values and the preset threshold are, for example, all provided by the decision model, and their specific values can be obtained statistically from samples of normal websites and phishing websites.
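  • The weighted decision can be sketched as below. The function name is hypothetical, and the weights and threshold in the usage note are made-up numbers chosen only to exercise the arithmetic; the description leaves the real values to a decision model fitted on samples.

```python
def is_phishing(features, weights, threshold):
    """Compare the weighted sum a1*V1 + a2*V2 + a3*V3 + a4*V4 with the threshold.

    Returns a (verdict, score) pair; the verdict is True only when the
    score is strictly greater than the threshold.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    score = sum(a * v for a, v in zip(weights, features))
    return score > threshold, score
```

  • With hypothetical weights (0.4, 0.2, 0.2, 0.2) and a threshold of 0.5, the feature vector (1, 0.6, 0.4, 1) accumulates to about 0.8 and is flagged, while (0, 0.1, 0, 0) accumulates to about 0.02 and is not. A score exactly equal to the threshold is not flagged, matching the "less than or equal to" branch in the text.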
  • The phishing website detection method of the above embodiment combines several pieces of information about the website to be detected, such as its identity features, link features, and registration time, to assist in determining whether it is a phishing website, which achieves fast and reliable active phishing website detection.
  • FIG. 2 is a schematic structural diagram of a phishing website detection device according to an embodiment of the present invention. As shown in FIG. 2, the phishing website detection device includes:
  • a first processing module 21, configured to obtain a website to be detected;
  • a second processing module 22, configured to obtain the in-site page links of the website to be detected if it is detected, according to the domain name of the website to be detected, that a phishing target website of the website to be detected exists; and
  • a third processing module 23, configured to determine that the website to be detected is a phishing website if it is detected that the in-site page links include a login box link.
  • The specific process by which the phishing website detection device of this embodiment performs phishing website detection is the same as the phishing website detection method of the above embodiment and is therefore not repeated here.
  • The phishing website detection device of this embodiment first judges from the domain name of the website to be detected whether it has a phishing tendency, and when it does, further determines whether it is a phishing website from whether its in-site page links include a login box. Phishing websites not recorded in a blacklist can therefore be detected from the website's own characteristics, which achieves active detection of phishing websites.
  • With this device, a phishing website can be detected actively before users are harmed by it, which effectively avoids or reduces user losses.
  • Further, the phishing website detection device of the above embodiment also includes:
  • a fourth processing module, configured to obtain a feature vector of the website to be detected if it is detected that the in-site page links include a login box link; and
  • a fifth processing module, configured to perform phishing website detection on the website to be detected according to the feature vector of the website to be detected.
  • Further, in the device of the above embodiment, the fourth processing module includes a first processing unit, a second processing unit, a third processing unit, and/or a fourth processing unit, where:
  • the first processing unit is configured to obtain the identity information keywords of the phishing target website, and to detect whether the title and/or copyright information of the website to be detected includes the identity information keywords; if so, the first feature value is 1; if not, the first feature value is 0;
  • the second processing unit is configured to obtain the off-site page links of the website to be detected and the total number of links of the website to be detected, the total number of links being the sum of the number of off-site page links and the number of in-site page links, and to obtain a first ratio of the number of off-site page links to the total number of links and use the first ratio as the second feature value;
  • the third processing unit is configured to obtain the suspicious page links of the website to be detected, and to obtain a second ratio of the number of suspicious page links to the total number of links and use the second ratio as the third feature value;
  • the fourth processing unit is configured to obtain the registration duration of the website to be detected; if the registration duration is not greater than a preset duration, the fourth feature value is 1; if not, the fourth feature value is 0.
  • Further, in the device of the above embodiment, the third processing unit is further configured to determine that an off-site page link and/or an in-site page link is a suspicious page link of the website to be detected if it is detected that the link includes the domain name keyword of the phishing target website, or that the link is a uniform resource locator in the form of an internet protocol address.
  • Further, in the device of the above embodiment, the fifth processing module includes:
  • a fifth processing unit, configured to assign a corresponding weight to the first feature value, the second feature value, the third feature value, and/or the fourth feature value, and to obtain the accumulated value of the products of those feature values and their corresponding weights; and
  • a sixth processing unit, configured to determine that the website to be detected is a phishing website if the accumulated value is greater than a preset threshold.
  • The phishing website detection device of the above embodiment combines several pieces of information about the website to be detected, such as its identity features, link features, and registration time, to assist in determining whether it is a phishing website, which achieves fast and reliable active phishing website detection.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

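The present invention provides a phishing website detection method and device. The method includes: obtaining a website to be detected; if it is detected, according to the domain name of the website to be detected, that a phishing target website of the website to be detected exists, obtaining the in-site page links of the website to be detected; and if it is detected that the in-site page links include a login box link, determining that the website to be detected is a phishing website. The method and device provided by the present invention enable active detection of phishing websites.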

Description

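Phishing website detection method and device
Technical field
The present invention relates to information processing technologies, and in particular to a phishing website detection method and device, and belongs to the field of network security technologies.
Background
With the spread of technology, network communication has become indispensable in every field, while network security problems have grown increasingly severe; phishing is especially prominent among them.
Phishing is a form of cybercrime in which spam emails or similar means lure the recipient to a phishing website carefully designed to closely resemble the website of a target organization, and the personal sensitive information that the recipient enters on that website is then harvested. With the spread and development of e-commerce and Internet applications, the losses caused by phishing have become increasingly serious.
Current phishing website detection methods mainly rely on blacklist filtering. Blacklist filtering depends on continuously updating a blacklist of all known phishing websites and/or user-reported websites. When a suspicious website is checked, information such as its domain name is looked up in the blacklist to determine whether the suspicious website is a phishing website.
Detection of this kind is passive: it usually takes effect only after users have already been harmed by the phishing website, so it lags behind the attack. How to effectively detect phishing websites that are not recorded in a blacklist, that is, how to achieve active detection of phishing websites and thereby avoid or reduce user losses, has therefore become the focus of phishing website detection.
Summary of the invention
In view of the defects in the prior art, the present invention provides a phishing website detection method and device for implementing active detection of phishing websites.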
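According to one aspect of the present invention, a phishing website detection method is provided, including:
obtaining a website to be detected;
if it is detected, according to the domain name of the website to be detected, that a phishing target website of the website to be detected exists, obtaining the in-site page links of the website to be detected; and
if it is detected that the in-site page links include a login box link, determining that the website to be detected is a phishing website.
Further, in the above method, determining that the website to be detected is a phishing website if the in-site page links include a login box link specifically includes:
if it is detected that the in-site page links include a login box link, obtaining a feature vector of the website to be detected; and
performing phishing website detection on the website to be detected according to the feature vector of the website to be detected.
Further, in the above method, obtaining the feature vector of the website to be detected specifically includes obtaining a first feature value, a second feature value, a third feature value, and/or a fourth feature value, where:
Obtaining the first feature value of the website to be detected specifically includes:
obtaining the identity information keywords of the phishing target website; detecting whether the title and/or copyright information of the website to be detected includes the identity information keywords; if so, the first feature value is 1; if not, the first feature value is 0.
Obtaining the second feature value of the website to be detected specifically includes:
obtaining the off-site page links of the website to be detected and the total number of links of the website to be detected, the total number of links being the sum of the number of off-site page links and the number of in-site page links; and
obtaining a first ratio of the number of off-site page links to the total number of links, and using the first ratio as the second feature value.
Obtaining the third feature value of the website to be detected specifically includes:
obtaining the suspicious page links of the website to be detected; and
obtaining a second ratio of the number of suspicious page links to the total number of links, and using the second ratio as the third feature value.
Obtaining the fourth feature value of the website to be detected specifically includes:
obtaining the registration duration of the website to be detected; if the registration duration is not greater than a preset duration, the fourth feature value is 1; if not, the fourth feature value is 0.
Further, in the above method, obtaining the suspicious page links of the website to be detected specifically includes: if it is detected that an off-site page link and/or an in-site page link includes the domain name keyword of the phishing target website, or that an off-site page link and/or an in-site page link is a uniform resource locator in the form of an internet protocol address, determining that the off-site page link and/or the in-site page link is a suspicious page link of the website to be detected.
Further, in the above method, performing phishing website detection on the website to be detected according to its feature vector specifically includes:
assigning a corresponding weight to the first feature value, the second feature value, the third feature value, and/or the fourth feature value, and obtaining the accumulated value of the products of those feature values and their corresponding weights; and
if the accumulated value is greater than a preset threshold, determining that the website to be detected is a phishing website.
According to another aspect of the present invention, a phishing website detection device is also provided, including:
a first processing module, configured to obtain a website to be detected;
a second processing module, configured to obtain the in-site page links of the website to be detected if it is detected, according to the domain name of the website to be detected, that a phishing target website of the website to be detected exists; and
a third processing module, configured to determine that the website to be detected is a phishing website if it is detected that the in-site page links include a login box link.
Further, the above device also includes:
a fourth processing module, configured to obtain a feature vector of the website to be detected if it is detected that the in-site page links include a login box link; and
a fifth processing module, configured to perform phishing website detection on the website to be detected according to the feature vector of the website to be detected.
Further, in the above device, the fourth processing module includes a first processing unit, a second processing unit, a third processing unit, and/or a fourth processing unit, where:
the first processing unit is configured to obtain the identity information keywords of the phishing target website, and to detect whether the title and/or copyright information of the website to be detected includes the identity information keywords; if so, the first feature value is 1; if not, the first feature value is 0;
the second processing unit is configured to obtain the off-site page links of the website to be detected and the total number of links of the website to be detected, the total number of links being the sum of the number of off-site page links and the number of in-site page links, and to obtain a first ratio of the number of off-site page links to the total number of links and use the first ratio as the second feature value;
the third processing unit is configured to obtain the suspicious page links of the website to be detected, and to obtain a second ratio of the number of suspicious page links to the total number of links and use the second ratio as the third feature value;
the fourth processing unit is configured to obtain the registration duration of the website to be detected; if the registration duration is not greater than a preset duration, the fourth feature value is 1; if not, the fourth feature value is 0.
Further, in the above device, the third processing unit is further configured to determine that an off-site page link and/or an in-site page link is a suspicious page link of the website to be detected if it is detected that the link includes the domain name keyword of the phishing target website, or that the link is a uniform resource locator in the form of an internet protocol address.
Further, in the above device, the fifth processing module includes:
a fifth processing unit, configured to assign a corresponding weight to the first feature value, the second feature value, the third feature value, and/or the fourth feature value, and to obtain the accumulated value of the products of those feature values and their corresponding weights; and
a sixth processing unit, configured to determine that the website to be detected is a phishing website if the accumulated value is greater than a preset threshold.
According to the phishing website detection method and device provided by the present invention, whether the website to be detected has a phishing tendency is first judged from its domain name, and whether it is a phishing website is then determined from whether its in-site page links include a login box. Phishing websites not recorded in a blacklist can therefore be detected from the website's own characteristics, which achieves active detection of phishing websites.
Brief description of the drawings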
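FIG. 1 is a schematic flowchart of a phishing website detection method according to an embodiment of the present invention.
FIG. 2 is a schematic structural diagram of a phishing website detection device according to an embodiment of the present invention.
Detailed description
The phishing website detection method according to an embodiment of the present invention is executed, for example, by a phishing website detection device deployed in a network.
FIG. 1 is a schematic flowchart of a phishing website detection method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
Step S101: obtain a website to be detected;
Step S102: if it is detected, according to the domain name of the website to be detected, that a phishing target website of the website to be detected exists, obtain the in-site page links of the website to be detected;
Step S103: if it is detected that the in-site page links include a login box link, determine that the website to be detected is a phishing website.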
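Specifically, after the domain name of the website to be detected is obtained, the method first checks whether the website to be detected has a phishing target website, in order to judge whether it may be mounting a phishing attack on some known legitimate website. The existence of a phishing target website can be detected in several ways. For example, the domain name of the website to be detected is compared for similarity with the domain names of well-known websites that are frequently targeted by phishing, and the similarity value is used to judge whether the domain name of the website to be detected is a counterfeit of a well-known website's domain name; if so, that well-known website is regarded as the phishing target website of the website to be detected, that is, the website to be detected may be phishing against that well-known website. Alternatively, it can be checked whether the domain name of the website to be detected includes the domain name keyword of a well-known website; if so, that well-known website is regarded as the phishing target website of the website to be detected. The well-known websites used in this detection are, for example, websites stored in a protected domain name feature library, which includes, for example, websites known to have suffered phishing attacks and websites with high click volumes.
If no phishing target website is found, the website to be detected shows no tendency toward phishing and is determined not to be a phishing website. If a phishing target website is found, the whole website to be detected is traversed, all of its in-site page links are extracted without repetition, and the subsequent detection is performed.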
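The whole-site traversal that extracts every in-site page link without repetition can be sketched as a breadth-first crawl. The description does not specify a traversal order or how pages are fetched, so both are assumptions here: `fetch` is a hypothetical caller-supplied function that returns the HTML for a URL, and a link counts as in-site when its host name equals that of the start URL.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)

def collect_in_site_links(start_url, fetch):
    """Breadth-first traversal returning each in-site page URL exactly once."""
    site = urlparse(start_url).hostname
    seen, queue = {start_url}, deque([start_url])
    while queue:
        url = queue.popleft()
        parser = LinkCollector()
        parser.feed(fetch(url))  # fetch is supplied by the caller
        for href in parser.hrefs:
            link = urljoin(url, href)  # resolve relative links
            if urlparse(link).hostname == site and link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(seen)
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP client; the `seen` set is what guarantees that each in-site page link is extracted only once.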
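Based on the extracted in-site page links, it is detected whether the website to be detected includes a login box that requires the user to enter private information. Specifically, all in-site page links of the website to be detected are traversed, and each is checked for a login box. One way to do so is to check whether the page corresponding to an in-site page link contains a <form>...</form> form element. If not, the in-site page link is determined not to include a login box. If so, the values in the form element are further checked for words such as "账号" (account), "密码" (password), and "登陆" (login); if any are present, the in-site page link is determined to include a login box.
Because a phishing website necessarily contains a login box that requires the user to enter private information, the outcome of this check over all in-site page links is interpreted as follows: if the in-site page links do not include a login box link, the website to be detected can be determined not to be a phishing website; if they do include a login box link, the website to be detected can be determined to be a phishing website, or a suspected phishing website that requires further detection.
According to the phishing website detection method of the above embodiment, whether the website to be detected has a phishing tendency is first judged from its domain name, and when it does, whether it is a phishing website is further determined from whether its in-site page links include a login box. Phishing websites not recorded in a blacklist can therefore be detected from the website's own characteristics, which achieves active detection of phishing websites. With this method, a phishing website can be detected actively before users are harmed by it, which effectively avoids or reduces user losses.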
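Further, in the method of the above embodiment, determining that the website to be detected is a phishing website if the in-site page links include a login box link specifically includes:
if it is detected that the in-site page links include a login box link, obtaining a feature vector of the website to be detected; and
performing phishing website detection on the website to be detected according to the feature vector of the website to be detected.
The feature vector of the website to be detected may include one or more feature values, each representing a different feature or piece of information about the website to be detected. Detection based on the feature vector therefore makes it possible, after a login box link has been found among the in-site page links, to draw on further features or information about the website to be detected, which improves the accuracy of phishing website detection.
Further, in the method of the above embodiment, the feature vector of the website to be detected includes the first feature value, the second feature value, the third feature value, and/or the fourth feature value. Accordingly, obtaining the feature vector of the website to be detected specifically includes obtaining the first feature value, the second feature value, the third feature value, and/or the fourth feature value. The feature vector is written, for example, as Vector{V1, V2, V3, V4}.
More specifically, obtaining the first feature value V1 of the website to be detected specifically includes: obtaining the identity information keywords of the phishing target website; detecting whether the title and/or copyright information of the website to be detected includes those keywords; if so, the first feature value V1 is 1; if not, the first feature value V1 is 0. The keywords that identify the phishing target website are obtained, for example, from the text content of parts such as its "title" or "copyright"; for example, the identity information keywords of the Tencent website include "腾讯", "Tencent", and "qq". After the identity information keywords of the phishing target website are obtained, the text content of the "title" and "copyright" of the website to be detected is traversed to detect whether it includes those keywords. If it does, V1 = 1, indicating that the identity of the website to be detected matches that of the phishing target website; if it does not, V1 = 0, indicating that the identities do not match.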
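Obtaining the second feature value V2 of the website to be detected specifically includes:
obtaining the off-site page links of the website to be detected, and obtaining the total number of links of the website to be detected, the total number of links being the sum of the number of off-site page links and the number of in-site page links;
obtaining a first ratio of the number of off-site page links to the total number of links, and using the first ratio as the second feature value V2;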
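The second feature value can be sketched as below, assuming all links are absolute URLs and that a link counts as off-site when its host differs from the host of the website to be detected. Returning 0.0 for a site with no links is an assumption made to avoid division by zero; the text does not address that case.

```python
from urllib.parse import urlparse


def second_feature(site_host: str, links) -> float:
    """V2: number of off-site links divided by the total number of links."""
    if not links:
        return 0.0  # assumed convention for a site with no links
    external = sum(
        1 for link in links if urlparse(link).netloc.lower() != site_host.lower()
    )
    return external / len(links)
```

With two in-site links and two off-site links, the function returns 0.5.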
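Obtaining the third feature value V3 of the website to be detected specifically includes:
obtaining the suspicious page links of the website to be detected;
obtaining a second ratio of the number of suspicious page links to the total number of links, and using the second ratio as the third feature value V3;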
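A link of the website to be detected is judged to be a suspicious page link in, for example, the following way: if an off-site page link and/or in-site page link is found to include a domain-name keyword of the phishing-tendency target website, or is found to be a uniform resource locator (URL) in Internet Protocol (IP) address form, that off-site page link and/or in-site page link is judged to be a suspicious page link of the website to be detected.
Specifically, all off-site and in-site page links of the website to be detected are examined to determine whether the URL of a link includes a domain-name keyword of the phishing-tendency target website (for example, the domain-name keyword of the Taobao website "www.taobao.com" is "taobao") and whether the URL of the link is in IP form, that is, whether it is written in a format such as "210.46.102.141". If the URL of the link includes a domain-name keyword of the phishing-tendency target website and/or is in IP form, the link is judged to be a suspicious page link of the website to be detected; conversely, if the URL neither includes such a keyword nor is in IP form, the link is judged to be a normal page link. This approach detects, on the one hand, links that point to the phishing-tendency target website and suspicious links that use its domain-name keyword; on the other hand, because reputable websites do not usually use IP addresses as URLs, it also detects low-reputation suspicious links whose URL is an IP address.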
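The suspicious-link rule and the resulting third feature value can be sketched as follows. The helper names are hypothetical, links are assumed to be absolute URLs, and an empty link list is mapped to 0.0 by assumption.

```python
import ipaddress
from urllib.parse import urlparse


def is_ip_host(url: str) -> bool:
    """True if the host part of the URL is a literal IP address."""
    host = urlparse(url).hostname or ""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def is_suspicious(url: str, target_keyword: str) -> bool:
    """A link is suspicious if it contains the target site's domain-name
    keyword or if its URL is in IP-address form."""
    return target_keyword.lower() in url.lower() or is_ip_host(url)


def third_feature(links, target_keyword: str) -> float:
    """V3: number of suspicious links divided by the total number of links."""
    if not links:
        return 0.0  # assumed convention for a site with no links
    return sum(is_suspicious(link, target_keyword) for link in links) / len(links)
```

Using the examples from the text, `http://210.46.102.141/login` and `http://www.taobao.com/` are both flagged when the keyword is "taobao".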
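Obtaining the fourth feature value V4 of the website to be detected specifically includes:
obtaining the registration age of the website to be detected; if the registration age is not greater than a preset duration, the fourth feature value V4 is 1; otherwise, the fourth feature value V4 is 0.
For example, the "WHOIS" database is queried to check whether the domain name of the website to be detected has been registered for more than one year. According to statistics, more than 95% of phishing-website domain names have been registered for less than one year, so checking the registration time can reduce false positives. If the registration time is less than or equal to one year, the fourth feature value is set to V4 = 1, indicating that the domain name was registered only recently, as is typical of phishing websites; if it is greater than one year, the fourth feature value is set to V4 = 0.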
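A sketch of the fourth feature value is shown below. It assumes the registration date has already been retrieved (for example from a WHOIS lookup, which is not performed here) and approximates the one-year preset duration as 365 days; both are simplifications made for illustration.

```python
from datetime import date, timedelta


def fourth_feature(registered_on: date, today: date,
                   max_age: timedelta = timedelta(days=365)) -> int:
    """V4: 1 if the domain's registration age does not exceed the preset
    duration (a recently registered domain), otherwise 0."""
    return 1 if today - registered_on <= max_age else 0
```

A domain registered on 2012-01-01 and checked on 2012-04-10 gives V4 = 1, whereas one registered in 2005 gives V4 = 0.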
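In addition, a decision model is, for example, generated in advance from samples of normal websites and phishing websites. The feature vector obtained in the above process is used as the input to the decision model, which produces, from the feature values in the vector, a decision on whether the website to be detected is a phishing website.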
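Further, in the phishing website detection method of the above embodiment, performing phishing website detection on the website to be detected according to its feature vector specifically includes:
assigning corresponding weights to the first feature value V1, the second feature value V2, the third feature value V3 and/or the fourth feature value V4, and obtaining the accumulated sum of the products of the first feature value V1, the second feature value V2, the third feature value V3 and/or the fourth feature value V4 with their corresponding weights;
if the accumulated sum is greater than a preset threshold, judging the website to be detected to be a phishing website.
Specifically, for example, a first weight a1 is assigned to the first feature value V1, a second weight a2 to the second feature value V2, a third weight a3 to the third feature value V3, and a fourth weight a4 to the fourth feature value V4, so that the accumulated sum for the feature vector is a1×V1 + a2×V2 + a3×V3 + a4×V4. This accumulated sum is compared with a preset threshold: if it is greater than the preset threshold, the website to be detected is judged to be a phishing website; if it is less than or equal to the preset threshold, the website to be detected is judged not to be a phishing website.
The first weight a1, the second weight a2, the third weight a3 and the fourth weight a4 are, for example, all greater than 0 and less than or equal to 1, and the weights a1, a2, a3 and a4 as well as the preset threshold are, for example, all provided by the decision model; their specific values can be obtained from statistics over samples of normal websites and phishing websites.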
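The weighted decision can be sketched as follows. The weights and threshold used in the usage example are purely illustrative placeholders; according to the text, real values would come from the decision model and sample statistics.

```python
def weighted_score(features, weights) -> float:
    """Accumulated sum a1*V1 + a2*V2 + a3*V3 + a4*V4."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(a * v for a, v in zip(weights, features))


def is_phishing(features, weights, threshold: float) -> bool:
    """Judge the site to be phishing only if the score exceeds the threshold."""
    return weighted_score(features, weights) > threshold


# Illustrative (hypothetical) weights and threshold, not values from the source.
EXAMPLE_WEIGHTS = [0.5, 0.25, 0.5, 0.5]
EXAMPLE_THRESHOLD = 1.0
```

With the feature vector `[1, 0.5, 0.25, 1]` these placeholder values give a score of 1.25, which exceeds the threshold, so the site would be judged a phishing website; a score equal to the threshold would not.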
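The phishing website detection method of the above embodiment combines several pieces of information about the website to be detected, including identity features, link features and registration time, to assist in deciding whether the website is a phishing website, achieving fast and reliable proactive phishing website detection.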
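Fig. 2 is a schematic structural diagram of a phishing website detection device according to an embodiment of the present invention. As shown in Fig. 2, the phishing website detection device includes:
a first processing module 21, configured to obtain a website to be detected;
a second processing module 22, configured to obtain the in-site page links of the website to be detected if, according to the domain name of the website to be detected, a phishing-tendency target website of the website to be detected is found to exist;
a third processing module 23, configured to judge the website to be detected to be a phishing website if the in-site page links are found to include a login-box link.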
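The specific procedure by which the phishing website detection device of the above embodiment performs phishing website detection is the same as the phishing website detection method of the above embodiment, and is therefore not repeated here.
According to the phishing website detection device of the above embodiment, the domain name of the website to be detected is first used to judge whether the website shows a tendency toward phishing attacks, and, if it does, whether its in-site page links include a login box is then used to decide whether it is a phishing website. Phishing websites that are not recorded in any blacklist can thus be detected from the website's own characteristics, which achieves proactive detection of phishing websites. With this device, a phishing website can be detected before users are harmed by it, effectively avoiding or reducing user losses.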
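Further, the phishing website detection device of the above embodiment also includes:
a fourth processing module, configured to obtain a feature vector of the website to be detected if the in-site page links are found to include a login-box link;
a fifth processing module, configured to perform phishing website detection on the website to be detected according to the feature vector of the website to be detected.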
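Further, in the phishing website detection device of the above embodiment, the fourth processing module includes a first processing unit, a second processing unit, a third processing unit and/or a fourth processing unit, wherein:
the first processing unit is configured to obtain identity-information keywords of the phishing-tendency target website, and to check whether the title and/or copyright information of the website to be detected includes the identity-information keywords; if so, the first feature value is 1; if not, the first feature value is 0;
the second processing unit is configured to obtain the off-site page links of the website to be detected and the total number of links of the website to be detected, the total number of links being the sum of the number of off-site page links and the number of in-site page links, and to obtain a first ratio of the number of off-site page links to the total number of links and use the first ratio as the second feature value;
the third processing unit is configured to obtain the suspicious page links of the website to be detected, and to obtain a second ratio of the number of suspicious page links to the total number of links and use the second ratio as the third feature value;
the fourth processing unit is configured to obtain the registration age of the website to be detected; if the registration age is not greater than a preset duration, the fourth feature value is 1; otherwise, the fourth feature value is 0.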
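Further, in the phishing website detection device of the above embodiment, the third processing unit is also configured to judge an off-site page link and/or in-site page link to be a suspicious page link of the website to be detected if that link is found to include a domain-name keyword of the phishing-tendency target website, or is found to be a uniform resource locator in Internet Protocol address form.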
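Further, in the phishing website detection device of the above embodiment, the fifth processing module includes:
a fifth processing unit, configured to assign corresponding weights to the first feature value, the second feature value, the third feature value and/or the fourth feature value, and to obtain the accumulated sum of the products of the first feature value, the second feature value, the third feature value and/or the fourth feature value with their corresponding weights;
a sixth processing unit, configured to judge the website to be detected to be a phishing website if the accumulated sum is greater than a preset threshold.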
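The phishing website detection device of the above embodiment combines several pieces of information about the website to be detected, including identity features, link features and registration time, to assist in deciding whether the website is a phishing website, achieving fast and reliable proactive phishing website detection.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.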

Claims
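1. A phishing website detection method, characterized by comprising:
obtaining a website to be detected;
if, according to the domain name of the website to be detected, a phishing-tendency target website of the website to be detected is found to exist, obtaining the in-site page links of the website to be detected;
if the in-site page links are found to include a login-box link, judging the website to be detected to be a phishing website.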
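2. The phishing website detection method according to claim 1, characterized in that judging the website to be detected to be a phishing website if the in-site page links are found to include a login-box link specifically comprises:
if the in-site page links are found to include a login-box link, obtaining a feature vector of the website to be detected;
performing phishing website detection on the website to be detected according to the feature vector of the website to be detected.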
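3. The phishing website detection method according to claim 2, characterized in that obtaining the feature vector of the website to be detected specifically comprises obtaining a first feature value, a second feature value, a third feature value and/or a fourth feature value, wherein:
obtaining the first feature value of the website to be detected specifically comprises:
obtaining identity-information keywords of the phishing-tendency target website; checking whether the title and/or copyright information of the website to be detected includes the identity-information keywords; if so, the first feature value is 1; if not, the first feature value is 0;
obtaining the second feature value of the website to be detected specifically comprises:
obtaining the off-site page links of the website to be detected, and obtaining the total number of links of the website to be detected, the total number of links being the sum of the number of off-site page links and the number of in-site page links;
obtaining a first ratio of the number of off-site page links to the total number of links, and using the first ratio as the second feature value;
obtaining the third feature value of the website to be detected specifically comprises:
obtaining the suspicious page links of the website to be detected;
obtaining a second ratio of the number of suspicious page links to the total number of links, and using the second ratio as the third feature value;
obtaining the fourth feature value of the website to be detected specifically comprises:
obtaining the registration age of the website to be detected; if the registration age is not greater than a preset duration, the fourth feature value is 1; otherwise, the fourth feature value is 0.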
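4. The phishing website detection method according to claim 3, characterized in that obtaining the suspicious page links of the website to be detected specifically comprises:
if an off-site page link and/or in-site page link is found to include a domain-name keyword of the phishing-tendency target website, or is found to be a uniform resource locator in Internet Protocol address form, judging that off-site page link and/or in-site page link to be a suspicious page link of the website to be detected.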
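5. The phishing website detection method according to claim 3 or 4, characterized in that performing phishing website detection on the website to be detected according to the feature vector of the website to be detected specifically comprises:
assigning corresponding weights to the first feature value, the second feature value, the third feature value and/or the fourth feature value, and obtaining the accumulated sum of the products of the first feature value, the second feature value, the third feature value and/or the fourth feature value with their corresponding weights;
if the accumulated sum is greater than a preset threshold, judging the website to be detected to be a phishing website.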
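6. A phishing website detection device, characterized by comprising:
a first processing module, configured to obtain a website to be detected;
a second processing module, configured to obtain the in-site page links of the website to be detected if, according to the domain name of the website to be detected, a phishing-tendency target website of the website to be detected is found to exist;
a third processing module, configured to judge the website to be detected to be a phishing website if the in-site page links are found to include a login-box link.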
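7. The phishing website detection device according to claim 6, characterized by further comprising:
a fourth processing module, configured to obtain a feature vector of the website to be detected if the in-site page links are found to include a login-box link;
a fifth processing module, configured to perform phishing website detection on the website to be detected according to the feature vector of the website to be detected.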
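8. The phishing website detection device according to claim 7, characterized in that the fourth processing module comprises a first processing unit, a second processing unit, a third processing unit and/or a fourth processing unit, wherein:
the first processing unit is configured to obtain identity-information keywords of the phishing-tendency target website, and to check whether the title and/or copyright information of the website to be detected includes the identity-information keywords; if so, the first feature value is 1; if not, the first feature value is 0;
the second processing unit is configured to obtain the off-site page links of the website to be detected and the total number of links of the website to be detected, the total number of links being the sum of the number of off-site page links and the number of in-site page links, and to obtain a first ratio of the number of off-site page links to the total number of links and use the first ratio as the second feature value;
the third processing unit is configured to obtain the suspicious page links of the website to be detected, and to obtain a second ratio of the number of suspicious page links to the total number of links and use the second ratio as the third feature value;
the fourth processing unit is configured to obtain the registration age of the website to be detected; if the registration age is not greater than a preset duration, the fourth feature value is 1; otherwise, the fourth feature value is 0.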
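9. The phishing website detection device according to claim 8, characterized in that the third processing unit is further configured to judge an off-site page link and/or in-site page link to be a suspicious page link of the website to be detected if that link is found to include a domain-name keyword of the phishing-tendency target website, or is found to be a uniform resource locator in Internet Protocol address form.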
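10. The phishing website detection device according to claim 8 or 9, characterized in that the fifth processing module comprises:
a fifth processing unit, configured to assign corresponding weights to the first feature value, the second feature value, the third feature value and/or the fourth feature value, and to obtain the accumulated sum of the products of the first feature value, the second feature value, the third feature value and/or the fourth feature value with their corresponding weights;
a sixth processing unit, configured to judge the website to be detected to be a phishing website if the accumulated sum is greater than a preset threshold.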
PCT/CN2012/087762 2012-04-10 2012-12-28 Phishing website detection method and device WO2013152610A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210104080.6A CN102647422B (zh) 2012-04-10 2012-04-10 Phishing website detection method and device
CN201210104080.6 2012-04-10

Publications (1)

Publication Number Publication Date
WO2013152610A1 true WO2013152610A1 (zh) 2013-10-17

Family

ID=46659997

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/087762 WO2013152610A1 (zh) 2012-04-10 2012-12-28 Phishing website detection method and device

Country Status (2)

Country Link
CN (1) CN102647422B (zh)
WO (1) WO2013152610A1 (zh)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
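CN101183415A (zh) * 2007-12-19 2008-05-21 Tencent Technology (Shenzhen) Co., Ltd. Method and apparatus for preventing leakage of sensitive information, and computer terminal
CN101504673A (zh) * 2009-03-24 2009-08-12 Alibaba Group Holding Ltd. Method and system for identifying suspected counterfeit websites
CN102647422A (zh) * 2012-04-10 2012-08-22 Computer Network Information Center, Chinese Academy of Sciences Phishing website detection method and device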

Also Published As

Publication number Publication date
CN102647422A (zh) 2012-08-22
CN102647422B (zh) 2014-09-17

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12873966

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12873966

Country of ref document: EP

Kind code of ref document: A1