CN106453351A - Financial fishing webpage detection method based on Web page characteristics - Google Patents

Financial fishing webpage detection method based on Web page characteristics

Info

Publication number
CN106453351A
Authority
CN
China
Prior art keywords
webpage
title
measured
logo
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610933083.9A
Other languages
Chinese (zh)
Inventor
胡向东
林家富
刘可
张峰
魏琴芳
李林乐
杨子明
陈国军
白银
刘玥
付俊
郭智慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN201610933083.9A
Publication of CN106453351A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1483 Countermeasures against malicious traffic service impersonation, e.g. phishing, pharming or web spoofing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2463/00 Additional details relating to network architectures or network communication protocols for network security covered by H04L63/00
    • H04L2463/102 Additional details relating to network architectures or network communication protocols for network security covered by H04L63/00 applying security measure for e-commerce

Abstract

The invention relates to a financial phishing webpage detection method based on Web page features. The method relies on a pre-established first Title keyword library, second Title keyword library, sensitive keyword library, and webpage Logo picture feature point rule library for the financial sector, and comprises the following steps: obtaining the HTML of the webpage under test with a crawler, extracting the text of the Title tag, and calculating its matching degree against the first and second Title keyword libraries; if the matching degree is greater than a threshold, the page is judged to be a phishing webpage, otherwise detection proceeds to the next step; extracting the text of specific tags of the webpage under test, counting the matches against the sensitive keyword library, and calculating a sensitive feature value; if the feature value is greater than a threshold, the page is judged to be a phishing webpage, otherwise detection proceeds to the next step; and taking a fixed-position screenshot of the webpage under test to obtain its Logo icon, extracting the feature points of the Logo icon, comparing them with the Logo feature point rule library, and calculating a similarity from the number of matched feature points; if the similarity is greater than a threshold, the page is judged to be a phishing webpage, otherwise a normal webpage. The method can judge accurately and quickly whether a Web page under test is a financial phishing webpage.

Description

Financial phishing webpage detection method based on Web page features
Technical field
The invention belongs to the field of information security technology, in particular to website security detection, and relates to a financial phishing webpage detection method based on Web page features.
Background
With the rapid development of the Internet, and of the mobile Internet in particular, Web-based applications have reached every industry and brought great convenience to people's work and daily life. At the same time, personal information is collected on a large scale and faces serious security problems; a typical example is phishing, in which personal sensitive information is obtained by deception and then used to extract money.
Phishing mainly imitates the e-mails and webpages of legitimate organizations in order to lure victims into providing personal sensitive information, such as bank account numbers, ID card numbers, and bank card passwords, which is then used to defraud the victims of their money. According to statistics for the first half of 2016 from the Anti-Phishing Alliance of China (APAC), the number of phishing webpages impersonating the Industrial and Commercial Bank of China and Taobao consistently ranked near the top, and among the industries targeted by phishing websites the financial sector was consistently in the top three, so the security situation is severe. Financial websites have a very large number of users, are directly tied to users' financial assets, and therefore carry considerable influence. Phishing webpages usually imitate the official pages of financial websites as closely as possible in both text and pictures. A method for detecting financial phishing webpages can therefore be used to protect users' property.
Existing phishing webpage detection methods mainly include blacklist filtering, page-based heuristic detection, and detection based on visual similarity. Blacklist filtering depends on timely updates of the domain name blacklist, so it lags behind when detecting newly appearing phishing webpages. Page-based heuristic detection uses URL features and page content features; it has a high detection rate for phishing webpages with similar page content, but some phishing webpages evade content detection by embedding pictures or inserting meaningless text. Detection based on visual similarity uses the similarity of webpage pictures or of the webpage DOM tree structure; it has a high detection rate for visually similar phishing webpages, but the algorithms are complex and detection efficiency is low.
Summary of the invention
In view of this, the object of the invention is to provide a financial phishing webpage detection method based on Web page features. Aimed at the telecommunication fraud that is currently rampant, the method detects phishing webpages and can raise both the detection rate and the detection efficiency while lowering the misjudgment rate.
To achieve the above object, the invention provides the following technical solution:
A financial phishing webpage detection method based on Web page features, the execution of which relies on a pre-established first Title keyword library, second Title keyword library, sensitive keyword library, and webpage Logo picture feature point rule library for the financial sector. The method specifically comprises the following steps:
S1: Obtain the HTML of the webpage under test with a crawler, extract the text in the Title tag, and calculate the matching degree between that text and the first and second Title keyword libraries; if the matching degree is greater than a threshold, judge the webpage under test to be a phishing webpage; otherwise, proceed to step S2 for further detection;
S2: Extract the text in specific tags of the webpage under test, count the matches between that text and the sensitive keyword library, and calculate the Web sensitive feature value; if the feature value is greater than a threshold, judge the webpage under test to be a phishing webpage; otherwise, proceed to step S3 for further detection;
S3: Take a fixed-position screenshot of the webpage under test, such that the screenshot contains the Logo picture of the webpage within as small an area as possible;
S4: Extract the feature points of the Logo screenshot, compare them with the webpage Logo picture feature point rule library, and calculate the similarity of the two Logo pictures from the number of matched feature points; if the similarity is greater than a threshold, judge the webpage under test to be a phishing webpage; otherwise, judge it to be a normal webpage.
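Steps S1 to S4 form a cascade: each stage produces a score, and the first stage whose score exceeds its threshold ends detection with a phishing verdict. The control flow can be sketched in Python as below. The `page` dictionary, the stage scorers, and the threshold values are hypothetical placeholders chosen for illustration; they are not taken from the patent.

```python
from typing import Callable, Optional, Sequence, Tuple

# (stage name, scoring function, threshold); all three are illustrative.
Stage = Tuple[str, Callable[[dict], float], float]


def cascade_detect(page: dict, stages: Sequence[Stage]) -> Tuple[str, Optional[str]]:
    """Run the stages in order and stop at the first score above its threshold."""
    for name, score, threshold in stages:
        if score(page) > threshold:
            return "phishing", name
    return "normal", None


# Hypothetical scorers that simply read precomputed values from the page record.
stages = [
    ("title", lambda p: p["alpha"], 0.5),
    ("text", lambda p: p["beta"], 0.3),
    ("logo", lambda p: p["gamma"], 0.6),
]

verdict = cascade_detect({"alpha": 0.2, "beta": 0.4, "gamma": 0.9}, stages)
```

Because later scorers are only called when earlier ones stay at or below their thresholds, the comparatively expensive Logo stage runs only for pages that the two text stages could not decide.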
Preferably, in step S1, obtaining the HTML of the webpage under test with a crawler, extracting the text in the Title tag, and calculating the matching degree between that text and the first and second Title keyword libraries specifically comprises:
S11: Use a web crawler tool to obtain the Title text of the Web page at the URL under test; preprocess the Title text by removing interfering content such as spaces and underscores; segment the preprocessed Title text into words with a word segmentation technique, obtaining the number of segmented words N0;
S12: Match the keywords obtained from segmentation against the first Title keyword library; if the number of matches N1 is not less than 1, match them against the second Title keyword library as well, obtaining the number of matches N2;
S13: The two rounds of matching give the total number of keyword matches, from which the Title keyword matching degree α is defined [formula not reproduced in the source text]. The magnitude of α represents the degree of similarity between the Title of the webpage under test and the Titles of financial webpages. The specific target impersonated by the financial phishing webpage can be determined from the matches against the first Title keyword library.
Preferably, in step S2, extracting the text in specific tags of the webpage under test, counting the matches between that text and the sensitive keyword library, and calculating the Web sensitive feature value specifically comprises:
S21: Obtain the text of specific HTML tags of the webpage under test, namely the text in a tags, h tags, and span tags; preprocess the obtained tag text, and extract the effective text entries and their count i;
S22: First match each complete text entry against the sensitive keyword library; if it matches, move on to the next entry; if it does not match, segment the entry into words and match the resulting keywords against the sensitive keyword library a second time; as soon as one keyword matches, move on to the next entry;
S23: The one or two rounds of matching give the number j (j ≤ i) of matched text entries, from which the Web sensitive feature value β is defined [formula not reproduced in the source text]. The magnitude of β reflects the degree of similarity between the text features of the webpage under test and those of financial webpages.
Preferably, in step S3, taking a fixed-position screenshot of the webpage under test, such that the screenshot contains the Logo icon of the webpage within as small an area as possible, specifically comprises:
S31: Invoke an automated testing tool to open a browser automatically and load the webpage under test, and resize the browser window to 800*600, which is large enough to contain the Logo of the webpage;
S32: Invoke a screenshot tool to capture automatically a 600*250 region at the top of the browser page; this region contains the Logo icons of all webpages under test within a minimal area, and a minimal Logo screenshot reduces the feature point computation and lowers the misjudgment rate of financial phishing webpage detection.
Preferably, in step S4, extracting the feature points of the Logo screenshot, comparing them with the webpage Logo icon feature point rule library, and calculating the similarity of the two Logo pictures from the number of matched feature points specifically comprises:
S41: Extract the feature point data D0 of the Logo screenshot with an image feature point extraction algorithm, and obtain the number of feature points k;
S42: Take the feature point data D of one Logo picture from the financial Logo picture feature point rule library; D has m feature points;
S43: Use an image feature point matching algorithm to calculate the number n of feature points matched between D0 and D (n ≤ k and n ≤ m);
S44: Define the similarity γ of the two Logo pictures [formula not reproduced in the source text]. The specific target impersonated by the financial phishing webpage is determined from the information in the financial Logo picture feature point rule library.
The beneficial effects of the invention are as follows: the method overcomes the problems of comparable methods in the prior art, can raise the detection rate and detection efficiency for phishing webpages while lowering the misjudgment rate, and is well suited to detecting phishing webpages in the context of the telecommunication fraud that is currently rampant.
Description of the drawings
To make the object, technical solution, and beneficial effects of the invention clearer, the following drawings are provided for illustration:
Fig. 1 is a flowchart of the financial phishing webpage detection method based on Web page features according to the invention;
Fig. 2 is a detailed flowchart of the financial phishing webpage detection method based on Web page features according to the invention;
Fig. 3 is a schematic diagram of the size and position of the Web page Logo screenshot according to the invention.
Detailed description of the embodiments
The preferred embodiments of the invention are described in detail below with reference to the drawings.
Fig. 1 is the flowchart of the financial phishing webpage detection method based on Web page features provided by the invention. The execution of the detection method relies on a pre-established first Title keyword library, second Title keyword library, sensitive keyword library, and webpage Logo picture feature point rule library for the financial sector.
Specifically, the method for building the first and second Title keyword libraries is described in detail:
The Title tag text in the HTML of a number of financial webpages and phishing webpages is analysed. The full names and common short forms of the financial institutions that appear in the Title text are extracted to form the first Title keyword library; for example, the Industrial and Commercial Bank of China is commonly shortened in Chinese to 工商银行, 工行, or 工银. The common words that appear together with first-library keywords in Title text are extracted to form the second Title keyword library, such as login, home page, main page, and business. Keywords from the first and second Title keyword libraries combine into typical webpage Titles, such as "Industrial and Commercial Bank of China home page" or "ICBC main page". Whenever a keyword is added to either library, deduplication is required, and the first and second keyword libraries must not share any keyword.
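The deduplication and no-overlap requirements for the two Title keyword libraries can be sketched with two Python sets, as below. The class name, the lowercasing step, and the sample words are assumptions made for illustration; the patent does not prescribe a storage structure.

```python
from typing import Set


class TitleKeywordLibraries:
    """Two disjoint keyword sets: institution names and collocated words."""

    def __init__(self) -> None:
        self.first: Set[str] = set()   # institution full names and short forms
        self.second: Set[str] = set()  # words such as "login" or "home page"

    def add_first(self, word: str) -> bool:
        return self._add(word, self.first, self.second)

    def add_second(self, word: str) -> bool:
        return self._add(word, self.second, self.first)

    @staticmethod
    def _add(word: str, target: Set[str], other: Set[str]) -> bool:
        # Normalising case and whitespace is an added assumption.
        word = word.strip().lower()
        # Reject empty strings, duplicates, and words already in the other library.
        if not word or word in target or word in other:
            return False
        target.add(word)
        return True


libs = TitleKeywordLibraries()
libs.add_first("Example Bank")   # hypothetical institution name
libs.add_second("login")
```

Returning a boolean from each add makes it easy to log rejected candidates while the libraries are being compiled from crawled Titles.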
The method for building the financial sensitive keyword library is described in detail:
The keywords in the a tags, h tags, and span tags in the HTML of a number of financial webpages and phishing webpages are analysed, and keywords that are highly sensitive or that characterize financial institutions are selected, such as transfer and remittance, bank card number, ID card number, wealth management, and e-banking.
The method for building the financial webpage Logo picture feature point rule library is described in detail:
A screenshot tool is used to capture the Logos of a number of financial webpages and phishing webpages; the size of each screenshot depends on the size of the Logo. An image feature extraction algorithm is then applied to the Logo screenshots to extract feature points, which form the webpage Logo picture feature point rule library.
Based on the first Title keyword library, second Title keyword library, sensitive keyword library, and webpage Logo picture feature point rule library built above, and with reference to Fig. 2, which is the flowchart of the financial phishing webpage detection method based on Web page features provided by this embodiment, the method specifically comprises:
Step 201: Extract the domain name from the URL under test with a regular expression; to improve detection efficiency, extract at most the first three levels of the domain name, and record the result as the domain name under test;
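A minimal Python sketch of the domain extraction in step 201 is given below. It assumes that "the first three levels" means the top-level, second-level, and third-level labels, that is, the last three dot-separated labels of the host name; the regular expression and the function name are illustrative and not taken from the patent.

```python
import re

# Optional scheme, optional user-info, then the host up to a port, path, query or fragment.
_HOST_RE = re.compile(r"^(?:[a-zA-Z][a-zA-Z0-9+.-]*://)?(?:[^@/?#]*@)?([^:/?#]+)")


def domain_under_test(url: str, max_levels: int = 3) -> str:
    """Return at most the last `max_levels` labels of the host in `url`."""
    match = _HOST_RE.match(url.strip())
    if not match:
        return ""
    labels = match.group(1).lower().rstrip(".").split(".")
    # Assumption: levels are counted from the top-level domain leftwards.
    # IP-address hosts are not treated specially in this sketch.
    return ".".join(labels[-max_levels:])


print(domain_under_test("http://login.secure.example.com/path?x=1"))
```

Truncating to three labels keeps the blacklist and whitelist lookups that follow cheap, at the cost of treating deeper subdomains of one host as the same domain.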
Step 202: Match the domain name under test against the domain name blacklist database. The blacklist consists of phishing webpage domain names confirmed to date by bodies such as the Anti-Phishing Alliance of China. If the blacklist database contains the domain name under test, judge it to be a domain name used by a phishing webpage; otherwise, treat it as a suspicious domain name that requires the next detection step;
Step 203: Match the domain name under test against the domain name whitelist database. The whitelist consists of the domain names of commonly used webpages and of financial webpages. If the whitelist database contains the domain name under test, judge it to be a normal domain name; otherwise, treat it as a suspicious domain name that requires the next detection step;
Step 204: Because most current phishing webpages are short-lived, most commonly used webpages can be filtered out according to the activity of their domain names, which decides whether the domain name under test needs further detection. A web crawler fetches the Alexa traffic rank value of the domain name under test. If the returned value is empty, the domain name is treated as suspicious and requires the next detection step. If it is not empty, the average daily traffic rank is calculated; when the rank is below a certain threshold (meaning that the site ranks high in traffic), the domain name under test is judged to be normal; otherwise, it is treated as suspicious and requires the next detection step;
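The three domain-level filters in steps 202 to 204 can be sketched as a single Python function. The traffic rank lookup itself is left outside the function: `daily_ranks` stands for whatever rank values the crawler returned (an empty sequence when nothing came back), and `rank_threshold` is a hypothetical tuning parameter.

```python
from statistics import mean
from typing import Sequence, Set


def prefilter(domain: str,
              blacklist: Set[str],
              whitelist: Set[str],
              daily_ranks: Sequence[int],
              rank_threshold: int) -> str:
    """Return 'phishing', 'normal', or 'suspicious' for a domain under test."""
    if domain in blacklist:          # step 202
        return "phishing"
    if domain in whitelist:          # step 203
        return "normal"
    # Step 204: a smaller rank number means a more active site.
    if daily_ranks and mean(daily_ranks) < rank_threshold:
        return "normal"
    return "suspicious"              # continue with page-level detection


result = prefilter("secure.example.com", {"bad.example"}, {"bank.example"}, [], 100000)
```

Only a "suspicious" result leads to the page being fetched and analysed, so most traffic is decided by set lookups alone.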
Step 205: Obtain the Web page at the URL under test with a crawler and extract the text in the Title tag of the page. Preprocess the Title text, for example by deleting spaces and underscores, and segment the preprocessed Title text into words with a word segmentation technique, obtaining the number of segmented words N0. Match the Title keywords obtained from segmentation against the keywords in the first Title keyword library, obtaining the number of matches N1. If N1 is 0, the webpage under test is treated as suspicious and requires the next detection step. If N1 is greater than or equal to 1, match the Title keywords against the keywords in the second Title keyword library as well, obtaining the number of matches N2. The two rounds of matching give the total number of keyword matches, from which the Title keyword matching degree α is calculated [formula not reproduced in the source text].
The magnitude of α represents the degree of similarity between the Title of the webpage under test and the Titles of financial webpages. If N0 = 0 or N1 = 0, α is defined as 0. If the matching degree is greater than a certain threshold α*, the page is judged to be a financial phishing webpage; otherwise, it is treated as suspicious and requires the next detection step. Normal financial webpages have already been filtered out by the domain name whitelist or the domain name activity check. α* needs to be trained on financial phishing webpages, normal financial webpages, and normal non-financial webpages so that the detection rate and misjudgment rate reach their optimum. In addition, the specific target impersonated by the financial phishing webpage can be determined from the matches against the first Title keyword library.
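A rough Python sketch of the Title matching in step 205 follows. Two things in it are assumptions rather than part of the source: the formula for α is not reproduced in this text, so the sketch assumes α = (N1 + N2) / N0, which is consistent with α being defined as 0 when N0 = 0 or N1 = 0; and a simple forward-maximum-matching segmenter over the two keyword libraries stands in for a real word segmentation tool. The library contents are hypothetical.

```python
import re
from typing import List, Set


def segment(text: str, vocab: Set[str]) -> List[str]:
    """Forward maximum matching: longest vocabulary word first, else one character."""
    max_len = max((len(w) for w in vocab), default=1)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def title_matching_degree(title: str, first_lib: Set[str], second_lib: Set[str]) -> float:
    # Preprocessing: drop whitespace and underscores; lowercasing is an added assumption.
    cleaned = re.sub(r"[\s_]+", "", title).lower()
    tokens = segment(cleaned, first_lib | second_lib)
    n0 = len(tokens)
    n1 = sum(1 for t in tokens if t in first_lib)
    if n0 == 0 or n1 == 0:
        return 0.0  # alpha is defined as 0 in these cases
    n2 = sum(1 for t in tokens if t in second_lib)
    return (n1 + n2) / n0  # assumed form of alpha


first = {"examplebank"}   # hypothetical institution keyword
second = {"login"}        # hypothetical collocated keyword
alpha = title_matching_degree("Example_Bank login", first, second)
```

Unmatched characters become single-character tokens in this segmenter, which inflates N0 and lowers α for Titles that contain a lot of unrelated text; a production segmenter would behave differently, so any threshold α* would need to be trained against the segmenter actually used.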
Step 206: Obtain the text of specific HTML tags of the webpage under test, namely the text in a tags, h tags, and span tags; preprocess the obtained tag text, and extract the effective text entries and their count i. First match each complete text entry against the sensitive keyword library; if it matches, move on to the next entry; if it does not match, segment the entry into words and match the resulting keywords against the sensitive keyword library a second time; as soon as one keyword matches, move on to the next entry. The one or two rounds of matching give the number j (j ≤ i) of matched text entries, from which the Web sensitive feature value β is calculated [formula not reproduced in the source text].
The magnitude of β reflects the degree of similarity between the text features of the webpage under test and those of financial webpages. If i = 0, β is defined as 0. If the feature value is greater than a certain threshold β*, the page is judged to be a financial phishing webpage; otherwise, it is treated as suspicious and requires the next detection step. Normal financial webpages have already been filtered out by the domain name whitelist or the domain name activity check. β* needs to be trained on financial phishing webpages, normal financial webpages, and normal non-financial webpages so that the detection rate and misjudgment rate reach their optimum.
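Step 206 can be sketched with Python's standard `html.parser` module as below. The formula for β is not reproduced in this text; the sketch assumes β = j / i, which fits the condition j ≤ i and the convention β = 0 when i = 0. Reading "h tags" as `h1` to `h6`, the whitespace tokenizer passed in as `segment`, and the sample keywords are likewise assumptions.

```python
from html.parser import HTMLParser
from typing import Callable, List, Set

TARGET_TAGS = {"a", "span", "h1", "h2", "h3", "h4", "h5", "h6"}


class TagTextExtractor(HTMLParser):
    """Collect the non-empty text of a, h1-h6 and span elements."""

    def __init__(self) -> None:
        super().__init__()
        self._depth = 0
        self._buf: List[str] = []
        self.texts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in TARGET_TAGS:
            if self._depth == 0:
                self._buf = []
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in TARGET_TAGS and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                text = "".join(self._buf).strip()
                if text:  # keep only effective (non-empty) entries
                    self.texts.append(text)

    def handle_data(self, data):
        if self._depth > 0:
            self._buf.append(data)


def sensitive_feature_value(html: str, sensitive: Set[str],
                            segment: Callable[[str], List[str]]) -> float:
    parser = TagTextExtractor()
    parser.feed(html)
    i = len(parser.texts)
    if i == 0:
        return 0.0  # beta is defined as 0 when there is no effective text
    j = 0
    for text in parser.texts:
        # First pass: whole entry; second pass: any segmented keyword.
        if text.lower() in sensitive or any(t in sensitive for t in segment(text)):
            j += 1
    return j / i  # assumed form of beta


page = ('<h1>Online Banking</h1><a href="#">transfer</a>'
        '<span>weather</span><p>ignored transfer</p><span>  </span>')
beta = sensitive_feature_value(page, {"transfer", "banking"},
                               lambda s: s.lower().split())
```

Nested target tags, such as a `span` inside an `a`, are merged into one entry here so that the same visible text is not counted twice.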
Step 207: Invoke an automated testing tool to open a browser automatically and load the webpage under test, and resize the browser window to 800*600, which is large enough to contain the Logo icon of the webpage. Invoke a screenshot tool to capture automatically a 600*250 region at the top of the browser page (see Fig. 3); this region contains the Logo icons of all webpages under test within a minimal area, and a minimal Logo screenshot reduces the feature point computation and lowers the misjudgment rate of financial phishing webpage detection.
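The geometry of step 207 can be expressed as a small Python helper that returns the crop box for the Logo region. The horizontal position of the 600*250 region is shown only in Fig. 3, so the `left` and `top` offsets below are hypothetical parameters; the browser automation and the actual image crop are deliberately left out.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower) in pixels


def logo_crop_box(window: Tuple[int, int] = (800, 600),
                  region: Tuple[int, int] = (600, 250),
                  left: int = 0, top: int = 0) -> Box:
    """Return the crop box of the Logo region inside the browser window."""
    win_w, win_h = window
    reg_w, reg_h = region
    right, lower = left + reg_w, top + reg_h
    if left < 0 or top < 0 or right > win_w or lower > win_h:
        raise ValueError("crop region does not fit inside the browser window")
    return (left, top, right, lower)


box = logo_crop_box()
```

The returned tuple uses the (left, upper, right, lower) order that common imaging libraries expect for a crop, so it can be handed to such a call once a full-window screenshot is available.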
Step 208: Extract the feature point data D0 of the Logo screenshot with an image feature point extraction algorithm, and obtain the number of feature points k; this algorithm is the same as the feature point extraction algorithm used to build the webpage Logo picture feature point rule library. Take the feature point data D of one Logo picture from the financial Logo picture feature point rule library; D has m feature points. Use an image feature point matching algorithm to calculate the number n of feature points matched between D0 and D (n ≤ k and n ≤ m), and calculate the similarity γ of the two Logo pictures [formula not reproduced in the source text].
The magnitude of γ reflects the degree of similarity between the Logo of the webpage under test and the Logos of financial webpages. If the similarity is greater than a certain threshold γ*, the page is judged to be a financial phishing webpage; otherwise, the webpage under test is treated as a normal webpage. To improve detection efficiency, the similarity calculation over Logo pictures stops as soon as some γ > γ* is found. Normal financial webpages have already been filtered out by the domain name whitelist or the domain name activity check. γ* needs to be trained on financial phishing webpages, normal financial webpages, and normal non-financial webpages so that the detection rate and misjudgment rate reach their optimum. In addition, the name of the financial institution impersonated by the financial phishing webpage can be determined from the information in the financial Logo picture feature point rule library.
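The matching loop of step 208, including the early stop, can be sketched in Python as below. Several parts are assumptions: the formula for γ is not reproduced in this text, so the sketch uses γ = n / min(k, m); feature points are represented as plain coordinate tuples rather than the descriptors of a real extraction algorithm; and matching is a one-to-one nearest-neighbour search under a hypothetical distance cutoff `max_dist`.

```python
import math
from typing import Dict, List, Optional, Sequence

Descriptor = Sequence[float]


def count_matches(d0: List[Descriptor], d: List[Descriptor], max_dist: float = 0.1) -> int:
    """One-to-one nearest-neighbour matching; returns n with n <= k and n <= m."""
    used = set()
    n = 0
    for q in d0:
        best, best_dist = None, max_dist
        for idx, p in enumerate(d):
            if idx in used:
                continue
            dist = math.dist(q, p)
            if dist <= best_dist:
                best, best_dist = idx, dist
        if best is not None:
            used.add(best)
            n += 1
    return n


def logo_similarity(d0: List[Descriptor], d: List[Descriptor]) -> float:
    k, m = len(d0), len(d)
    if k == 0 or m == 0:
        return 0.0
    return count_matches(d0, d) / min(k, m)  # assumed form of gamma


def match_logo(d0: List[Descriptor], library: Dict[str, List[Descriptor]],
               gamma_star: float) -> Optional[str]:
    """Return the first library entry whose similarity exceeds gamma_star."""
    for name, d in library.items():
        if logo_similarity(d0, d) > gamma_star:
            return name  # early stop: remaining entries are not compared
    return None


screenshot = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
rules = {
    "bank_a": [(9.0, 9.0), (8.0, 8.0)],
    "bank_b": [(0.0, 0.05), (1.0, 1.0), (3.0, 3.0), (7.0, 7.0)],
}
target = match_logo(screenshot, rules, gamma_star=0.5)
```

Returning the name of the matching library entry mirrors the point that the rule library also identifies which institution is being impersonated.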
Finally, it should be noted that the preferred embodiments above serve only to illustrate the technical solution of the invention and not to limit it. Although the invention has been described in detail through the preferred embodiments above, those skilled in the art will understand that various changes may be made to it in form and in detail without departing from the scope defined by the claims of the invention.

Claims (5)

1. A financial phishing webpage detection method based on Web page features, characterized in that the execution of the method relies on a pre-established first Title keyword library, second Title keyword library, sensitive keyword library, and webpage Logo picture feature point rule library for the financial sector, and the method specifically comprises the following steps:
S1: Obtain the HTML of the webpage under test with a crawler, extract the text in the Title tag, and calculate the matching degree between that text and the first and second Title keyword libraries; if the matching degree is greater than a threshold, judge the webpage under test to be a phishing webpage; otherwise, proceed to step S2 for further detection;
S2: Extract the text in specific tags of the webpage under test, count the matches between that text and the sensitive keyword library, and calculate the Web sensitive feature value; if the feature value is greater than a threshold, judge the webpage under test to be a phishing webpage; otherwise, proceed to step S3 for further detection;
S3: Take a fixed-position screenshot of the webpage under test, such that the screenshot contains the Logo picture of the webpage within as small an area as possible;
S4: Extract the feature points of the Logo screenshot, compare them with the webpage Logo picture feature point rule library, and calculate the similarity of the two Logo pictures from the number of matched feature points; if the similarity is greater than a threshold, judge the webpage under test to be a phishing webpage; otherwise, judge it to be a normal webpage.
2. The financial phishing webpage detection method based on Web page features according to claim 1, characterized in that in step S1, obtaining the HTML of the webpage under test with a crawler, extracting the text in the Title tag, and calculating the matching degree between that text and the first and second Title keyword libraries specifically comprises:
S11: Use a web crawler tool to obtain the Title text of the Web page at the URL under test; preprocess the Title text by removing interfering content such as spaces and underscores; segment the preprocessed Title text into words with a word segmentation technique, obtaining the number of segmented words N0;
S12: Match the keywords obtained from segmentation against the first Title keyword library; if the number of matches N1 is not less than 1, match them against the second Title keyword library as well, obtaining the number of matches N2;
S13: The two rounds of matching give the total number of keyword matches, from which the Title keyword matching degree α is defined [formula not reproduced in the source text]. The magnitude of α represents the degree of similarity between the Title of the webpage under test and the Titles of financial webpages. The specific target impersonated by the financial phishing webpage can be determined from the matches against the first Title keyword library.
3. The financial phishing webpage detection method based on Web page features according to claim 1, characterized in that in step S2, extracting the text in specific tags of the webpage under test, counting the matches between that text and the sensitive keyword library, and calculating the Web sensitive feature value specifically comprises:
S21: Obtain the text of specific HTML tags of the webpage under test, namely the text in a tags, h tags, and span tags; preprocess the obtained tag text, and extract the effective text entries and their count i;
S22: First match each complete text entry against the sensitive keyword library; if it matches, move on to the next entry; if it does not match, segment the entry into words and match the resulting keywords against the sensitive keyword library a second time; as soon as one keyword matches, move on to the next entry;
S23: The one or two rounds of matching give the number j (j ≤ i) of matched text entries, from which the Web sensitive feature value β is defined [formula not reproduced in the source text]. The magnitude of β reflects the degree of similarity between the text features of the webpage under test and those of financial webpages.
4. The financial phishing webpage detection method based on Web page features according to claim 1, characterized in that in step S3, taking a fixed-position screenshot of the webpage under test, such that the screenshot contains the Logo icon of the webpage within as small an area as possible, specifically comprises:
S31: Invoke an automated testing tool to open a browser automatically and load the webpage under test, and resize the browser window to 800*600, which is large enough to contain the Logo of the webpage;
S32: Invoke a screenshot tool to capture automatically a 600*250 region at the top of the browser page; this region contains the Logo icons of all webpages under test within a minimal area, and a minimal Logo screenshot reduces the feature point computation and lowers the misjudgment rate of financial phishing webpage detection.
5. The financial phishing webpage detection method based on Web page features according to claim 1, characterized in that in step S4, extracting the feature points of the Logo screenshot, comparing them with the webpage Logo icon feature point rule library, and calculating the similarity of the two Logo pictures from the number of matched feature points specifically comprises:
S41: Extract the feature point data D0 of the Logo screenshot with an image feature point extraction algorithm, and obtain the number of feature points k;
S42: Take the feature point data D of one Logo picture from the financial Logo picture feature point rule library; D has m feature points;
S43: Use an image feature point matching algorithm to calculate the number n of feature points matched between D0 and D (n ≤ k and n ≤ m);
S44: Define the similarity γ of the two Logo pictures [formula not reproduced in the source text]. The specific target impersonated by the financial phishing webpage is determined from the information in the financial Logo picture feature point rule library.
CN201610933083.9A 2016-10-31 2016-10-31 Financial fishing webpage detection method based on Web page characteristics Pending CN106453351A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610933083.9A CN106453351A (en) 2016-10-31 2016-10-31 Financial fishing webpage detection method based on Web page characteristics

Publications (1)

Publication Number Publication Date
CN106453351A true CN106453351A (en) 2017-02-22

Family

ID=58177519

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610933083.9A Pending CN106453351A (en) 2016-10-31 2016-10-31 Financial fishing webpage detection method based on Web page characteristics

Country Status (1)

Country Link
CN (1) CN106453351A (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107329981A (en) * 2017-06-01 2017-11-07 北京京东尚科信息技术有限公司 The method and apparatus of page detection
CN108595583A (en) * 2018-04-18 2018-09-28 平安科技(深圳)有限公司 Dynamic chart class page data crawling method, device, terminal and storage medium
CN108965245A (en) * 2018-05-31 2018-12-07 国家计算机网络与信息安全管理中心 Detection method for phishing site and system based on the more disaggregated models of adaptive isomery
CN108959928A (en) * 2018-06-29 2018-12-07 北京奇虎科技有限公司 A kind of detection method, device, equipment and the storage medium at webpage back door
CN109101657A (en) * 2018-08-30 2018-12-28 杭州安恒信息技术股份有限公司 Multiple level marketing referrer website identification method, device and equipment
CN110650108A (en) * 2018-06-26 2020-01-03 深信服科技股份有限公司 Fishing page identification method based on icon and related equipment
CN111107048A (en) * 2018-10-29 2020-05-05 中移(苏州)软件技术有限公司 Phishing website detection method and device and storage medium
CN111400705A (en) * 2020-03-04 2020-07-10 支付宝(杭州)信息技术有限公司 Application program detection method, device and equipment
CN111541683A (en) * 2020-04-20 2020-08-14 杭州安恒信息技术股份有限公司 Risk website propaganda main body detection method, device, equipment and medium
CN112287198A (en) * 2020-10-28 2021-01-29 上海云信留客信息科技有限公司 Spam short message detection method based on crawler technology
CN113127715A (en) * 2021-03-04 2021-07-16 微梦创科网络科技(中国)有限公司 Method and system for identifying gambling-related information
CN113192081A (en) * 2021-04-30 2021-07-30 北京达佳互联信息技术有限公司 Image recognition method and device, electronic equipment and computer-readable storage medium
CN113657362A (en) * 2021-07-27 2021-11-16 浙江大学 Webpage-side floating window closed channel detection method based on computer vision
CN113723980A (en) * 2020-05-26 2021-11-30 北京达佳互联信息技术有限公司 Method and device for detecting advertisement landing page, electronic equipment and storage medium
CN115051817A (en) * 2022-01-05 2022-09-13 中国互联网络信息中心 Phishing detection method and system based on multi-mode fusion features
CN116366338A (en) * 2023-03-30 2023-06-30 北京微步在线科技有限公司 Risk website identification method and device, computer equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060123478A1 (en) * 2004-12-02 2006-06-08 Microsoft Corporation Phishing detection, prevention, and notification
CN103092861A (en) * 2011-11-02 2013-05-08 阿里巴巴集团控股有限公司 Method and system for selecting commodity representative picture
CN103324615A (en) * 2012-03-19 2013-09-25 哈尔滨安天科技股份有限公司 Method and system for detecting phishing website based on SEO (search engine optimization)
WO2014105919A1 (en) * 2012-12-27 2014-07-03 Microsoft Corporation Identifying web pages in malware distribution networks
CN104168293A (en) * 2014-09-05 2014-11-26 北京奇虎科技有限公司 Method and system for recognizing suspicious phishing web page in combination with local content rule base
CN104462152A (en) * 2013-09-23 2015-03-25 深圳市腾讯计算机系统有限公司 Webpage recognition method and device

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107329981B (en) * 2017-06-01 2021-05-25 北京京东尚科信息技术有限公司 Page detection method and device
CN107329981A (en) * 2017-06-01 2017-11-07 北京京东尚科信息技术有限公司 The method and apparatus of page detection
CN108595583A (en) * 2018-04-18 2018-09-28 平安科技(深圳)有限公司 Dynamic chart class page data crawling method, device, terminal and storage medium
CN108965245A (en) * 2018-05-31 2018-12-07 国家计算机网络与信息安全管理中心 Detection method for phishing site and system based on the more disaggregated models of adaptive isomery
CN110650108A (en) * 2018-06-26 2020-01-03 深信服科技股份有限公司 Fishing page identification method based on icon and related equipment
CN108959928A (en) * 2018-06-29 2018-12-07 北京奇虎科技有限公司 A kind of detection method, device, equipment and the storage medium at webpage back door
CN109101657A (en) * 2018-08-30 2018-12-28 杭州安恒信息技术股份有限公司 Multiple level marketing referrer website identification method, device and equipment
CN111107048A (en) * 2018-10-29 2020-05-05 中移(苏州)软件技术有限公司 Phishing website detection method and device and storage medium
CN111107048B (en) * 2018-10-29 2021-11-30 中移(苏州)软件技术有限公司 Phishing website detection method and device and storage medium
CN111400705B (en) * 2020-03-04 2023-03-14 支付宝(杭州)信息技术有限公司 Application program detection method, device and equipment
CN111400705A (en) * 2020-03-04 2020-07-10 支付宝(杭州)信息技术有限公司 Application program detection method, device and equipment
CN111541683A (en) * 2020-04-20 2020-08-14 杭州安恒信息技术股份有限公司 Risk website propaganda main body detection method, device, equipment and medium
CN111541683B (en) * 2020-04-20 2022-04-19 杭州安恒信息技术股份有限公司 Risk website propaganda main body detection method, device, equipment and medium
CN113723980A (en) * 2020-05-26 2021-11-30 北京达佳互联信息技术有限公司 Method and device for detecting advertisement landing page, electronic equipment and storage medium
CN112287198A (en) * 2020-10-28 2021-01-29 上海云信留客信息科技有限公司 Spam short message detection method based on crawler technology
CN112287198B (en) * 2020-10-28 2023-12-01 上海云信留客信息科技有限公司 Junk short message detection method based on crawler technology
CN113127715A (en) * 2021-03-04 2021-07-16 微梦创科网络科技(中国)有限公司 Method and system for identifying gambling-related information
CN113192081A (en) * 2021-04-30 2021-07-30 北京达佳互联信息技术有限公司 Image recognition method and device, electronic equipment and computer-readable storage medium
CN113657362A (en) * 2021-07-27 2021-11-16 浙江大学 Webpage-side floating window closed channel detection method based on computer vision
CN113657362B (en) * 2021-07-27 2023-09-29 浙江大学 Webpage end floating window closing channel detection method based on computer vision
CN115051817B (en) * 2022-01-05 2023-11-24 中国互联网络信息中心 Phishing detection method and system based on multi-mode fusion characteristics
CN115051817A (en) * 2022-01-05 2022-09-13 中国互联网络信息中心 Phishing detection method and system based on multi-mode fusion features
CN116366338A (en) * 2023-03-30 2023-06-30 北京微步在线科技有限公司 Risk website identification method and device, computer equipment and storage medium
CN116366338B (en) * 2023-03-30 2024-02-06 北京微步在线科技有限公司 Risk website identification method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN106453351A (en) Financial fishing webpage detection method based on Web page characteristics
CN106789888B (en) Multi-feature fusion phishing webpage detection method
CN103559235B (en) A kind of online social networks malicious web pages detection recognition methods
US8695100B1 (en) Systems and methods for electronic fraud prevention
CN103544436B (en) System and method for distinguishing phishing websites
CN104077396B (en) Method and device for detecting phishing website
CN104462152B (en) A kind of recognition methods of webpage and device
US8000504B2 (en) Multimodal classification of adult content
CN104899508B (en) A kind of multistage detection method for phishing site and system
CN108337255B (en) Phishing website detection method based on web automatic test and width learning
CN105718577B (en) Method and system for automatically detecting phishing aiming at newly added domain name
CN105824822A (en) Method clustering phishing page to locate target page
CN112541476B (en) Malicious webpage identification method based on semantic feature extraction
Dadkhah et al. An introduction to journal phishings and their detection approach
CN109918621A (en) Newsletter archive infringement detection method and device based on digital finger-print and semantic feature
CN102999638A (en) Phishing website detection method excavated based on network group
CN109347786A (en) Detection method for phishing site
CN106230835A (en) Method based on the anti-malicious access that Nginx log analysis and IPTABLES forward
Khan Detection of phishing websites using deep learning techniques
CN110532784A (en) A kind of dark chain detection method, device, equipment and computer readable storage medium
Shyni et al. A multi-classifier based prediction model for phishing emails detection using topic modelling, named entity recognition and image processing
CN108694325B (en) Method and device for identifying specified type of website
Mohammed et al. Phishing Detection Using Machine Learning Algorithms
Kaur et al. Five-tier barrier anti-phishing scheme using hybrid approach
Montaruli et al. Raze to the Ground: Query-Efficient Adversarial HTML Attacks on Machine-Learning Phishing Webpage Detectors

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170222
