CN101534306B - Method and device for detecting phishing websites - Google Patents


Info

Publication number
CN101534306B
Authority
CN
China
Prior art keywords
website
page
page info
characteristic
reference feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2009101065591A
Other languages
Chinese (zh)
Other versions
CN101534306A (en
Inventor
林世飞
杨勇
马松松
陈欢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Tencent Computer Systems Co Ltd
Original Assignee
Shenzhen Tencent Computer Systems Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Tencent Computer Systems Co Ltd
Priority to CN2009101065591A
Publication of CN101534306A
Application granted
Publication of CN101534306B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a method and a device for detecting phishing websites. The method comprises the following steps: a detection device obtains the final page information of a website according to a received URL request; parses the final page information into a DOM tree structure and builds an index in the DOM tree according to the tags of the final page information; extracts the features of the final page information from the DOM tree according to the built index, traverses a preset feature database, matches the features of the final page information against the reference features in the database, and outputs the reference features that are hit; and traverses a preset model database, looks up the hit reference model in the model database according to the hit reference features, and determines whether the website is a phishing website. Because the invention examines the page information of the website, it avoids the false positives that arise when page information is ignored and improves the accuracy of phishing website detection.

Description

A method and device for detecting phishing websites
Technical field
The present invention relates to the field of Internet technology, and in particular to a method and device for detecting phishing websites.
Background
With the spread of the Internet, more and more users communicate and conduct business transactions online, and Internet services such as e-commerce, online banking and online games have grown accordingly. When visiting a website, a user typically has to enter information such as an account name and password; if that information is correct, the user can enter the website and operate it online. The account name and password are the only credentials that identify the user to these websites, so anyone who steals them can impersonate the user and may harm the user's interests. As shown in Fig. 1, some attackers use phishing websites to present users with pages that closely resemble a genuine website, trick them into entering their account name and password, and thereby steal those credentials. A phishing website usually has only one or a few pages and a fairly simple structure, and phishing websites imitate genuine websites in many ways, the most common of which are the following:
1. Imitating the domain name of the genuine website
The attacker registers a domain name similar to that of the genuine website in order to imitate the genuine website's URL (Uniform Resource Locator), changing the letter case of the URL or using special characters so that users mistake it for the genuine website. For example, "paypaI.com" imitates "paypal.com", "barcIays.com" imitates "barclays.com", and "www.1cbc.com.cn" imitates "www.icbc.com.cn". The attacker may also build a fake domain name by embedding part of the genuine domain name in it: for example, "ebay-members-security.com" imitates "ebay.com" and "users-paypal.com" imitates "paypal.com".
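The two imitation tricks described above (look-alike characters and embedding the genuine name) can be illustrated with a minimal Python sketch. It is not part of the patent: the `CONFUSABLE` map, the function names and the rule that the first label after "www." is the brand name are all assumptions made for illustration.

```python
# Hypothetical map of characters that are easily confused on screen.
# After lowercasing, "I" becomes "i", so i/l/1 all collapse to "l".
CONFUSABLE = str.maketrans({"i": "l", "1": "l", "0": "o"})


def skeleton(domain: str) -> str:
    """Lowercase the domain and collapse look-alike characters into one class."""
    return domain.lower().translate(CONFUSABLE)


def brand_of(domain: str) -> str:
    """Naive assumption: the brand is the first label once 'www.' is dropped."""
    name = domain.lower()
    if name.startswith("www."):
        name = name[4:]
    return name.split(".")[0]


def looks_like(candidate: str, genuine: str) -> bool:
    """True if candidate is not the genuine domain but visually matches it
    or embeds the genuine brand name inside a longer domain."""
    if candidate.lower() == genuine.lower():
        return False
    same_shape = skeleton(candidate) == skeleton(genuine)
    embeds_brand = brand_of(genuine) in candidate.lower()
    return same_shape or embeds_brand
```

A check of this kind would also flag legitimate subdomains of the genuine site, so in practice it could only serve as one signal among several.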
2. Displaying the URL as an IP address
The attacker displays the URL of the phishing website as an IP address, such as http://210.93.131.250. Because the URLs of many genuine websites also contain opaque numbers that are hard to interpret, a phishing URL displayed as an IP address is not easily noticed.
Because these phishing attacks exist on the Internet and can harm users' interests, phishing websites need to be detected among the many websites in order to protect users. Existing phishing website detection methods mainly include the following:
A. Detection based on the IP address of the website. When the URL of a phishing website uses an IP address in place of a host name, the host usually has not registered a domain name. A reverse lookup of the website's IP address through DNS (Domain Name System) can therefore be used to obtain its domain name; if the lookup returns no domain name, the website may be a phishing website.
B. Detection based on the domain name of the website. An address list of genuine websites is built that contains the domain names of all genuine websites (this list is also called a whitelist). The domain name of the website is compared with the domain names in the list to decide whether the website is a phishing website.
C. Detection based on the port of the website. Port 80 is opened for HTTP (HyperText Transfer Protocol), the protocol mainly used to transfer information for the WWW (World Wide Web) service. To prevent websites, including phishing websites, from being set up privately, ISPs (Internet Service Providers) close port 80 for individual users, so attackers set up phishing websites on ports other than 80. The port number of a website can therefore also be checked: if the website uses a port other than 80, it may be a phishing website.
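As a rough illustration of heuristics A to C, the following Python sketch inspects only the URL itself. It is a simplification under stated assumptions: instead of the reverse DNS lookup described in A it merely checks whether the host is an IP literal, the `WHITELIST` contents are hypothetical, and a URL without an explicit port is treated as port 80 (plain HTTP, as in the text).

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical whitelist of genuine domain names (heuristic B).
WHITELIST = {"paypal.com", "www.icbc.com.cn"}


def url_heuristics(url: str) -> dict:
    """Return three coarse warning flags derived from the URL alone."""
    parts = urlsplit(url)
    host = parts.hostname or ""

    # Heuristic A (simplified): is the host an IP literal rather than a name?
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    # Heuristic C: no explicit port is treated as 80 (assumes plain HTTP).
    port = parts.port if parts.port is not None else 80

    return {
        "host_is_ip": host_is_ip,
        "not_whitelisted": host not in WHITELIST,
        "non_80_port": port != 80,
    }
```

None of these flags looks at the page itself, which is the limitation the patent goes on to criticize.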
D. Detection based on the page information of the website and on visual similarity. Analysis based on page information detects phishing through identity information, such as the website name and the name of the owning organization, obtained from the HTML (HyperText Markup Language) of the web page. In a genuine website, the HTML that carries identity information such as the website name and the owning organization usually includes the <TITLE> tag, the CONTENT attribute of the <META NAME="KEYWORDS"/"DESCRIPTION"/"COPYRIGHT"> tag, and the ALT attribute of the <IMG/INPUT/AREA/OBJECT> tags. To attract visits, a phishing website keeps the identity information of the genuine website unchanged, either quoting the genuine website's page information or modifying a copy of the genuine website. The page information of a phishing website is therefore very similar to that of the genuine website, and examining a website's page information can reveal whether it is a phishing website. Websites that bear little resemblance to the genuine website are not a concern, because users recognize that they are not the site they want and do not interact with them. Visual similarity mainly uses a multi-fingerprint comparison algorithm: the fingerprint sets of the website under test and of the genuine website are computed and compared. The greater the overlap between the two fingerprint sets, the more similar the two websites are considered to be, and the more closely a website's page information resembles that of the genuine website, the more likely it is to be a phishing website.
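The text does not specify the multi-fingerprint algorithm, so the sketch below substitutes a common stand-in: hashed word shingles compared by Jaccard overlap. The shingle length `k`, the use of MD5 and the 8-character truncation are illustrative assumptions, not details from the patent.

```python
import hashlib


def fingerprints(text: str, k: int = 5) -> set:
    """Hash every k-word shingle of the text into a short fingerprint."""
    words = text.lower().split()
    count = max(1, len(words) - k + 1)
    shingles = [" ".join(words[i:i + k]) for i in range(count)]
    return {hashlib.md5(s.encode("utf-8")).hexdigest()[:8] for s in shingles}


def overlap(a: str, b: str, k: int = 5) -> float:
    """Jaccard overlap of the two fingerprint sets, between 0.0 and 1.0."""
    fa, fb = fingerprints(a, k), fingerprints(b, k)
    return len(fa & fb) / len(fa | fb)
```

A higher overlap means the tested page text is closer to the genuine page, which under the reasoning above makes it more suspicious when the domain is not the genuine one.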
Although the above detection methods exist, in the course of making the present invention the inventors found at least the following problems in the prior art. The methods based on the IP address, the domain name and the port of a website cannot examine the page information of the website, so false positives may occur in practice. The existing methods based on page information, in turn, address only specific problems, are not general, and cannot handle newly emerging phishing websites in real time.
Summary of the invention
The invention provides a method and device for detecting phishing websites, which determine whether a website is a phishing website by matching the website to be checked against a feature database and a model database.
The invention provides a method for detecting phishing websites, comprising the following steps:
a. obtaining the final page information of a website according to a received Uniform Resource Locator (URL) request;
b. parsing the final page information into a Document Object Model (DOM) tree structure, and building an index in the DOM tree according to the tags of the final page information;
c. according to the built index, extracting the features of the final page information from the DOM tree, traversing a preset feature database, matching the features of the final page information against the reference features in the feature database, and outputting the reference features that are hit;
d. traversing a preset model database, looking up the hit reference model in the model database according to the hit reference features, and determining whether the website is a phishing website.
Preferably, the page information comprises a header and a body,
and step a further comprises:
determining, from the header and body of the website's page information, that the page information has a redirect relationship;
determining the URL of the destination website, obtaining the page information of the destination website, and obtaining the final page information from the page information of the destination website, the destination website being the website reached by following the redirect relationship of the website's page information.
Preferably, determining that the page information has a redirect relationship comprises determining that the website redirects through a meta tag or a 30X message.
Step a further comprises:
determining, from the header and body of the website's page information, that the page information has an inclusion relationship;
determining the URL of each sub-page in the inclusion relationship, obtaining the page information of the sub-pages, and obtaining the final page information from the page information of the sub-pages.
Preferably, the page information is described by a HyperText Markup Language (HTML) document, and the HTML document comprises the tags that make up the website;
determining that the page information has an inclusion relationship comprises determining that sub-pages are included through iframe or frameset tags.
Preferably, the page information is described by an HTML document, the HTML document comprises the tags that make up the website, and the HTML document contains multiple types of page information,
and step b further comprises:
parsing the HTML document containing the page information into a DOM tree, the DOM tree comprising multiple kinds of nodes, each node corresponding to one type of page information in the HTML document;
building an index according to the tags of each type of page information in the HTML document.
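One way to picture step b is a parser that builds a small tree and, at the same time, a tag-to-nodes index. The sketch below uses Python's standard `html.parser`; the `Node` class, the `VOID_TAGS` set and the function name `build_index` are assumptions introduced for illustration and are not taken from the patent.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not become the open node.
VOID_TAGS = {"meta", "img", "input", "br", "hr", "link", "area", "base", "frame"}


class Node:
    """Minimal DOM-like node: tag name, attributes, parent, children, text."""
    def __init__(self, tag, attrs, parent=None):
        self.tag, self.attrs, self.parent = tag, dict(attrs), parent
        self.children, self.text = [], ""


class DomIndexer(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#document", [])
        self.current = self.root
        self.index = defaultdict(list)   # tag name -> nodes in document order

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        self.index[tag].append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Climb to the nearest open ancestor with this tag, then close it.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data


def build_index(html: str):
    """Parse html and return (root node, tag index)."""
    parser = DomIndexer()
    parser.feed(html)
    parser.close()
    return parser.root, parser.index
```

With such an index, a later feature-extraction step can jump straight to, say, all `input` or `title` nodes without walking the whole tree.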
Preferably, before step c, the method further comprises:
building the feature database, which specifically comprises:
extracting the features of known phishing websites;
determining the weight of each feature of the known phishing websites according to the frequency with which the feature appears in the phishing websites and the proportion of the known phishing websites in which it appears;
storing the features of the known phishing websites and their weights in the feature database, thereby building the feature database; the features stored in the feature database are the reference features, and the corresponding weights are the weights of the reference features.
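The patent names two inputs to the weight (how often a feature occurs and what share of known phishing websites contain it) but gives no formula. The sketch below simply averages the two shares; that formula, the feature names and the function name are assumptions chosen only to make the idea concrete.

```python
from collections import Counter


def build_feature_db(samples):
    """samples: one list of extracted features per known phishing page.

    Returns {feature: weight}. The weight formula (mean of occurrence share
    and page share) is an illustrative assumption, not the patent's formula.
    """
    occurrences = Counter(f for page in samples for f in page)
    pages_with = Counter(f for page in samples for f in set(page))
    total = sum(occurrences.values())
    return {
        f: 0.5 * occurrences[f] / total + 0.5 * pages_with[f] / len(samples)
        for f in occurrences
    }
```

The resulting dictionary plays the role of the feature database: its keys are the reference features and its values are their weights.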
Preferably, before step d, the method further comprises:
classifying the reference features by phishing website type;
forming one reference model from the reference features of each type;
building the model database of phishing websites from the reference models.
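The three steps above amount to grouping reference features by phishing type. A minimal sketch, assuming the classification is already available as (type, feature) pairs; the type and feature names are hypothetical:

```python
from collections import defaultdict


def build_model_db(typed_features):
    """typed_features: iterable of (phishing_type, reference_feature) pairs.

    Returns {phishing_type: set of reference features}, i.e. one reference
    model per phishing website type.
    """
    models = defaultdict(set)
    for phishing_type, feature in typed_features:
        models[phishing_type].add(feature)
    return dict(models)
```

Keeping the models as plain data is in line with the stated goal of describing features and models in configuration so that new phishing types can be added quickly.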
The invention also provides a device for detecting phishing websites, comprising:
a page acquisition module, used to obtain the final page information of a website according to a received URL request;
a page information processing module, connected to the page acquisition module and used to parse the final page information into a DOM tree structure and to build an index in the DOM tree according to the tags of the final page information;
a feature matching module, connected to the page information processing module and used to extract the features of the final page information from the DOM tree according to the built index, traverse a preset feature database, match the features of the final page information against the reference features in the feature database, and output the reference features that are hit;
a model matching module, connected to the feature matching module and used to traverse a preset model database, look up the hit reference model in the model database according to the hit reference features, and determine whether the website is a phishing website.
Preferably, the page information comprises a header and a body, and the page acquisition module is further used to determine, from the header and body of the website's page information, that the page information has a redirect relationship, determine the URL of the destination website, obtain the page information of the destination website, and obtain the final page information from the page information of the destination website, the destination website being the website reached by following the redirect relationship of the website's page information.
Preferably, the page acquisition module is further used to determine, from the header and body of the website's page information, that the page information has an inclusion relationship, determine the URL of each sub-page in the inclusion relationship, obtain the page information of the sub-pages, and obtain the final page information from the page information of the sub-pages.
Preferably, the page information is described by an HTML document, the HTML document comprises the tags that make up the website, and the HTML document contains multiple types of page information,
and the page information processing module comprises:
a parsing submodule, used to parse the HTML document containing the page information into a DOM tree, the DOM tree comprising multiple kinds of nodes, each node corresponding to one type of page information in the HTML document;
an indexing submodule, connected to the parsing submodule and used to build an index according to the tags of each type of page information in the HTML document.
Preferably, the device further comprises:
a feature database building module, connected to the feature matching module and used to build the phishing website feature database. Specifically, the feature database building module extracts the features of known phishing websites, determines the weight of each feature according to the frequency with which the feature appears in the phishing websites and the proportion of the known phishing websites in which it appears, and stores the features of the known phishing websites and their weights in the feature database, thereby building the feature database; the features stored in the feature database are the reference features, and the corresponding weights are the weights of the reference features.
Preferably, the device further comprises a model database building module, connected to the model matching module and the feature database building module and used to classify the reference features by phishing website type, form one reference model from the reference features of each type, and build the model database of phishing websites from the reference models.
Compared with the prior art, the invention has the following advantages. The invention evaluates the page information of a website as a whole and uses the result of that evaluation to decide whether the website is a phishing website. This avoids the false positives caused in existing detection methods by ignoring the website's page information and improves the accuracy of phishing website detection. In addition, because the features and models are described in configuration files, the system is readily extensible and can respond quickly to newly emerging phishing websites.
Description of drawings
Fig. 1 is a schematic diagram of a phishing website imitating a genuine website in the prior art;
Fig. 2 is a flow chart of the phishing website detection method provided by the invention;
Fig. 3 is a flow chart of the phishing website detection method provided by the invention as applied in practice;
Fig. 4 is a flow chart of the detection device of the invention obtaining page information;
Fig. 5 is a schematic diagram of the redirect and inclusion relationship analysis of website A according to the invention;
Fig. 6 is a schematic structural diagram of the DOM tree according to the invention;
Fig. 7 is a flow chart of a specific implementation of the method of the invention;
Fig. 8 is a schematic diagram of the view of page B according to the invention;
Fig. 9 is a schematic structural diagram of a phishing website detection device provided by the invention.
Embodiment
The core idea of the technical solution provided by the invention is to analyze the final page information of a website, extract the features of that final page information, match them against the reference features and reference models in a feature database and a model database built in advance, and decide from the matched features and reference models whether the website is a phishing website.
Specifically, the phishing website detection method provided by the invention, as shown in Fig. 2, comprises the following steps:
Step 201: the detection device obtains the final page information of the website according to the received Uniform Resource Locator (URL) request, where the page information comprises a header and a body.
Obtaining the final page information of the website further comprises:
determining, from the header and body of the website's page information, that the page information has a redirect relationship;
determining the URL of the destination website, obtaining the page information of the destination website, and obtaining the final page information from the page information of the destination website, the destination website being the website reached by following the redirect relationship of the website's page information. Determining that the page information has a redirect relationship comprises determining that the website redirects through a meta tag or a 30X message.
Obtaining the final page information of the website further comprises:
determining, from the header and body of the website's page information, that the page information has an inclusion relationship;
determining the URL of each sub-page in the inclusion relationship, obtaining the page information of the sub-pages, and obtaining the final page information from the page information of the sub-pages. The page information is described by an HTML document, the HTML document comprises the tags that make up the website, and determining that the page information has an inclusion relationship comprises determining that sub-pages are included through iframe or frameset tags.
Note that if the page information of the website has neither a redirect relationship nor an inclusion relationship, the page information of the website is itself the final page information.
If analysis of the obtained page information shows that it has a redirect relationship and/or an inclusion relationship, the final page information is obtained according to the redirect relationship and/or the inclusion relationship. Specifically:
when the analyzed page information has only a redirect relationship, the final page information is obtained according to the redirect relationship; or
when the analyzed page information has only an inclusion relationship, the final page information is obtained according to the inclusion relationship; or
when the analyzed page information has both a redirect relationship and an inclusion relationship, either the redirect relationship is analyzed first, then the inclusion relationship of the page information after the redirect, and the final page information is obtained according to that inclusion relationship; or the inclusion relationship is analyzed first, then the redirect relationship after the inclusion, and the final page information is obtained according to that redirect relationship.
For pages with many redirect or inclusion relationships, the procedure for obtaining the final page information is lengthy to describe whichever relationship is analyzed first, so it is not described in detail here; it nevertheless falls within the scope of protection of the invention.
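Although the text leaves the general case undescribed, the combination of redirects and inclusions can be sketched as a small recursion. This is an illustrative reading, not the patent's procedure: `analyse` is a hypothetical callback that returns, for a URL, a dict with an optional "redirect" target and an optional list of "frames" (sub-page URLs), and the result is the list of URLs whose content would make up the final page information.

```python
def final_pages(url, analyse, seen=None, depth=0, max_depth=10):
    """Follow redirects first, then descend into included sub-pages."""
    seen = set() if seen is None else seen
    if url in seen or depth > max_depth:
        return []                      # guard against redirect/inclusion loops
    seen.add(url)

    page = analyse(url)                # hypothetical: {"redirect": ..., "frames": [...]}
    if page.get("redirect"):
        # The source page is replaced by the destination page.
        return final_pages(page["redirect"], analyse, seen, depth + 1, max_depth)

    result = [url]                     # this page contributes its own content
    for child in page.get("frames", []):
        result += final_pages(child, analyse, seen, depth + 1, max_depth)
    return result
```

The `seen` set and `max_depth` limit are additions for safety, since a hostile page could redirect or nest indefinitely.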
Step 202: so that the features of the final page information can be extracted quickly during the subsequent matching, the HTML document describing the final page information is converted into a DOM tree structure and a search index is built on the DOM tree; the required features are then obtained from the DOM tree through the search index.
This step comprises: parsing the final page information into a DOM tree structure, and building an index in the DOM tree according to the tags of the final page information. The page information is described by an HTML document, the HTML document comprises the tags that make up the website, and the HTML document contains multiple types of page information. Specifically:
the DOM tree comprises multiple kinds of nodes, each node corresponding to one type of page information in the HTML document;
an index is built according to the tags of each type of page information in the HTML document.
Step 203: according to the built index, the features of the final page information are extracted from the DOM tree, the preset feature database is traversed, the features of the final page information are matched against the reference features in the feature database, and the reference features that are hit are output.
For this step to be usable in practice, the feature database is built in advance, before step 203. Once the feature database exists, its reference features are traversed and matched against the features of the final page information, and the reference features that are hit are output.
Building the feature database specifically comprises:
extracting the features of known phishing websites;
determining the weight of each feature of the known phishing websites according to the frequency with which the feature appears in the phishing websites and the proportion of the known phishing websites in which it appears;
storing the features of the known phishing websites and their weights in the feature database, thereby building the feature database; the features stored in the feature database are the reference features, and the corresponding weights are the weights of the reference features.
Step 204: according to the hit reference features and their weights, the preset model database is traversed and the hit reference model is looked up in the model database. When the hit reference features correspond to several reference models in the model database, the weights of the reference features are compared, the reference model closest to the website is selected, and it is determined whether the website is a phishing website.
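A minimal sketch of step 204 follows, assuming that "closest" means the model whose hit features carry the largest total weight and that a fixed `threshold` decides the final verdict. Both the scoring rule and the threshold value are assumptions; the patent does not state how the comparison is made.

```python
def match_model(hits, weights, models, threshold=0.5):
    """hits: set of hit reference features.
    weights: {feature: weight} from the feature database.
    models: {model_name: set of reference features} from the model database.

    Returns (best_model_name or None, is_phishing).
    """
    scores = {
        name: sum(weights.get(f, 0.0) for f in hits & features)
        for name, features in models.items()
    }
    scores = {name: s for name, s in scores.items() if s > 0}
    if not scores:
        return None, False             # no reference model was hit
    best = max(scores, key=scores.get)
    return best, scores[best] >= threshold
```

When several models share a hit feature, the summed weights break the tie in favor of the model that explains more of the page.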
For this step to be usable in practice, the reference model database is also built in advance, before step 204, which specifically comprises:
classifying the reference features by phishing website type;
forming one reference model from the reference features of each type;
building the model database of phishing websites from the reference models.
The invention evaluates the page information of the website as a whole and decides from the result of that evaluation whether the website is a phishing website. This avoids the false positives caused in existing detection methods by ignoring the website's page information and improves the accuracy of phishing website detection.
Based on the method described above, the workflow for detecting a phishing website in practice is shown in Fig. 3 and comprises the following steps:
Step 300: the user enters the URL of the website to be checked and requests detection of that website.
Step 301: the detection device obtains the page information of the website according to the received URL request.
As shown in Fig. 4, this step further comprises:
Step 3011: the detection device receives the URL detection request sent by the user;
Step 3012: in accordance with the HTTP protocol, the detection device sends a GET request message to the web server on the Internet, requesting the page information of the website;
Step 3013: after receiving the GET request message, the web server returns a response message that contains the page information of the website.
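Steps 3012 and 3013 can be sketched with Python's standard `http.client`, which does not follow redirects on its own, so a 30X response stays visible to the later redirect analysis. The function names and the default timeout are assumptions for illustration.

```python
import http.client
from urllib.parse import urlsplit


def request_target(url: str):
    """Split a URL into (scheme, host, port, path-with-query) for a GET."""
    parts = urlsplit(url)
    default_port = 443 if parts.scheme == "https" else 80
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.scheme, parts.hostname, parts.port or default_port, path


def fetch_page(url: str, timeout: float = 10.0):
    """Send one GET request and return (status, headers, body bytes)."""
    scheme, host, port, path = request_target(url)
    if scheme == "https":
        conn = http.client.HTTPSConnection(host, port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, dict(response.getheaders()), response.read()
    finally:
        conn.close()
```

The returned headers and body correspond to the header and body of the page information that the following steps analyze.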
Specifically, the page information is described by an HTML (HyperText Markup Language) document. An HTML document consists of descriptive text made up of HTML commands, which describe the text, graphics, animation, sound, tables and links of the website.
The structure of an HTML document comprises a head (Head) and a body (Body). The head describes the information required by the browser and corresponds to the header of the page information, which includes the protocol name, host name, port, object path and so on of the page information. The body contains the actual content to be presented and corresponds to the body of the page information, such as the HTML code seen when viewing the source of a web page. An HTML document comprises a series of tags that make up the website, and these tags form the HTML text document: some tags specify how the page is formatted, some specify how text on the page is displayed, and others provide information that is not displayed on the page. HTML tags generally occur in pairs: one tag opens the element and the other closes it.
For example, <hx> is a heading tag used to mark the headings of a document. Headings are divided by level into first-level headings <h1>, second-level headings <h2>, and so on, and each heading tag is paired with an end tag that closes it, such as </h1> and </h2>.
If the first-level heading is "Sports" and the second-level heading is "The Olympic Games", they are written in the HTML document as:
<h1>Sports</h1>
<h2>The Olympic Games</h2>
An HTML document includes the following kinds of tags:
Basic tags, used to create the HTML document, set the document title and other information not displayed on the web page, and define the visible part of the document;
Heading tags, used to set the headings of the document;
Document-wide attribute tags, used to set attributes of the document as a whole, such as the background color, text color, link color, visited-link color and active-link color;
Text tags, used to edit the text of the web page, for example to create text, set heading size, font type and color, or create quotations;
Link tags, used for operations related to hyperlinks, such as creating a hyperlink, creating a link that automatically sends e-mail, creating a named anchor inside the document, and creating a link to an anchor inside the document;
Formatting tags, used to lay out the web page, for example to create a new paragraph, set paragraph alignment, insert a line break, or create a definition list;
Graphic element tags, used to place graphics on the web page, for example to add an image, align an image, or set an image border;
Table tags, used to create tables on the web page, including creating a table, starting each row, starting each cell, and setting table headers;
Table attribute tags, used to set specific attributes of a table, including the table border, cell spacing, table width, cell alignment, and the number of columns and rows a cell spans;
Frame tags, used to work with frames, including setting the position of frames, defining the number of rows and columns of frames, defining a frame region, and defining what is shown in browsers that do not support frames;
Frame attribute tags, used to set specific attributes of a frame, including which HTML document the frame displays, naming the frame or region so that other frames can target it, defining the margin size of the frame, setting whether the frame has scroll bars, and preventing the user from resizing the frame.
Step 302: analyze the redirect relation of the website's page according to the page information of the website. A redirect is the process by which some action or code makes a page of a website automatically move to another page. Based on the page information of the website, the detection device analyzes the redirect relation that the page information has, determining whether the HTML document of the page information contains a redirect tag, for example whether there is a redirect through a meta tag (a tag used to describe attributes of an HTML document) or through a 30X message. If a redirect tag is present, the redirect is followed according to the tag and the flow goes to step 303; if no redirect tag is present, no redirect is performed and the flow goes to step 304.
Step 303: the website is called the source website and the website reached after the redirect is called the destination website; the URL of the destination website is calculated. The detection device calculates the URL of the destination website from the redirect tag and the URL of the source website, then returns to step 301 to obtain the page information of the destination website.
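As a rough illustration of how the destination URL in step 303 might be derived, the following Python sketch resolves a redirect target against the URL of the source website. The names `MetaRefreshFinder` and `destination_url`, the dictionary form of the headers, and the example URLs are assumptions introduced here; the patent does not prescribe an implementation.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshFinder(HTMLParser):
    """Records the target of the first <meta http-equiv="refresh"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag != "meta" or self.target is not None:
            return
        if (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=next.html".
        _, _, rest = (attrs.get("content") or "").partition(";")
        key, _, value = rest.strip().partition("=")
        if key.strip().lower() == "url" and value.strip():
            self.target = value.strip().strip("'\"")


def destination_url(source_url, status, headers, body):
    """Return the URL of the destination website, or None if no redirect is found."""
    # 30X message: the target is carried in the Location header.
    if 300 <= status < 400 and "Location" in headers:
        return urljoin(source_url, headers["Location"])
    # meta redirect: the target is carried in the HTML document itself.
    finder = MetaRefreshFinder()
    finder.feed(body)
    if finder.target:
        return urljoin(source_url, finder.target)
    return None
```

Relative targets are resolved with `urljoin`, which is one plausible reading of "calculating the URL from the redirect tag and the URL of the source website".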
Step 304: analyze the inclusion relation of the website. The detection device determines the inclusion relation of the website by analyzing whether the HTML document of the website's page information contains link tags, for example by analyzing nesting through the iframe (a document within a document) or frameset (a container of frame elements) tag. If such link tags are present, the website contains other sub-pages and the flow goes to step 305; if no such link tags are present, the website contains no other sub-pages and the flow goes to step 306.
Step 305: the page of the website is called the parent page, and each linked page that the parent page contains is called a sub-page; the parent page and its sub-pages are in a parent-child relationship. The parent-child relationship is saved and the URL of the sub-page is calculated. The flow then returns to step 301 to obtain the page information of the sub-page. If the website has several sub-pages, the URL of each sub-page is calculated separately and the page information of all sub-pages is obtained.
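The sub-page URL calculation of step 305 can be sketched in Python as follows. This is only a minimal illustration: the class `SubPageFinder`, the function `sub_page_urls`, and the example URLs are hypothetical, and only the `src` attribute of `iframe` and `frame` elements is considered.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SubPageFinder(HTMLParser):
    """Collects absolute URLs of pages embedded through iframe or frame tags."""

    def __init__(self, parent_url):
        super().__init__()
        self.parent_url = parent_url
        self.children = []

    def handle_starttag(self, tag, attrs):
        # <frame> elements are the children of a <frameset> container.
        if tag in ("iframe", "frame"):
            src = dict(attrs).get("src")
            if src:
                self.children.append(urljoin(self.parent_url, src))


def sub_page_urls(parent_url, html):
    """Return the URLs of all sub-pages of the parent page, in document order."""
    finder = SubPageFinder(parent_url)
    finder.feed(html)
    return finder.children
```

The returned list stands in for the saved parent-child relationship: every URL in it is a child of `parent_url`.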
Step 306: calculate the final page information according to the redirect relation of the website's page and the parent-child relationship of the page reached after the redirect.
Specifically, the final page information is obtained by finite recursion over steps 301 to 305, which yields the final redirected page and all sub-pages of that page. In the redirect relation, the final page information includes the page information of the destination website; in the parent-child relationship, it includes the page information of all sub-pages of the destination website's page. Therefore, after the redirect, based on the redirect relation of the website and the parent-child relationship of the redirected page, the final page information calculated by the detection device comprises the page information of the destination website and the page information of the sub-pages that the destination page contains.
To better understand this step, consider the example shown in Figure 5. Redirect analysis of website A finds a redirect to page B; the device jumps to page B and calculates the URL of page B. Redirect analysis continues on page B and, after a finite number of recursive redirects, page C is determined to be the final redirected page. After jumping to page C, the URL of page C is calculated, and analysis shows that page C has sub-pages C1, C2, and C3, whose page information is then obtained. The final page information obtained from the redirect relation and the inclusion relation thus comprises the page information of page C and of sub-pages C1, C2, and C3.
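The recursion over steps 301 to 305 can be sketched with the Figure 5 example. The in-memory `PAGES` table stands in for real page fetches, and `final_page_info`, `fake_fetch`, and the `max_depth` bound are assumptions made for illustration; the patent only states that the recursion is finite.

```python
# Hypothetical stand-in for fetched page information, mirroring Figure 5:
# A redirects to B, B redirects to C, and C contains sub-pages C1, C2, C3.
PAGES = {
    "A": {"redirect": "B", "children": []},
    "B": {"redirect": "C", "children": []},
    "C": {"redirect": None, "children": ["C1", "C2", "C3"]},
    "C1": {"redirect": None, "children": []},
    "C2": {"redirect": None, "children": []},
    "C3": {"redirect": None, "children": []},
}


def fake_fetch(url):
    """Return the (simulated) page information for a URL."""
    return PAGES[url]


def final_page_info(url, fetch, max_depth=10):
    """Follow redirects, then collect the final page and all of its sub-pages."""
    if max_depth == 0:
        # Assumed safeguard; endless loops are outside the scope of the patent.
        raise RecursionError("redirect or inclusion chain too deep")
    page = fetch(url)
    if page["redirect"]:
        return final_page_info(page["redirect"], fetch, max_depth - 1)
    result = [url]
    for child in page["children"]:
        result.extend(final_page_info(child, fetch, max_depth - 1))
    return result
```

For website A, the sketch returns page C followed by its three sub-pages, which corresponds to the final page information described in the example.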
Step 307: parse the final page information into a DOM tree structure and build a tag index.
So that the features of the final page information can be extracted quickly in the subsequent matching process, and to avoid working directly on the complex HTML source file, the HTML document describing the final page information is converted into a DOM tree structure, and a query index is built over the DOM tree; the required features are then obtained from the DOM tree through the query index.
DOM is an application programming interface (API) for documents. In an application, DOM converts the HTML document into a collection of object models; a programming language (such as JavaScript or C++) calls these document objects to access the information in the HTML document and to process it further. As shown in Figure 6, this collection of object models is called the DOM tree. The DOM tree contains parent nodes and child nodes located below the parent nodes. Depending on the type of final page information in the HTML document, the DOM tree contains different types of nodes; that is, each type of final page information in the HTML document corresponds to one type of DOM tree node.
The node types mainly include:
Element: an element node is the basic building block of the HTML document and describes the basic information of the HTML document.
Attribute: an attribute node contains information about an element node and describes the attributes of the element.
Text: a text node contains the text of element nodes and attribute nodes.
Document: the document node is the parent of all other nodes in the document; the application operates on the HTML document through this node.
Comment: a comment node describes and annotates other related nodes.
In the DOM tree, the whole document corresponds to the document node, each HTML tag corresponds to an element node, the text contained in an HTML element corresponds to a text node, each HTML attribute corresponds to an attribute node, and HTML comments correspond to comment nodes. The DOM tree starts at the document node and branches out from it until the text nodes are reached; text nodes are the lowest-level nodes in the DOM tree. The application accesses the final page information in the HTML document through the DOM tree structure and, by operating on the DOM, can dynamically create an HTML document and traverse, add to, delete from, or modify its content, thereby operating on the HTML document.
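The five node types can be observed with Python's standard `xml.dom.minidom` module. Note the assumption: `minidom` is an XML parser, so the snippet below only works for well-formed markup and merely illustrates the node types; it is not the parser used by the detection device.

```python
from xml.dom import Node, minidom

# A tiny well-formed document containing a comment, an attribute, and text.
doc = minidom.parseString('<html><!-- note --><body id="main">hello</body></html>')

html = doc.documentElement           # element node for the <html> tag
comment, body = html.childNodes      # comment node and <body> element node
text = body.firstChild               # text node holding "hello"
attribute = body.getAttributeNode("id")  # attribute node for id="main"

node_types = {
    "document": doc.nodeType == Node.DOCUMENT_NODE,
    "element": html.nodeType == Node.ELEMENT_NODE,
    "comment": comment.nodeType == Node.COMMENT_NODE,
    "text": text.nodeType == Node.TEXT_NODE,
    "attribute": attribute.nodeType == Node.ATTRIBUTE_NODE,
}
```

Every value in `node_types` is true for this document, and the text node is a leaf, matching the description of text nodes as the lowest level of the tree.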
Because the DOM tree structure is relatively loose, extracting the page information of even a few nodes would otherwise require traversing the whole DOM tree each time, which wastes time. An index is therefore built according to the tags of the HTML document; querying through the index narrows the range of the DOM tree that has to be searched and saves time.
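One way to picture such a tag index is a dictionary from tag name to the nodes carrying that tag, filled while the tree is built. The sketch below uses Python's `html.parser`; the classes `TreeNode` and `IndexedDomBuilder` and the list of void tags are assumptions for illustration and greatly simplify a real DOM.

```python
from collections import defaultdict
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # tags without an end tag


class TreeNode:
    def __init__(self, tag, attrs, parent):
        self.tag, self.attrs, self.parent = tag, dict(attrs), parent
        self.children = []
        self.text = ""


class IndexedDomBuilder(HTMLParser):
    """Builds a simplified tree and a tag index (tag name -> list of nodes)."""

    def __init__(self):
        super().__init__()
        self.root = TreeNode("#document", [], None)
        self.current = self.root
        self.index = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        node = TreeNode(tag, attrs, self.current)
        self.current.children.append(node)
        self.index[tag].append(node)  # index entry, so no later traversal is needed
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag and close it.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data.strip()
```

With the index in place, a query such as `builder.index["input"]` returns all input boxes directly instead of walking the whole tree.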
Step 308: match the features of the final page information against the reference features in the feature database and output the reference features that are hit.
Specifically, this step comprises: querying the index of the DOM tree and extracting, according to the index, the features of the final page information contained in the corresponding nodes of the DOM tree; loading the reference features of the feature database into the memory of the detection device; matching the features of the final page information against the reference features loaded in memory; and outputting the reference features that are hit together with their weights.
So that the features of the final page information can be matched against reference features, the method also comprises, before this step, setting up the feature database. Specifically:
The feature database is set up by taking the URLs of all phishing websites in a blacklist and extracting the features of the page information of these phishing websites. The blacklist is a list of the URLs of phishing websites that have already been detected; its counterpart is the whitelist, which is a list of the URLs of known genuine websites;
In addition, the weight of each feature is determined according to the frequency with which the phishing website feature occurs and the proportion of the website's information that the feature accounts for. The features and their weights are stored in the feature database in sorted order, which sets up the feature database. These features are the reference features used for matching, and the weight of a phishing website feature is the weight of the corresponding reference feature: the more frequently a reference feature occurs and the larger the share of information it accounts for in phishing websites, the higher its weight. Each reference feature is assigned an id value, and the reference features are sorted by id value.
For example, after the features of a website A are compared one by one with the reference features (a1, a2, a3, a4, a5), the reference features matched by website A and the weights of the matches are:
{(a1,6),(a2,1),(a3,4),(a4,8),(a5,0)}
It follows that, among the reference features matched by website A, the one with the largest weight is a4, with a weight of 8, followed by a1. The weight for a5 is 0, which means that no feature of website A matches reference feature a5.
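The matching result above can be reproduced with a short Python sketch. The function `match_features` and the stored weight of a5 are assumptions (the text only gives the output weight 0 for a5); the other weights are taken from the example.

```python
# Reference features and their weights; the stored weight of a5 is assumed.
REFERENCE_WEIGHTS = {"a1": 6, "a2": 1, "a3": 4, "a4": 8, "a5": 3}


def match_features(page_features, reference_weights):
    """Pair every reference feature with its weight if hit, otherwise with 0."""
    return [
        (feature, weight if feature in page_features else 0)
        for feature, weight in reference_weights.items()
    ]


# Features extracted from website A (a5 is absent).
result = match_features({"a1", "a2", "a3", "a4"}, REFERENCE_WEIGHTS)
strongest = max(result, key=lambda pair: pair[1])
```

Here `result` has the same shape as the set listed above, and `strongest` identifies a4 as the reference feature with the largest weight.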
Before step 308, the method may also comprise traversing a preset blacklist/whitelist and comparing the websites in the blacklist/whitelist with the website. If the website is in the blacklist, it is a phishing website; if it is not in the blacklist, it is not a known phishing website, and it may be either a phishing website not yet recorded in the blacklist or a genuine website. If the website is in the whitelist, it is a genuine website; if it is not in the whitelist, it is not a known genuine website, and it may be either a genuine website not recorded in the whitelist or a phishing website. If traversing the preset blacklist/whitelist shows that the website is a known genuine website or a known phishing website, the subsequent reference feature matching and reference model matching are not needed to decide whether the website is a phishing website, which simplifies the procedure. It should be noted that traversing the preset blacklist/whitelist and comparing its websites with the website is an optional step of the present solution; omitting it does not affect the protection scope of the invention.
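A minimal sketch of this optional pre-check is shown below. The list contents, the function name `precheck`, and the exact-match comparison of URLs are assumptions; the text does not say how URLs are compared.

```python
# Hypothetical example lists of known URLs.
BLACKLIST = {"http://phish.example/login"}
WHITELIST = {"http://genuine.example/"}


def precheck(url, blacklist=BLACKLIST, whitelist=WHITELIST):
    """Classify a URL from the lists alone; 'unknown' means matching must continue."""
    if url in blacklist:
        return "phishing"
    if url in whitelist:
        return "genuine"
    return "unknown"
```

Only the `"unknown"` outcome requires the feature matching and model matching of steps 308 to 310.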
After the hit reference features are output, the method also comprises entering the id values of the hit reference features into a hash table associated with the model database, for use during model matching. The hash table establishes the correspondence between the reference features in the feature database and the reference models in the model database.
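The correspondence kept in the hash table could, for instance, map each feature id to the reference models that use it. The layout below is an assumption, since the text does not describe the internal structure of the hash table, and the names `build_hash_table` and `candidate_models` are hypothetical.

```python
from collections import defaultdict

# Hypothetical model database: model name -> ids of its reference features.
MODELS = {"model_1": {1, 2, 3}, "model_2": {2, 3, 4}}


def build_hash_table(models):
    """Map each reference feature id to the models that contain it."""
    table = defaultdict(set)
    for name, feature_ids in models.items():
        for feature_id in feature_ids:
            table[feature_id].add(name)
    return table


def candidate_models(table, hit_ids):
    """Return the models referenced by at least one hit feature id."""
    found = set()
    for feature_id in hit_ids:
        found |= table.get(feature_id, set())
    return found
```

Looking up the hit ids in this table limits model matching to the models that share at least one feature with the page.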
Step 309: according to the reference features that are hit, select the reference model closest to the website and compare the similarity of the feature relations of the website's model and the reference model. Specifically: query the id values in the hash table to obtain the reference features corresponding to the id values; according to the hit reference features and their weights, choose from the model database the reference model most relevant to these reference features; and compare the similarity between this reference model and the website.
To make this step practicable, a model database is set up before this step. Specifically, for each type of phishing website, a reference model corresponding to that type is defined, which contains the reference features of that type of phishing website. All reference models are stored in the model database, which sets up the model database. The reference features contained in a reference model may be a single reference feature, a group of reference features, or a set of reference features combined by AND/OR operations.
To make the invention easier to understand, the example above is continued. Suppose the model database contains reference model 1 and reference model 2, where reference model 1 contains the reference features (a1, a2, a3) and reference model 2 contains the reference features (a2, a3, a4). Features a2 and a3 appear in both models with the same weights, and the weight of a1 (6) is less than the weight of a4 (8); reference model 2 therefore matches the model of website A best, and reference model 2 is selected and its feature relations are compared with those of website A. If the reference features contained in reference model 2 are (a2, a3, a4 or a5), then, because the features of the website hit reference features a2, a3, and a4, and a4 and a5 are in an OR relation, reference model 2 is still considered a hit reference model even though reference feature a5 is not hit.
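The selection in this example can be sketched as follows. A reference model is represented as a list of groups, where the features inside one group are OR alternatives and all groups must be satisfied. Scoring a model by the sum of its best weights per group is an assumption; the text only says that the "most relevant" model is chosen.

```python
# Weights output by feature matching for website A (from the example above).
WEIGHTS = {"a1": 6, "a2": 1, "a3": 4, "a4": 8, "a5": 0}

# Each model is a list of groups; features within a group are OR alternatives.
MODELS = {
    "reference model 1": [{"a1"}, {"a2"}, {"a3"}],
    "reference model 2": [{"a2"}, {"a3"}, {"a4", "a5"}],
}


def model_score(groups, weights):
    """Assumed relevance measure: sum of the best weight in every group."""
    return sum(max(weights.get(f, 0) for f in group) for group in groups)


def model_is_hit(groups, hit_features):
    """A model is hit when every group contains at least one hit feature."""
    return all(group & hit_features for group in groups)


def closest_model(models, weights):
    return max(models, key=lambda name: model_score(models[name], weights))


hits = {feature for feature, weight in WEIGHTS.items() if weight > 0}
best = closest_model(MODELS, WEIGHTS)
```

Under these assumptions reference model 1 scores 11 and reference model 2 scores 13, so reference model 2 is selected, and it counts as hit although a5 itself is not.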
Step 310: according to the matched reference model, determine whether the website is a phishing website. If the reference features hit by the website have a corresponding hit reference model in the model database, the website is a phishing website.
In addition, when the feature database is set up, the reference features and their weights are described in a configuration file, and this configuration file is stored in the feature database. Likewise, when the model database is set up, the reference models are described in a configuration file, which is stored in the model database. Because the reference features and reference models are described in configuration files, the feature database and the model database can be updated by updating the configuration files. This gives the system good extensibility, so that it can respond quickly to newly emerging phishing websites.
It should be noted that the present invention does not consider the case in which inclusion relations or redirect relations in the page information of a website cause an endless recursive loop, so that the final page information cannot be determined.
To better understand the method of the invention, the detection method is described below using a phishing website that counterfeits the QQ website as an example. As shown in Figure 7, the method comprises the following steps:
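The text does not specify the format of the configuration files. Purely as an assumed illustration, the sketch below describes reference features and reference models in JSON and loads them into two in-memory dictionaries; the field names and the function `load_databases` are hypothetical.

```python
import json

# Hypothetical configuration; the field names are assumptions.
CONFIG_TEXT = """
{
  "features": [
    {"id": 1, "description": "title similar to the imitated website", "weight": 5},
    {"id": 2, "description": "account number prompt and input box", "weight": 8}
  ],
  "models": [
    {"name": "reference model 1", "groups": [[1], [2]]}
  ]
}
"""


def load_databases(config_text):
    """Build the feature database and model database from a configuration text."""
    config = json.loads(config_text)
    feature_db = {
        f["id"]: (f["description"], f["weight"]) for f in config["features"]
    }
    model_db = {
        m["name"]: [set(group) for group in m["groups"]] for m in config["models"]
    }
    return feature_db, model_db


feature_db, model_db = load_databases(CONFIG_TEXT)
```

Updating the databases for a new kind of phishing website then amounts to editing the configuration text and loading it again.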
Step 701: the user enters the URL of website A into the detection device;
Step 702: after receiving the URL of website A, the detection device sends a GET request to the Internet to request the page information of website A;
Step 703: the detection device receives the response message returned from the Internet, which contains the page information of website A;
Step 704: the detection device analyzes the redirect relation of website A according to this page information;
Step 705: the analysis shows that website A has a redirect through a meta tag, and the detection device jumps to page B according to the meta tag;
Step 706: redirect analysis continues on page B; page B has no redirect.
Step 707: the inclusion relation of page B is analyzed; the analysis shows that page B has one sub-page, B1;
Step 708: the URL of sub-page B1 is calculated and the page information of sub-page B1 is obtained. Recursive redirect analysis and inclusion analysis of sub-page B1 show that sub-page B1 has no redirect and no further sub-pages.
Step 709: the final page information, made up of the page information of sub-page B1 and the page information of page B, is parsed into a DOM tree structure, and a tag index is built.
Step 710: the features of the final page information are extracted and matched against the reference features in the feature database, and the reference features that are hit are output.
To better understand this step, the feature matching process is described in detail below with reference to the drawings. Figure 8 is a schematic diagram of page B; as the figure shows, page B and its sub-page contain 5 features:
1) the same title as the QQ website, "QQ invites you to add";
2) a "please enter your QQ number" text similar to that of the QQ website, followed by a QQ number input box;
3) a "please enter your password" text similar to that of the QQ website, followed by a password input box;
4) the same QQ penguin logo as the QQ website;
5) a mineral water advertisement.
Suppose the reference features in the database include:
1. a title similar or identical to that of the QQ website, with a weight of 5;
2. a "please enter your QQ number" text similar or identical to that of the QQ website, followed by a QQ number input box, with a weight of 8;
3. a "please enter your password" text similar or identical to that of the QQ website, followed by a password input box, with a weight of 8;
4. a QQ penguin logo similar or identical to that of the QQ website, with a weight of 4;
5. a QQ game coin logo similar or identical to that of the QQ website, with a weight of 4;
6. a QQ game avatar logo similar or identical to that of the QQ website, with a weight of 4.
The reference features that are hit are 1, 2, 3, and 4.
Step 711: the reference models in the model database are traversed and reference model 1 is found, whose set of reference features is (1, 2, 3, 4 or 5 or 6). The final page information matches reference model 1, so website A is determined to be a phishing website.
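Steps 710 and 711 of this example can be traced in a few lines of Python. The feature keys such as `"penguin_logo"` and the functions `hit_reference_features` and `is_phishing` are hypothetical labels; the ids, weights, and the model (1, 2, 3, 4 or 5 or 6) come from the example.

```python
# Reference feature id -> weight, as listed in the example.
REFERENCE_FEATURES = {1: 5, 2: 8, 3: 8, 4: 4, 5: 4, 6: 4}

# Features found on page B and its sub-page, with the reference feature each
# one matches; the mineral water advertisement matches no reference feature.
PAGE_FEATURES = {
    "title": 1,
    "qq_number_box": 2,
    "password_box": 3,
    "penguin_logo": 4,
    "mineral_water_ad": None,
}

# Reference model 1: features 1, 2, and 3, plus any one of 4, 5, or 6.
REFERENCE_MODEL_1 = [{1}, {2}, {3}, {4, 5, 6}]


def hit_reference_features(page_features):
    """Step 710: return the ids of the reference features that are hit."""
    return sorted(fid for fid in page_features.values() if fid in REFERENCE_FEATURES)


def is_phishing(hit_ids, model):
    """Step 711: the model is hit when every group has at least one hit id."""
    hit_ids = set(hit_ids)
    return all(group & hit_ids for group in model)


hits = hit_reference_features(PAGE_FEATURES)
verdict = is_phishing(hits, REFERENCE_MODEL_1)
```

The hit list is 1, 2, 3, 4, the OR group is satisfied by feature 4, and the verdict for website A is therefore positive.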
The present invention evaluates the page information of a website as a whole and determines from the result whether the website is a phishing website. This avoids the false positives that arise in existing detection methods because the page information of the website is ignored, and improves the accuracy of phishing website detection.
The present invention also provides a detection device for detecting phishing websites, corresponding to the method of the invention. As shown in Figure 9, it comprises:
Page acquisition module 910, configured to obtain the final page information of a website according to a received URL request;
The page information comprises a message header and a message body. The page acquisition module 910 is further configured to analyze, according to the message header and message body of the page information of the website, whether the page information has a redirect relation and/or an inclusion relation.
When the analysis shows that the page information has a redirect relation, the URL of the destination website is determined; the page acquisition module 910 obtains the page information of the destination website and obtains the final page information from it. The destination website is the website reached after following the redirect relation of the website's page information.
When the analysis shows that the page information has an inclusion relation, the page acquisition module 910 determines the URL of each sub-page in the inclusion relation, obtains the page information of the sub-pages, and obtains the final page information from the page information of the sub-pages.
Page information processing module 920, connected to the page acquisition module 910 and configured to parse the final page information into a DOM tree structure and to build an index in the DOM tree according to the tags of the final page information;
The page information is described by an HTML document, which contains multiple types of page information and the tags that make up the website; the format and content of the tags have been explained above and are not repeated here.
The page information processing module 920 further comprises:
Parsing submodule 921, configured to parse the HTML document containing the page information into a DOM tree; the DOM tree comprises multiple kinds of nodes, each corresponding to one type of page information in the HTML document;
Index building submodule 922, connected to the parsing submodule 921 and configured to build an index according to the tags of each type of page information in the HTML document.
Feature matching module 930, connected to the page information processing module 920 and configured to extract, according to the built index, the features of the final page information in the DOM tree, traverse the preset feature database, match the features of the final page information against the reference features in the feature database, and output the reference features that are hit;
The feature matching module 930 is further configured to output the weights of the hit reference features; the model matching module 940 looks up the hit reference model in the model database according to the reference features and their corresponding weights.
Model matching module 940, connected to the feature matching module 930 and configured to traverse the preset model database, look up in the model database the reference model that is hit according to the hit reference features, and determine whether the website is a phishing website.
The id values of the hit reference features are placed in the hash table. Each reference feature in the feature database corresponds to one id value, and the hash table maps the id values of the reference features to the reference models. The model matching module 940 queries the hash table to obtain the reference features corresponding to the id values.
The device also comprises:
Feature database building module 950, connected to the feature matching module 930 and configured to set up the phishing website feature database. Specifically, the feature database building module 950 extracts the features of known phishing websites, determines the weights corresponding to these features according to the frequency with which the features occur in the phishing websites and the proportion they account for in the known phishing websites, and stores the features and their corresponding weights in the feature database, thereby setting up the feature database. The features stored in the feature database are the reference features, and the corresponding weights are the weights of the reference features.
The device also comprises:
Model database building module 960, connected to the model matching module 940 and the feature database building module 950 and configured to classify the reference features by phishing website type, form one reference model from the reference features of each type, and set up the model database of phishing websites from the reference models.
The present invention evaluates the page information of a website as a whole and determines from the result whether the website is a phishing website. This avoids the false positives that arise in existing detection methods because the page information of the website is ignored, and improves the accuracy of phishing website detection.
From the description of the above embodiments, those skilled in the art will clearly understand that the present invention can be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. On this understanding, the essence of the technical solution of the invention, or the part that contributes over the prior art, can be embodied as a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the invention.
The above discloses only several specific embodiments of the present invention; however, the invention is not limited to them, and any variation that those skilled in the art can conceive of shall fall within the protection scope of the invention.

Claims (14)

1. A method for detecting a phishing website, characterized by comprising the following steps:
a. obtaining the final page information of a website according to a received uniform resource locator (URL) request;
b. parsing the final page information into a Document Object Model (DOM) tree structure, and building an index in the DOM tree according to the tags of the final page information;
c. extracting, according to the built index, the features of the final page information in the DOM tree, traversing a preset feature database, matching the features of the final page information against the reference features in the feature database, and outputting the reference features that are hit;
d. traversing a preset model database, looking up in the model database the reference model that is hit according to the hit reference features, and determining whether the website is a phishing website.
2. the method for claim 1 is characterized in that, page info comprises information header and imformosome,
Said step a further comprises:
According to the information header and the imformosome of the page info of said website, analyze said page info and have the redirect relation;
Confirm the URL of purpose website, obtain the page info of said purpose website, obtain said final page surface information according to the page info of said purpose website, said purpose website is for concerning the website after the redirect according to the redirect that said Website page information had.
3. method as claimed in claim 2 is characterized in that, said analysis page info has the redirect relation, comprises that analyzing said website has the redirect through meta label or 30X message.
4. according to claim 1 or claim 2 method is characterized in that said step a further comprises:
According to the information header and the imformosome of the page info of said website, analyze in the said page info and have inclusion relation;
Confirm the URL of each subpage frame in the said inclusion relation, obtain the page info of said subpage frame, obtain said final page surface information according to the page info of said subpage frame.
5. method as claimed in claim 4 is characterized in that, said page info is described through the HTML html document, and said html document comprises the label of forming said website,
Said analysis page info has inclusion relation, comprises analyzing having the subpage frame that comprises through iframe or frameset label.
6. the method for claim 1 is characterized in that, said page info is described through html document, and said html document comprises the label of forming said website, and said html document comprises polytype page info,
Said step b further comprises:
The html document that will comprise said page info resolves to dom tree, and said dom tree comprises multiple node, one type page info in the corresponding said html document of each node difference;
Label according to the page info of each type in the said html document is set up index.
7. the method for claim 1 is characterized in that, before said step c, also comprises:
Set up said property data base, this step specifically comprises:
Extract the characteristic of known fishing website;
According to the characteristic frequency that in said fishing website, occurs and the proportion that accounts for of said known fishing website, confirm the corresponding weights of characteristic of said known fishing website;
Store the characteristic of said known fishing website and the weights of said correspondence into said property data base; Set up said property data base; The said said reference feature that is characterized as that stores in the said property data base, the weights of said correspondence are the weights of said reference feature.
8. the method for claim 1 is characterized in that, before said steps d, also comprises:
Said reference feature is classified according to the fishing website type;
The said reference feature of each type is formed a said reference model;
Set up the model database of said fishing website according to said reference model.
9. the checkout gear of a fishing website is characterized in that, comprising:
Page acquisition module is used for obtaining the final page surface information of website according to the URL request that receives;
The page info processing module is connected with said page acquisition module, is used for said final page surface information is resolved to the dom tree structure, and in said dom tree, sets up index according to the label of said final page surface information;
Characteristic matching module; Be connected with said page info processing module; Be used for index, extract the characteristic of final page surface information described in the said dom tree, the property data base that traversal is provided with in advance according to said foundation; The characteristic of said final page surface information and the reference feature in the said property data base are mated the said reference feature that output is hit;
The Model Matching module is connected with said characteristic matching module, is used to travel through the model database that is provided with in advance, at the reference model that said model data library lookup hits, confirms whether said website is fishing website according to said reference feature of hitting.
10. device as claimed in claim 9; It is characterized in that page info comprises information header and imformosome, said page acquisition module; Also be used for information header and imformosome according to the page info of said website; Analyze said page info and have the redirect relation, confirm the URL of purpose website, obtain the page info of said purpose website; Page info according to said purpose website obtains said final page surface information, and said purpose website is for concerning the website after the redirect according to the redirect that said Website page information had.
11. The device according to claim 9 or 10, characterized in that the page acquisition module is further configured to: determine, by analyzing the information header and information body of the page information of the website, that the page information has an inclusion relation; determine the URL of each subpage in the inclusion relation; obtain the page information of the subpages; and obtain the final page information according to the page information of the subpages.
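The inclusion relation of claim 11 can be sketched as follows, on the assumption that it refers to subpages embedded through `frame` and `iframe` elements; the claim does not say which elements are meant. The helper names and the `fetch_body` callable are hypothetical, and simply concatenating the bodies is only one possible way to form the final page information.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameCollector(HTMLParser):
    """Collects the absolute URLs of subpages embedded via <frame>/<iframe>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.sub_urls = []

    def handle_starttag(self, tag, attrs):
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:
                self.sub_urls.append(urljoin(self.base_url, src))


def subpage_urls(base_url, body):
    """Return the URL of each subpage included by the page body."""
    collector = FrameCollector(base_url)
    collector.feed(body)
    collector.close()
    return collector.sub_urls


def assemble_final_page(base_url, body, fetch_body):
    """Append each subpage's body (via fetch_body(url) -> str) to the main body."""
    parts = [body]
    for url in subpage_urls(base_url, body):
        parts.append(fetch_body(url))
    return "\n".join(parts)
```

Only one level of inclusion is resolved here; nested frames would need the same step applied recursively with a depth limit.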
12. The device according to claim 9, characterized in that the page information is described by an HTML document, the HTML document comprises the tags that make up the website, and the HTML document comprises multiple types of page information;
The page information processing module comprises:
A parsing submodule, configured to parse the HTML document containing the page information into a DOM tree, the DOM tree comprising multiple kinds of nodes, each node corresponding to one type of page information in the HTML document;
An index building submodule, connected to the parsing submodule and configured to build an index according to the tags of each type of page information in the HTML document.
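The two submodules of claim 12 can be illustrated with a small sketch built on Python's standard `html.parser`. It builds a simplified DOM-like tree and, while parsing, an index from tag name to the nodes carrying that tag. The `Node` class, the `VOID_TAGS` set, and the dictionary-based index are assumptions of this sketch, not the patent's data structures, and the tree is far less tolerant of malformed markup than a browser's parser.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Elements that never have a closing tag (partial list, enough for the sketch).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    def __init__(self, tag, attrs, parent=None):
        self.tag = tag
        self.attrs = dict(attrs)
        self.parent = parent
        self.children = []
        self.text = ""


class DomBuilder(HTMLParser):
    """Parses HTML into a simplified tree and indexes every node by tag name."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document", [])
        self.current = self.root
        self.index = defaultdict(list)  # tag name -> nodes in document order

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        self.index[tag].append(node)
        if tag not in VOID_TAGS:
            self.current = node  # descend into the new element

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data


def parse_and_index(html):
    """Return (root, index) for the given HTML document."""
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    return builder.root, builder.index
```

With such an index, feature extraction can look up, for example, every `form` node and inspect its `action` attribute without walking the whole tree again.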
13. The device according to claim 9, characterized in that it further comprises:
A feature database building module, connected to the feature matching module and configured to build the phishing website feature database; the feature database building module is specifically configured to extract the features of known phishing websites, determine the weight corresponding to each feature of the known phishing websites according to the frequency with which the feature occurs in the phishing websites and the proportion it accounts for, and store the features of the known phishing websites and the corresponding weights in the feature database, thereby building the feature database; the features stored in the feature database are the reference features, and the corresponding weights are the weights of the reference features.
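Claim 13 derives each weight from a feature's frequency and proportion but gives no formula. The sketch below is therefore only one possible reading: it takes frequency as the total number of occurrences across the sample of known phishing sites, proportion as the share of sample sites containing the feature, and, as a loudly assumed combination rule, multiplies the feature's share of all occurrences by that proportion. The function name and the record layout are hypothetical.

```python
from collections import Counter


def build_feature_db(samples):
    """Build a feature database from known phishing sites.

    samples: one list of extracted feature names per known phishing site
             (a feature may appear several times in one list).
    Returns {feature: {"frequency", "proportion", "weight"}}.
    """
    if not samples:
        return {}
    occurrences = Counter()  # total occurrences across all sites
    site_counts = Counter()  # number of sites containing the feature
    for features in samples:
        occurrences.update(features)
        site_counts.update(set(features))
    total = sum(occurrences.values())
    db = {}
    for name, freq in occurrences.items():
        proportion = site_counts[name] / len(samples)
        db[name] = {
            "frequency": freq,
            "proportion": proportion,
            # Assumed rule: share of all occurrences times share of sites.
            "weight": (freq / total) * proportion,
        }
    return db
```

Any monotone combination of the two quantities would fit the claim wording equally well; the product is chosen here only because it keeps weights between 0 and 1.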
14. The device according to claim 13, characterized in that it further comprises a model database building module, connected to the model matching module and to the feature database building module and configured to classify the reference features according to phishing website type, form the reference features of each type into one reference model, and build the model database of the phishing website according to the reference models.
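The grouping step of claim 14 amounts to partitioning the reference features by phishing website type. A minimal sketch, assuming each reference feature arrives as a `(name, phishing_type, weight)` tuple and that a reference model is simply a mapping from feature name to weight (both shapes are hypothetical):

```python
from collections import defaultdict


def build_model_db(reference_features):
    """Group reference features into one reference model per phishing type.

    reference_features: iterable of (name, phishing_type, weight) tuples.
    Returns {phishing_type: {name: weight}}.
    """
    models = defaultdict(dict)
    for name, phishing_type, weight in reference_features:
        models[phishing_type][name] = weight
    return dict(models)
```

How a feature is assigned its type in the first place (manually or by clustering known phishing sites) is not stated in the claim and is left outside the sketch.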
CN2009101065591A 2009-04-14 2009-04-14 Detecting method and a device for fishing website Active CN101534306B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009101065591A CN101534306B (en) 2009-04-14 2009-04-14 Detecting method and a device for fishing website

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009101065591A CN101534306B (en) 2009-04-14 2009-04-14 Detecting method and a device for fishing website

Publications (2)

Publication Number Publication Date
CN101534306A CN101534306A (en) 2009-09-16
CN101534306B true CN101534306B (en) 2012-01-11

Family

ID=41104694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009101065591A Active CN101534306B (en) 2009-04-14 2009-04-14 Detecting method and a device for fishing website

Country Status (1)

Country Link
CN (1) CN101534306B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102737183A (en) * 2012-06-12 2012-10-17 腾讯科技(深圳)有限公司 Method and device for webpage safety access

Families Citing this family (64)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101510887B (en) * 2009-03-27 2012-01-25 腾讯科技(深圳)有限公司 Method and device for identifying website
CN101820366B (en) * 2010-01-27 2012-09-05 南京邮电大学 Pre-fetching-based fishing web page detection method
CN101968813B (en) * 2010-10-25 2012-05-23 华北电力大学 Method for detecting counterfeit webpage
CN102063484B (en) * 2010-12-29 2013-04-10 北京安天电子设备有限公司 Discovery method and device of third-party WEB application program
CN102082792A (en) 2010-12-31 2011-06-01 成都市华为赛门铁克科技有限公司 Phishing webpage detection method and device
CN102902686A (en) * 2011-07-27 2013-01-30 腾讯科技(深圳)有限公司 Web page detection method and system
CN102316099B (en) * 2011-07-28 2014-10-22 中国科学院计算机网络信息中心 Network fishing detection method and apparatus thereof
CN102957664B (en) * 2011-08-17 2015-10-14 阿里巴巴集团控股有限公司 A kind of method and device identifying fishing website
CN102314494B (en) * 2011-08-24 2014-04-02 百度在线网络技术(北京)有限公司 Method and equipment for processing webpage contents
CN102340428B (en) * 2011-09-29 2014-01-15 哈尔滨安天科技股份有限公司 URL (Uniform Resource Locator) detection and interception method and system based on network packet loss
CN103106576A (en) * 2011-11-15 2013-05-15 腾讯科技(深圳)有限公司 Payment method and payment system based on client side and payment client side
CN103136251A (en) * 2011-11-29 2013-06-05 星云融创(北京)科技有限公司 Method and device of webpage identification
CN102523210B (en) * 2011-12-06 2014-11-05 中国科学院计算机网络信息中心 Phishing website detection method and device
CN103179095B (en) * 2011-12-22 2016-03-30 阿里巴巴集团控股有限公司 A kind of method and client terminal device detecting fishing website
CN102571783A (en) * 2011-12-29 2012-07-11 北京神州绿盟信息安全科技股份有限公司 Phishing website detection method, device and system as well as website
CN104077353B (en) * 2011-12-30 2017-08-25 北京奇虎科技有限公司 A kind of method and device of detecting black chain
CN102436563B (en) * 2011-12-30 2014-07-09 奇智软件(北京)有限公司 Method and device for detecting page tampering
CN102647408A (en) * 2012-02-27 2012-08-22 珠海市君天电子科技有限公司 Method for judging phishing website based on content analysis
CN102638448A (en) * 2012-02-27 2012-08-15 珠海市君天电子科技有限公司 Method for judging phishing websites based on non-content analysis
CN102663291B (en) * 2012-03-23 2015-02-25 北京奇虎科技有限公司 Information prompting method and information prompting device for e-mails
CN102682097A (en) * 2012-04-27 2012-09-19 北京神州绿盟信息安全科技股份有限公司 Method and equipment for detecting secrete links in web page
CN103428186A (en) * 2012-05-24 2013-12-04 中国移动通信集团公司 Method and device for detecting phishing website
CN103457924B (en) * 2012-06-05 2016-08-03 珠海市君天电子科技有限公司 Detect the method and system of coming into force property type fishing website point-to-point, instantaneous
CN102724186B (en) * 2012-06-06 2015-10-21 珠海市君天电子科技有限公司 Phishing website detection system and detection method
CN102710645B (en) * 2012-06-06 2015-10-21 珠海市君天电子科技有限公司 Phishing website detection method and detection system thereof
CN102710646B (en) * 2012-06-06 2016-08-03 珠海市君天电子科技有限公司 Method and system for collecting phishing websites
CN103546446B (en) 2012-07-17 2015-03-25 腾讯科技(深圳)有限公司 Phishing website detection method, device and terminal
CN102769632A (en) * 2012-07-30 2012-11-07 珠海市君天电子科技有限公司 Method and system for grading detection and prompt of fishing website
CN103685158A (en) * 2012-09-04 2014-03-26 珠海市君天电子科技有限公司 accurate collection method and system based on phishing website propagation
CN103685157A (en) * 2012-09-04 2014-03-26 珠海市君天电子科技有限公司 Method and system for collecting phishing websites based on payment
CN102902792B (en) * 2012-09-29 2015-10-21 北京奇虎科技有限公司 list page identification system and method
CN103812673A (en) * 2012-11-07 2014-05-21 江苏仕德伟网络科技股份有限公司 Method for automatically recognizing multiple IP changes in website
CN102945286A (en) * 2012-11-27 2013-02-27 深圳中兴网信科技有限公司 Data index device and data index method
CN104125258B (en) * 2013-04-28 2016-03-30 腾讯科技(深圳)有限公司 Method for page jump, terminal, server and system
CN104144146B (en) * 2013-05-10 2017-11-03 中国电信股份有限公司 A kind of method and system of access website
CN103279455B (en) * 2013-06-28 2016-06-01 中国农业银行股份有限公司 The pattern treatment process of electrical form and device
CN103401835A (en) * 2013-07-01 2013-11-20 北京奇虎科技有限公司 Method and device for presenting safety detection results of microblog page
CN103442014A (en) * 2013-09-03 2013-12-11 中国科学院信息工程研究所 Method and system for automatic detection of suspected counterfeit websites
CN103577547B (en) * 2013-10-12 2017-11-10 优视科技有限公司 Webpage type identification method and device
CN103685289B (en) * 2013-12-19 2017-02-08 北京奇虎科技有限公司 Method and device for detecting phishing website
CN103685307B (en) * 2013-12-25 2017-08-11 北京奇虎科技有限公司 The method and system of feature based storehouse detection fishing fraud webpage, client, server
CN103685308B (en) * 2013-12-25 2017-04-26 北京奇虎科技有限公司 Detection method and system of phishing web pages, client and server
CN104852883A (en) * 2014-02-14 2015-08-19 腾讯科技(深圳)有限公司 Method and system for protecting safety of account information
CN104980404B (en) * 2014-04-10 2020-04-14 腾讯科技(深圳)有限公司 Method and system for protecting account information security
CN104503962B (en) * 2014-06-18 2017-11-03 北京邮电大学 A kind of dark chain detection method of webpage
CN104301299B (en) * 2014-08-04 2018-10-23 北京奇虎科技有限公司 A kind of method and apparatus detecting the website that there is fishing risk of fraud
CN104301300B (en) * 2014-08-04 2018-10-30 北京奇虎科技有限公司 A kind of method, client and the system of detection phishing scam risk
CN104158828B (en) * 2014-09-05 2018-05-18 北京奇虎科技有限公司 The method and system of suspicious fishing webpage are identified based on cloud content rule base
CN104268269A (en) * 2014-10-13 2015-01-07 宁波公众信息产业有限公司 Database operating method
CN104239582A (en) * 2014-10-14 2014-12-24 北京奇虎科技有限公司 Method and device for identifying phishing webpage based on feature vector model
CN104965901A (en) * 2015-06-30 2015-10-07 北京奇虎科技有限公司 Method and apparatus for grabbing content of target page
CN105025017A (en) * 2015-07-03 2015-11-04 汉柏科技有限公司 Horse hanging prevention method based on firewall, and firewall
CN105069169B (en) * 2015-08-31 2019-03-05 国家计算机网络与信息安全管理中心 A kind of detection method and device of website mirroring
CN107491453B (en) * 2016-06-13 2022-09-02 北京搜狗科技发展有限公司 Method and device for identifying cheating web pages
CN107146082B (en) * 2017-05-27 2021-01-29 北京小米移动软件有限公司 Transaction record information acquisition method and device and computer readable storage medium
CN107358208B (en) * 2017-07-14 2018-07-13 北京神州泰岳软件股份有限公司 A kind of PDF document structured message extracting method and device
CN108306878A (en) * 2018-01-30 2018-07-20 平安科技(深圳)有限公司 Detection method for phishing site, device, computer equipment and storage medium
CN108304584A (en) * 2018-03-06 2018-07-20 百度在线网络技术(北京)有限公司 Illegal page detection method, apparatus, intruding detection system and storage medium
CN108650249B (en) * 2018-04-26 2021-07-27 平安科技(深圳)有限公司 POC attack detection method and device, computer equipment and storage medium
CN108600247A (en) * 2018-05-02 2018-09-28 尚谷科技(天津)有限公司 A kind of website fishing camouflage recognition methods
CN109302383B (en) * 2018-08-31 2022-04-29 平安科技(深圳)有限公司 URL monitoring method and device
CN109450844B (en) * 2018-09-18 2022-05-10 华为云计算技术有限公司 Method and device for triggering vulnerability detection
CN109284613B (en) * 2018-09-30 2020-09-22 北京神州绿盟信息安全科技股份有限公司 Method, device, equipment and storage medium for identification detection and counterfeit site detection
CN110413930B (en) * 2019-07-31 2022-03-15 杭州安恒信息技术股份有限公司 Data analysis method, device and equipment and readable storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101183415A (en) * 2007-12-19 2008-05-21 腾讯科技(深圳)有限公司 Method and device for preventing sensitive information from leakage
CN101325495A (en) * 2008-07-10 2008-12-17 华为技术有限公司 Method, apparatus and system for detecting hacker server

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
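JP Laid-Open Publication No. 2007-334759 A, 2007.12.27
Liang Xuesong. Research on browser-based phishing website detection technology. Information Security and Communications Privacy. 2007, (No. 11), pp. 53-55. *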

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102737183A (en) * 2012-06-12 2012-10-17 腾讯科技(深圳)有限公司 Method and device for webpage safety access
CN102737183B (en) * 2012-06-12 2014-08-13 腾讯科技(深圳)有限公司 Method and device for webpage safety access

Also Published As

Publication number Publication date
CN101534306A (en) 2009-09-16

Similar Documents

Publication Publication Date Title
CN101534306B (en) Detecting method and a device for fishing website
CN104766014B (en) For detecting the method and system of malice network address
CN104054055B (en) The system and method that networked devices are managed based on association between identifier
CN102708174B (en) Method and device for displaying rich media information in browser
CN103218431B (en) A kind ofly can identify the system that info web gathers automatically
CN102473190B (en) Keyword assignment to a web page
CN102436564A (en) Method and device for identifying falsified webpage
KR102026956B1 (en) System for monitoring digital works distribution
CN108566399B (en) Phishing website identification method and system
US10311120B2 (en) Method and apparatus for identifying webpage type
CN104021172A (en) Advertisement filtering method and advertisement filtering device
TW200842608A (en) System and method for related information search and presentation from user interface content
CN103210387B (en) Conjunctive word calling mechanism, information processor, conjunctive word register method and conjunctive word register system
CN106446115A (en) Mobile Internet user classification method and device
JP2010510601A (en) Method for recommending product information and system for executing the method
JP6247745B2 (en) Information processing apparatus, information processing method, and information processing program
CN105763543A (en) Phishing site identification method and device
CN101114284B (en) Method for displaying web page content relevant information and system
CN105718559A (en) Method and device for finding transforming relationship of form pages and target pages
US20180336279A1 (en) Computer-implemented methods of website analysis
CN104090923A (en) Method and device for displaying rich media information in browser
Thao et al. Hunting brand domain forgery: a scalable classification for homograph attack
CN104036190A (en) Method and device for detecting page tampering
Tabassum et al. Xiaomi invades the smartphone market in India
CN113505317A (en) Illegal advertisement identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant