CN106685936A - Webpage defacement detection method and apparatus - Google Patents
- Publication number: CN106685936A
- Application number: CN201611158763.4A
- Authority: CN (China)
- Prior art keywords: webpage, text, detected, website, eigenvector
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
- H04L63/1483—Countermeasures against malicious traffic service impersonation, e.g. phishing, pharming or web spoofing
Abstract
The invention discloses a webpage defacement detection method. The method includes the following steps: obtaining a text characteristic vector of a webpage to be detected and a text characteristic vector of a website to which the webpage to be detected belongs; calculating text similarity between the webpage to be detected and the website on the basis of the obtained text characteristic vector of the webpage to be detected and the text characteristic vector of the website; determining whether the text similarity is smaller than a preset threshold value; and if the text similarity is smaller than the preset threshold value, determining that the webpage to be detected is a defaced webpage. The invention also discloses a webpage defacement detection apparatus. The webpage defacement detection accuracy and efficiency can be improved.
Description
Technical field
The present invention relates to the technical field of network security, and more particularly to a method and apparatus for detecting webpage tampering.
Background art
Webpage tampering is a malicious act carried out by an attacker after a website has been compromised. The attacker typically creates new webpages and writes malicious content into them, or rewrites part or all of an existing webpage into malicious content. Webpage tampering not only disrupts the normal operation of the website but also spreads a large amount of invalid information to the public, causing serious harm. At present there are two methods for detecting webpage tampering:
1) Blacklist keyword detection: a keyword blacklist of malicious content is built, and a webpage is judged to be tampered with or not by checking whether it contains any keyword on the blacklist. This method may produce false negatives because the keywords on the blacklist are not comprehensive enough, and it may also produce false positives. For example, when a government public security department publishes a bulletin on cracking down on an illegal activity, the bulletin contains illegal keywords; if those keywords are on the blacklist, a false positive results, because the webpage is in fact a normal webpage.
2) Webpage digital fingerprint comparison: the digital fingerprint (for example, the MD5 value) of every webpage of the website is computed in advance and stored in a fingerprint database; the fingerprint of each webpage is then recomputed at intervals, and if the earlier and later fingerprints of the same webpage differ, the webpage is taken to have been tampered with. This method requires the fingerprint database to be built before the website is tampered with, and the database must also be updated after every legitimate modification or addition of webpage files, which is cumbersome and inefficient. In addition, such a detection system has to be deployed locally on the website server by the website administrator, so it cannot be applied to large-scale detection across the Internet.
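The drawback of the fingerprint comparison described above can be illustrated with a minimal Python sketch. The function names `fingerprint` and `changed_pages` and the dictionary layout are assumptions made for illustration only; they do not come from any existing detection system.

```python
import hashlib


def fingerprint(page_bytes: bytes) -> str:
    """Return the MD5 hex digest used as the page's digital fingerprint."""
    return hashlib.md5(page_bytes).hexdigest()


def changed_pages(baseline: dict, current_pages: dict) -> list:
    """List the URLs whose current fingerprint is missing from, or differs
    from, the baseline fingerprint database (hypothetical layout: URL -> digest)."""
    return sorted(
        url
        for url, body in current_pages.items()
        if baseline.get(url) != fingerprint(body)
    )
```

Because any byte-level difference changes the digest, a legitimate edit or a newly added page is flagged in exactly the same way as a malicious one, which is why the fingerprint database has to be rebuilt after every normal update.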
Summary of the invention
The main object of the present invention is to propose a method and apparatus for detecting webpage tampering, with the aim of improving the accuracy and efficiency of webpage tampering detection.
To achieve the above object, the present invention provides a method for detecting webpage tampering, the method comprising the following steps:
obtaining a text feature vector of a webpage to be detected and a text feature vector of the website to which the webpage to be detected belongs;
calculating the text similarity between the webpage to be detected and the website according to the obtained text feature vector of the webpage to be detected and the text feature vector of the website;
determining whether the text similarity is less than a preset threshold;
if so, determining that the webpage to be detected is a tampered webpage.
Optionally, the step of obtaining the text feature vector of the webpage to be detected and the text feature vector of the website to which the webpage to be detected belongs comprises:
obtaining a text feature set of the webpage to be detected and a text feature set of the website to which the webpage to be detected belongs, wherein the text feature set of the webpage to be detected and the text feature set of the website contain the same keywords;
performing calculation according to the word frequency and weight of the keywords in the text feature set of the webpage to be detected, to obtain the text feature vector of the webpage to be detected;
performing calculation according to the word frequency and weight of the keywords in the text feature set of the website, to obtain the text feature vector of the website.
Optionally, the step of obtaining the text feature set of the webpage to be detected and the text feature set of the website to which the webpage to be detected belongs comprises:
obtaining the text of the website to which the webpage to be detected belongs;
performing Chinese word segmentation and stop-word removal on the obtained text;
extracting a number of keywords from the processing result, to obtain the text feature set of the website;
using the text feature set of the website as the text feature set of the webpage to be detected.
Optionally, the step of calculating the text similarity between the webpage to be detected and the website according to the obtained text feature vector of the webpage to be detected and the text feature vector of the website comprises:
calculating the cosine of the angle between the text feature vector of the webpage to be detected and the text feature vector of the website;
using the calculation result as the text similarity between the webpage to be detected and the website.
Optionally, before the step of obtaining the text feature vector of the webpage to be detected and the text feature vector of the website to which the webpage to be detected belongs, the method further comprises:
crawling preset webpages to be detected at scheduled intervals by means of a crawler program;
or, when a network access request is detected, taking the webpage corresponding to the network access request as the webpage to be detected.
In addition, to achieve the above object, the present invention also provides an apparatus for detecting webpage tampering, the apparatus comprising:
an acquisition module, configured to obtain a text feature vector of a webpage to be detected and a text feature vector of the website to which the webpage to be detected belongs;
a computing module, configured to calculate the text similarity between the webpage to be detected and the website according to the obtained text feature vector of the webpage to be detected and the text feature vector of the website;
a judging module, configured to determine whether the text similarity is less than a preset threshold and, if so, to determine that the webpage to be detected is a tampered webpage.
Optionally, the acquisition module comprises:
an acquiring unit, configured to obtain a text feature set of the webpage to be detected and a text feature set of the website to which the webpage to be detected belongs, wherein the text feature set of the webpage to be detected and the text feature set of the website contain the same keywords;
a first computing unit, configured to perform calculation according to the word frequency and weight of the keywords in the text feature set of the webpage to be detected, to obtain the text feature vector of the webpage to be detected;
a second computing unit, configured to perform calculation according to the word frequency and weight of the keywords in the text feature set of the website, to obtain the text feature vector of the website.
Optionally, the acquiring unit is further configured to:
obtain the text of the website to which the webpage to be detected belongs;
perform Chinese word segmentation and stop-word removal on the obtained text;
extract a number of keywords from the processing result, to obtain the text feature set of the website;
use the text feature set of the website as the text feature set of the webpage to be detected.
Optionally, the computing module is further configured to:
calculate the cosine of the angle between the text feature vector of the webpage to be detected and the text feature vector of the website;
use the calculation result as the text similarity between the webpage to be detected and the website.
Optionally, the apparatus further comprises:
a crawling module, configured to crawl preset webpages to be detected at scheduled intervals by means of a crawler program;
the acquisition module is further configured to, when a network access request is detected, take the webpage corresponding to the network access request as the webpage to be detected.
The present invention obtains the text feature vector of a webpage to be detected and the text feature vector of the website to which the webpage belongs; calculates the text similarity between the webpage to be detected and the website according to the two obtained text feature vectors; determines whether the text similarity is less than a preset threshold; and, if so, determines that the webpage to be detected is a tampered webpage. The present invention detects whether a webpage has been tampered with by means of text similarity. Compared with existing blacklist keyword detection, it requires no collection of blacklist keywords and produces fewer false positives and false negatives, which improves the accuracy of webpage tampering detection. Compared with existing webpage digital fingerprint comparison, it requires no local deployment and supports remote, large-scale detection, which improves the efficiency of webpage tampering detection.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a first embodiment of the webpage tampering detection method of the present invention;
Fig. 2 is a schematic diagram of the detailed steps of step S100 in Fig. 1;
Fig. 3 is a schematic diagram of the detailed steps of step S110 in Fig. 2;
Fig. 4 is a schematic flowchart of a second embodiment of the webpage tampering detection method of the present invention;
Fig. 5 is a schematic diagram of the angle between the text feature vector Dk of a webpage and the text feature vector D0 of the website to which the webpage belongs;
Fig. 6 is a schematic flowchart of a third embodiment of the webpage tampering detection method of the present invention;
Fig. 7 is a functional block diagram of a first embodiment of the webpage tampering detection apparatus of the present invention;
Fig. 8 is a detailed functional block diagram of the acquisition module in Fig. 7;
Fig. 9 is a functional block diagram of a second embodiment of the webpage tampering detection apparatus of the present invention.
The realization of the objects, the functional features and the advantages of the present invention are further described below with reference to the drawings and in conjunction with the embodiments.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein serve only to explain the present invention and are not intended to limit it.
The present invention provides a method for detecting webpage tampering.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of the first embodiment of the webpage tampering detection method of the present invention. The method comprises the following steps:
Step S100, obtaining a text feature vector of a webpage to be detected and a text feature vector of the website to which the webpage to be detected belongs;
In this embodiment, webpage tampering detection can be performed by an application firewall placed between the Web browser and the Web server. The application firewall obtains the text feature vector of the webpage to be detected and the text feature vector of the website to which the webpage belongs, thereby building a vector space model.
In the vector space model, a text (Document, denoted D) refers to any machine-readable record, and a feature item (Term, denoted t) refers to a basic language unit that occurs in the text D and can represent the content of the text, consisting mainly of words or phrases. A text can be represented by its set of feature items as D(T1, T2, ..., Tn), where Tk is a feature item and 1 <= k <= n. For example, if a document contains the four feature items a, b, c and d, the document can be represented as D(a, b, c, d).
Further, referring to Fig. 2, Fig. 2 is a schematic diagram of the detailed steps of step S100 in Fig. 1. As one implementation, step S100 may comprise:
Step S110, obtaining a text feature set of the webpage to be detected and a text feature set of the website to which the webpage to be detected belongs, wherein the text feature set of the webpage to be detected and the text feature set of the website contain the same keywords;
Step S120, performing calculation according to the word frequency and weight of the keywords in the text feature set of the webpage to be detected, to obtain the text feature vector of the webpage to be detected;
Step S130, performing calculation according to the word frequency and weight of the keywords in the text feature set of the website, to obtain the text feature vector of the website.
First, the application firewall obtains the text feature set of the webpage to be detected and the text feature set of the website to which the webpage belongs. To ensure that the two text feature sets are comparable, they contain the same keywords. For example, if the text feature set obtained for the website is D(T1, T2, ..., Tm), the text feature set obtained for the webpage to be detected should also be D(T1, T2, ..., Tm), where T1, T2, ..., Tm are the feature items, that is, the keywords, and m is the number of keywords. Network administrators who are familiar with the content of the accessed website may preset the keywords of the text feature set according to the main content of the website; in most cases, however, the application firewall obtains the keywords automatically by processing the webpage text of the accessed website.
After the keywords are obtained, the application firewall performs the respective calculations according to the word frequency and weight of the keywords, to obtain the text feature vector of the webpage to be detected and the text feature vector of the website to which the webpage belongs. This embodiment calculates the text feature vectors mainly by means of the TF-IDF (term frequency-inverse document frequency) technique, whose principle is as follows. The word frequency is calculated by the TF formula TF = N/M: if a keyword occurs N times in an article of M words, TF = N/M is the word frequency of that keyword in the article. The inverse document frequency is an index that measures the weight of a keyword and can be calculated by the formula IDF = log(D/Dw), where D is the total number of documents in the corpus and Dw is the number of documents in which the keyword occurs. The larger Dw is, the more documents the keyword occurs in, the less suitable the keyword is as a distinguishing feature item of a document, and therefore the smaller its weight. The IDF-weighted word frequency is obtained by multiplying the word frequency of keyword Tx by the inverse document frequency of Tx, that is, Wx = TF(Tx) * IDF(Tx), which yields the text feature vector D(W1, W2, ..., Wm) corresponding to the text feature set D(T1, T2, ..., Tm).
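The TF-IDF weighting described above can be sketched in Python as follows. This is only an illustrative sketch: the function names are hypothetical, the natural logarithm is assumed because the text does not state a base, and giving a zero weight to a keyword that occurs in no corpus document is likewise an assumption.

```python
import math
from collections import Counter


def term_frequencies(tokens, keywords):
    """TF = N / M for each keyword: occurrences N over total word count M."""
    total = len(tokens)
    if total == 0:
        return [0.0 for _ in keywords]
    counts = Counter(tokens)
    return [counts[k] / total for k in keywords]


def inverse_document_frequency(keyword, corpus):
    """IDF = log(D / Dw), with D documents in the corpus and Dw containing the keyword."""
    d = len(corpus)
    dw = sum(1 for doc in corpus if keyword in doc)
    if dw == 0:
        return 0.0  # assumption: a keyword absent from the corpus gets zero weight
    return math.log(d / dw)  # assumption: natural logarithm


def feature_vector(tokens, keywords, corpus):
    """Wx = TF(Tx) * IDF(Tx) for every keyword Tx, in keyword order."""
    tfs = term_frequencies(tokens, keywords)
    return [tf * inverse_document_frequency(k, corpus) for tf, k in zip(tfs, keywords)]
```

Here `tokens` is the segmented text of one page (or of the merged website text), `keywords` is the shared feature set, and `corpus` is a list of token lists standing in for the corpus. A keyword that does not occur in `tokens` gets a weight of 0, since its word frequency is 0.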
According to the above principle, the text feature vector of the webpage to be detected is calculated as follows: the text Dk of the webpage to be detected is obtained; the word frequency of each keyword in Dk is calculated from the number of times the keyword occurs in Dk and the total number of words in Dk; the calculated word frequencies are then weighted by IDF, finally giving the text feature vector Dk(Wk1, Wk2, ..., Wkm) of the webpage to be detected. In particular, the weighted word frequency Wkx of a keyword Tx that does not occur in the webpage to be detected is 0.
According to the above principle, the text feature vector of the whole website is calculated as follows: the texts of all webpages of the website are merged to obtain the total text D0; the word frequency of each keyword in D0 is calculated from the number of times the keyword occurs in D0 and the total number of words in D0; the calculated word frequencies are then weighted by IDF, finally giving the text feature vector D0(W01, W02, ..., W0m) of the whole website.
Step S200, calculating the text similarity between the webpage to be detected and the website according to the obtained text feature vector of the webpage to be detected and the text feature vector of the website;
It should be noted that a tampered webpage may be plainly visible when accessed through a browser, or it may contain dark links (hidden links) that are hard to notice. Tampered webpages usually account for only a small fraction of all the webpages of a website, and their content differs considerably from that of the website as a whole. Since the degree of similarity between texts is usually highly correlated with their content, text similarity can be compared by means of the vector space model described above.
Specifically, after obtaining the text feature vector of the webpage to be detected and the text feature vector of the website, the application firewall calculates the text similarity between the webpage to be detected and the website from the relationship between the two feature vectors, for example by calculating the distance or the angle between them, and uses the calculation result as the text similarity between the webpage to be detected and the website.
Step S300, determining whether the text similarity is less than a preset threshold;
Step S400, if the text similarity is less than the preset threshold, determining that the webpage to be detected is a tampered webpage.
The application firewall determines whether the calculated text similarity is less than the preset threshold. The preset text similarity threshold can be obtained by self-learning classification over the webpages of a large number of websites on which webpage tampering has occurred, and network administrators can also set it flexibly according to actual needs. If the text similarity is less than the preset threshold, the application firewall determines that the detected webpage is a tampered webpage; the application firewall can then report the detection result and block users from accessing the webpage. Otherwise, it determines that the detected webpage is a normal webpage.
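The threshold decision can be sketched in Python as below. The text does not specify how the self-learning classification is carried out, so `pick_threshold` is only one simple assumption: it scans the midpoints between adjacent similarity scores of labeled samples and keeps the one that misclassifies the fewest samples. Both function names are hypothetical.

```python
def is_tampered(similarity: float, threshold: float) -> bool:
    """A page is judged tampered when its similarity is below the threshold."""
    return similarity < threshold


def pick_threshold(labeled_scores):
    """Choose a threshold from (similarity, tampered) samples.

    Assumption: candidate thresholds are the midpoints between adjacent
    distinct scores, and the candidate with the fewest errors wins.
    """
    scores = sorted({score for score, _ in labeled_scores})
    candidates = [(a + b) / 2 for a, b in zip(scores, scores[1:])]
    if not candidates:
        raise ValueError("need at least two distinct similarity scores")

    def errors(threshold):
        return sum(
            is_tampered(score, threshold) != tampered
            for score, tampered in labeled_scores
        )

    return min(candidates, key=errors)
```

An administrator-supplied threshold can be passed to `is_tampered` directly; `pick_threshold` is only needed when labeled samples are available.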
In this embodiment, the application firewall obtains the text feature vector of the webpage to be detected and the text feature vector of the website to which the webpage belongs; calculates the text similarity between the webpage to be detected and the website according to the two obtained text feature vectors; determines whether the text similarity is less than the preset threshold; and, if so, determines that the webpage to be detected is a tampered webpage. This embodiment detects whether a webpage has been tampered with by means of text similarity. Compared with existing blacklist keyword detection, it requires no collection of blacklist keywords and produces fewer false positives and false negatives, which improves the accuracy of webpage tampering detection. Compared with existing webpage digital fingerprint comparison, it requires no local deployment and supports remote, large-scale detection, which improves the efficiency of webpage tampering detection.
Further, referring to Fig. 3, Fig. 3 is a schematic diagram of the detailed steps of step S110 in Fig. 2. Based on the above embodiment, step S110 may comprise:
Step S111, obtaining the text of the website to which the webpage to be detected belongs;
Step S112, performing Chinese word segmentation and stop-word removal on the obtained text;
Step S113, extracting a number of keywords from the processing result, to obtain the text feature set of the website;
Step S114, using the text feature set of the website as the text feature set of the webpage to be detected.
In this embodiment, to make the keyword extraction more accurate, the application firewall first preprocesses all webpages of the website: it removes all code, including HTML (HyperText Markup Language) code, and keeps only the textual content of each webpage, forming the texts D1, D2, ..., Dn (where n is the number of webpages). These texts are merged to obtain the text D0 of the whole website. Chinese word segmentation and stop-word removal are then performed on D0. Chinese word segmentation cuts a sequence of Chinese characters into individual words. Stop-word removal uses a stop-word list to remove from the corpus the words, symbols, punctuation marks, garbled characters and the like that contribute little to identifying the content of the text but occur very frequently. Words such as "的" ("of"), "和" ("and"), "是" ("is") and "这" ("this") occur in almost any Chinese text yet contribute almost nothing to the meaning the text expresses; once such words are placed in the stop-word list, they can be removed from the text according to the list. The preprocessing result of the text D0 of the whole website is thus obtained.
The application firewall can calculate the word frequency of each word in the preprocessing result. If the word frequency of a word reaches a preset value, the word is taken as a keyword of the text D0. All keywords of the text D0 are extracted in this way, giving the text feature set D(T1, T2, ..., Tm) of the website, which also serves as the text feature set of the webpage to be detected.
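The preprocessing and keyword extraction of steps S111 to S114 can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the regular-expression tokenizer is only a stand-in for a real Chinese word segmenter (which the text requires but does not name), and the names `visible_text`, `tokenize`, `extract_keywords` and `min_frequency` are hypothetical.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html: str) -> str:
    """Strip markup and code, keeping only the textual content of a page."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def tokenize(text: str) -> list:
    # Assumption: a plain word tokenizer standing in for Chinese word segmentation.
    return re.findall(r"\w+", text.lower())


def extract_keywords(pages_html, stop_words, min_frequency):
    """Merge all pages into D0, drop stop words, and keep the words whose
    word frequency in D0 reaches min_frequency."""
    tokens = []
    for html in pages_html:
        tokens.extend(t for t in tokenize(visible_text(html)) if t not in stop_words)
    if not tokens:
        return []
    counts = Counter(tokens)
    total = len(tokens)
    return sorted(word for word, n in counts.items() if n / total >= min_frequency)
```

The returned list plays the role of the text feature set D(T1, T2, ..., Tm) and is reused unchanged for each webpage to be detected, so that page and website vectors have the same dimensions.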
Further, referring to Fig. 4, Fig. 4 is a schematic flowchart of the second embodiment of the webpage tampering detection method of the present invention. Based on the embodiment shown in Fig. 1, step S200 may comprise:
Step S210, calculating the cosine of the angle between the text feature vector of the webpage to be detected and the text feature vector of the website;
Step S220, using the calculation result as the text similarity between the webpage to be detected and the website.
In this embodiment, the application firewall calculates the cosine of the angle between the text feature vector of the webpage to be detected and the text feature vector of the website. If the text feature vector of the website is D0(W01, W02, ..., W0m) and the text feature vector of the webpage is Dk(Wk1, Wk2, ..., Wkm), where k denotes the k-th webpage, the cosine of the angle θ between vector D0 and vector Dk is calculated as:

$$\cos\theta = \frac{\sum_{i=1}^{m} W_{0i}\, W_{ki}}{\sqrt{\sum_{i=1}^{m} W_{0i}^{2}}\;\sqrt{\sum_{i=1}^{m} W_{ki}^{2}}}$$

This cosine value is used as the text similarity between the webpage to be detected and the whole website. The larger the value, the smaller the angle between vector D0 and vector Dk, and the higher the text similarity between the webpage to be detected and the website; the smaller the value, the larger the angle between vector D0 and vector Dk, and the lower the text similarity between the webpage to be detected and the website. Fig. 5 is a schematic diagram of the angle between the text feature vector Dk of a webpage and the text feature vector D0 of the website to which the webpage belongs.
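The cosine of the angle between the website vector D0 and the webpage vector Dk can be computed with a few lines of Python. The function name is hypothetical, and returning 0.0 when either vector is all zeros (for instance, a page that contains none of the keywords) is an assumption, since the cosine is undefined in that case.

```python
import math


def cosine_similarity(d0, dk):
    """Cosine of the angle between two equal-length weight vectors."""
    if len(d0) != len(dk):
        raise ValueError("vectors must have the same number of keywords")
    dot = sum(a * b for a, b in zip(d0, dk))
    norm0 = math.sqrt(sum(a * a for a in d0))
    normk = math.sqrt(sum(b * b for b in dk))
    if norm0 == 0.0 or normk == 0.0:
        return 0.0  # assumption: treat an all-zero vector as completely dissimilar
    return dot / (norm0 * normk)
```

With non-negative TF-IDF weights the result lies between 0 and 1, so a page sharing no keywords with its website scores 0 and falls below any positive threshold.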
By calculating the cosine of the angle between the text feature vector of the webpage and the text feature vector of the website to which the webpage belongs, this embodiment makes it possible to analyze quantitatively the text similarity between the webpage to be detected and the whole website, and the analysis is more reasonable and reliable.
Further, referring to Fig. 6, Fig. 6 is a schematic flowchart of the third embodiment of the webpage tampering detection method of the present invention. Based on the above embodiments, before step S100 the method may further comprise:
Step S500, crawling preset webpages to be detected at scheduled intervals by means of a crawler program;
or Step S600, when a network access request is detected, taking the webpage corresponding to the network access request as the webpage to be detected.
In this embodiment, the application firewall can perform active detection of webpage tampering. Specifically, a crawler program can be set up in the application firewall. According to the configured crawl targets, the crawler program periodically visits webpages and related links on the World Wide Web and downloads the webpage content. The crawl targets can be webpages related to a particular topic, and the crawl scope can be widened as needed; in a specific implementation, the targets can be configured in advance by network administrators. The application firewall then takes the webpages fetched by the crawler program as webpages to be detected and determines one by one whether they have been tampered with.
In addition, the application firewall can also perform passive detection of webpage tampering. Specifically, when the application firewall detects a network access request, it takes the webpage corresponding to the request as the webpage to be detected. In this way, when the traffic of a user accessing the website passes through the application firewall, the firewall can detect in real time whether the webpage the user is currently accessing has been tampered with. In further embodiments, to improve the efficiency of passive detection, passive detection can also rely on the results of active detection: while performing active detection, the application firewall stores information such as the website's text feature set and text feature vector in a preset text feature database. When a user accesses the Web server, the HTTP (HyperText Transfer Protocol) traffic passes through the application firewall; the firewall records the URL (Uniform Resource Locator) and the corresponding HTTP response content, obtains the text feature vector of the webpage corresponding to the HTTP response content, and compares it for text similarity with the text feature vector of the corresponding website in the text feature database, to determine whether the webpage has been tampered with.
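The passive detection path that reuses features stored during active detection can be sketched as follows. The in-memory `FeatureStore` is a hypothetical stand-in for the text feature database, and the `vectorize` and `similarity` callables are assumed to be supplied by the caller (for example, a TF-IDF vectorizer and a cosine similarity function); none of these names come from the text.

```python
from urllib.parse import urlsplit


class FeatureStore:
    """In-memory stand-in for the text feature database, keyed by host name."""

    def __init__(self):
        self._sites = {}

    def save(self, host, keywords, site_vector):
        """Store the website's feature set and vector (done during active detection)."""
        self._sites[host.lower()] = (list(keywords), list(site_vector))

    def lookup(self, url):
        """Return (keywords, site_vector) for the URL's host, or None if unknown."""
        host = urlsplit(url).hostname
        return self._sites.get(host) if host else None


def inspect_response(url, body_text, store, vectorize, similarity, threshold):
    """Classify one recorded HTTP response as 'normal', 'tampered' or 'unknown'."""
    entry = store.lookup(url)
    if entry is None:
        return "unknown"  # assumption: sites with no stored features are not judged
    keywords, site_vector = entry
    page_vector = vectorize(body_text, keywords)
    if similarity(page_vector, site_vector) < threshold:
        return "tampered"
    return "normal"
```

Keeping the website vector in the store means that only the single page in the response has to be vectorized at request time, which is the efficiency gain the passage describes.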
In this embodiment, a crawler program is set up to crawl the configured webpages at scheduled intervals so that active detection of webpage tampering is performed without manual intervention; remote, large-scale detection is possible, which improves the efficiency of webpage tampering detection. By taking the webpage the user is currently accessing as the webpage to be detected, real-time webpage tampering detection is achieved.
The present invention also provides an apparatus for detecting webpage tampering.
Referring to Fig. 7, Fig. 7 is a functional block diagram of the first embodiment of the webpage tampering detection apparatus of the present invention. The apparatus comprises:
an acquisition module 10, configured to obtain a text feature vector of a webpage to be detected and a text feature vector of the website to which the webpage to be detected belongs;
In this embodiment, webpage tampering detection can be performed by an application firewall placed between the Web browser and the Web server. The acquisition module 10 obtains the text feature vector of the webpage to be detected and the text feature vector of the website to which the webpage belongs, thereby building a vector space model.
In the vector space model, a text (Document, denoted D) refers to any machine-readable record, and a feature item (Term, denoted t) refers to a basic language unit that occurs in the text D and can represent the content of the text, consisting mainly of words or phrases. A text can be represented by its set of feature items as D(T1, T2, ..., Tn), where Tk is a feature item and 1 <= k <= n. For example, if a document contains the four feature items a, b, c and d, the document can be represented as D(a, b, c, d).
Referring to Fig. 8, Fig. 8 is a detailed functional block diagram of the acquisition module in Fig. 7. As one implementation, the acquisition module 10 may comprise:
an acquiring unit 11, configured to obtain a text feature set of the webpage to be detected and a text feature set of the website to which the webpage to be detected belongs, wherein the text feature set of the webpage to be detected and the text feature set of the website contain the same keywords;
a first computing unit 12, configured to perform calculation according to the word frequency and weight of the keywords in the text feature set of the webpage to be detected, to obtain the text feature vector of the webpage to be detected;
a second computing unit 13, configured to perform calculation according to the word frequency and weight of the keywords in the text feature set of the website, to obtain the text feature vector of the website.
First, the acquiring unit 11 obtains the text feature set of the webpage to be detected and the text feature set of the website to which the webpage belongs. To ensure that the two text feature sets are comparable, they contain the same keywords. For example, if the text feature set obtained for the website is D(T1, T2, ..., Tm), the text feature set obtained for the webpage to be detected should also be D(T1, T2, ..., Tm), where T1, T2, ..., Tm are the feature items, that is, the keywords, and m is the number of keywords. Network administrators who are familiar with the content of the accessed website may preset the keywords of the text feature set according to the main content of the website; in most cases, however, the application firewall obtains the keywords automatically by processing the webpage text of the accessed website.
After the keywords are obtained, the first computing unit 12 and the second computing unit 13 each perform a calculation based on the term frequency and weight of the keywords, obtaining the text feature vector of the web page to be detected and the text feature vector of the website to which it belongs, respectively. This embodiment mainly uses the TF-IDF (term frequency-inverse document frequency) technique to calculate the text feature vectors. Its principle is as follows. The term frequency is calculated by the formula TF = N/M: if a keyword occurs N times in an article of M words, then TF = N/M is the term frequency of that keyword in the article. The inverse document frequency is a measure of keyword weight and is calculated by the formula IDF = log(D/Dw), where D is the total number of documents in the corpus and Dw is the number of documents in which the keyword occurs. The larger Dw is, the more documents the keyword occurs in and the less it serves as a distinguishing feature item of any one document, so the smaller its weight. The IDF-weighted term frequency is obtained by multiplying the term frequency of keyword Tx by the inverse document frequency of Tx, i.e. Wx = TF(Tx) * IDF(Tx), which yields the text feature vector D(W1, W2, ..., Wm) corresponding to the text feature set D(T1, T2, ..., Tm).
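The TF, IDF and weighted term frequency formulas above can be illustrated with a minimal Python sketch. The function names, the toy corpus and the choice to return 0 when Dw = 0 are assumptions made for illustration; the source does not specify them, nor the base of the logarithm (the natural logarithm is used here).

```python
import math

def term_frequency(keyword, words):
    """TF = N / M: occurrences of the keyword divided by the total word count."""
    if not words:
        return 0.0
    return words.count(keyword) / len(words)

def inverse_document_frequency(keyword, corpus):
    """IDF = log(D / Dw), with D documents in total and Dw containing the keyword."""
    containing = sum(1 for doc in corpus if keyword in doc)
    if containing == 0:
        return 0.0  # assumption: the source does not say how to treat Dw = 0
    return math.log(len(corpus) / containing)

def feature_vector(keywords, words, corpus):
    """Wx = TF(Tx) * IDF(Tx) for every keyword Tx of the shared feature set."""
    return [
        term_frequency(k, words) * inverse_document_frequency(k, corpus)
        for k in keywords
    ]

# Toy corpus of three already-segmented documents (hypothetical data).
corpus = [
    ["news", "sports", "news", "weather"],
    ["sports", "scores", "league"],
    ["weather", "forecast", "rain"],
]
keywords = ["news", "sports", "casino"]
vector = feature_vector(keywords, corpus[0], corpus)
```

In this toy data "news" occurs in only one document and therefore receives a larger weight than "sports", which occurs in two, while the absent keyword "casino" receives weight 0.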
Following the above principle, the first computing unit 12 calculates the text feature vector of the web page to be detected as follows: it obtains the text Dk of the web page to be detected, calculates the term frequency of each keyword in Dk from the number of times the keyword occurs in Dk and the total number of words in Dk, and then weights the calculated term frequencies by IDF, finally obtaining the text feature vector Dk(Wk1, Wk2, ..., Wkm) of the web page to be detected. In particular, the weighted term frequency Wkx of a keyword Tx that does not occur in the web page to be detected is 0.
Following the above principle, the second computing unit 13 calculates the text feature vector of the whole website as follows: it merges the texts of all web pages of the website to obtain the total text D0, calculates the term frequency of each keyword in D0 from the number of times the keyword occurs in D0 and the total number of words in D0, and then weights the calculated term frequencies by IDF, finally obtaining the text feature vector D0(W01, W02, ..., W0m) of the whole website.
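The two procedures can be sketched together as follows. This is only an illustration under stated assumptions: the website's own pages are used as the corpus for IDF (the source says only that D is the number of documents in the corpus), and `idf_table`, `weighted_vector` and the sample pages are hypothetical.

```python
import math
from collections import Counter

def idf_table(keywords, pages):
    """IDF per keyword, treating the site's pages as the corpus (an assumption)."""
    table = {}
    for key in keywords:
        dw = sum(1 for page in pages if key in page)
        table[key] = math.log(len(pages) / dw) if dw else 0.0
    return table

def weighted_vector(keywords, words, idf):
    """IDF-weighted term frequency of every keyword in one word list."""
    counts = Counter(words)
    total = len(words)
    return [(counts[k] / total if total else 0.0) * idf[k] for k in keywords]

# Already-segmented page texts D1..Dn (hypothetical data).
pages = [
    ["product", "price", "product", "support"],
    ["product", "support", "contact"],
    ["casino", "bonus", "casino", "casino"],
]
keywords = ["price", "support", "contact", "casino"]
idf = idf_table(keywords, pages)

merged = [word for page in pages for word in page]      # total text D0
site_vector = weighted_vector(keywords, merged, idf)    # D0(W01, ..., W0m)
page_vector = weighted_vector(keywords, pages[0], idf)  # Dk(Wk1, ..., Wkm)
```

Both vectors have one component per keyword in the same order, and the components of `page_vector` for "contact" and "casino" are 0 because those keywords do not occur in the first page.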
a computing module 20, configured to calculate the text similarity between the web page to be detected and the website according to the obtained text feature vector of the web page to be detected and the text feature vector of the website;
It should be noted that a tampered web page may be plainly visible when viewed in a browser, or it may carry hidden links (dark chains) that are not readily noticeable. Tampered web pages usually make up only a small fraction of the pages of the whole website, and the content of a tampered page differs considerably from that of the website as a whole. Because the similarity between texts is generally highly correlated with their content, text similarity can be compared using the above vector space model.
Specifically, after the acquisition module 10 obtains the text feature vector of the web page to be detected and the text feature vector of the website, the computing module 20 calculates the text similarity between the web page to be detected and the website from the relationship between the two feature vectors, for example the distance or the angle between them, and takes the result as the text similarity between the web page to be detected and the website.
a judging module 30, configured to judge whether the text similarity is less than a preset threshold and, if so, to determine that the web page to be detected is a tampered web page.
The judging module 30 judges whether the calculated text similarity is less than the preset threshold. The preset text similarity threshold can be obtained by self-learning classification over a large number of web pages from websites where tampering has already occurred, and the network administrator can also set it flexibly according to actual needs. If the text similarity is less than the preset threshold, the judging module 30 determines that the detected web page is a tampered web page; the application firewall can then report the detection result and block users from accessing the page. Otherwise, the detected web page is determined to be a normal web page.
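The threshold judgment can be sketched as below. The threshold value 0.2, the function name `judge_pages` and the similarity figures are hypothetical; the source leaves the threshold to self-learning or to the administrator.

```python
def judge_pages(similarities, threshold):
    """Split pages into (tampered, normal) by comparing similarity to the threshold."""
    tampered = sorted(url for url, s in similarities.items() if s < threshold)
    normal = sorted(url for url, s in similarities.items() if s >= threshold)
    return tampered, normal

THRESHOLD = 0.2  # hypothetical preset threshold

# Hypothetical text similarities between each page and its website.
similarities = {
    "/index.html": 0.83,
    "/about.html": 0.61,
    "/promo.html": 0.04,
    "/edge.html": 0.2,
}
tampered, normal = judge_pages(similarities, THRESHOLD)
```

A page whose similarity equals the threshold is treated as normal here, because the source flags a page only when its similarity is less than the threshold.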
In this embodiment, the acquisition module 10 obtains the text feature vector of the web page to be detected and the text feature vector of the website to which the web page belongs; the computing module 20 calculates the text similarity between the web page to be detected and the website according to the obtained text feature vectors; and the judging module 30 judges whether the text similarity is less than the preset threshold and, if so, determines that the web page to be detected is a tampered web page. This embodiment detects whether a web page has been tampered with by means of text similarity. Compared with existing blacklist keyword detection, it requires no collection of blacklist keywords and produces fewer false positives and false negatives, which improves the accuracy of webpage tampering detection. Compared with existing web page digital fingerprint comparison, it requires no local deployment and supports remote, large-scale detection, which improves the efficiency of webpage tampering detection.
Further, with continued reference to Fig. 8, the acquiring unit 11 is also configured to: obtain the text of the website to which the web page to be detected belongs; perform Chinese word segmentation and stop-word removal on the obtained text; extract a number of keywords from the processing result to obtain the text feature set of the website; and use the text feature set of the website as the text feature set of the web page to be detected.
In this embodiment, to make keyword extraction more accurate, the acquiring unit 11 first preprocesses all web pages of the website: it removes all code, including HTML (HyperText Markup Language) code, and retains only the textual content of each page, forming the texts D1, D2, ..., Dn (where n is the number of web pages). These texts are merged to obtain the text D0 of the whole website. Then Chinese word segmentation and stop-word removal are performed on D0. Chinese word segmentation cuts a sequence of Chinese characters into individual words. Stop-word removal uses the words in a stop-word list to remove from the corpus those words, symbols, punctuation marks and garbled characters that contribute little to identifying the content of the text but occur very frequently. For example, words such as "and", "is" and "this" occur in almost any Chinese text but contribute almost nothing to the meaning the text expresses. Such words are placed in the stop-word list, so that words without practical meaning can be removed from the text according to the list. This yields the preprocessing result for the text D0 of the whole website.
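The preprocessing steps can be sketched with Python's standard library. This is a simplified illustration, not the patent's implementation: a regular-expression tokenizer stands in for Chinese word segmentation (a real system would use a dedicated segmenter), and the stop-word list, class name and sample pages are hypothetical.

```python
import re
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)

def page_text(html):
    """Strip markup and code from one page, keeping only its text (D1..Dn)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)

STOP_WORDS = {"the", "and", "is", "this", "of"}  # hypothetical stop-word list

def preprocess(html_pages):
    """Merge page texts into D0, tokenize, and drop stop words and punctuation."""
    merged = " ".join(page_text(html) for html in html_pages)
    # Stand-in for Chinese word segmentation: split into runs of letters/digits.
    tokens = re.findall(r"[^\W_]+", merged.lower())
    return [token for token in tokens if token not in STOP_WORDS]

pages = [
    "<html><body><h1>Product news</h1><script>var x = 1;</script></body></html>",
    "<p>This is the price of the product.</p>",
]
words = preprocess(pages)
```

The script body and the punctuation are discarded, and the remaining word list is the input for keyword extraction.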
The acquiring unit 11 can calculate the term frequency of each word in the preprocessing result; if the term frequency of a word reaches a preset value, the word is taken as a keyword of the text D0. All keywords of D0 are extracted in this way, giving the text feature set D(T1, T2, ..., Tm) of the website, which also serves as the text feature set of the web page to be detected.
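Keyword extraction by a term-frequency cut-off can be sketched as follows. The cut-off of 0.25, the function name `extract_keywords` and the word list are hypothetical; the source says only that a word becomes a keyword once its term frequency reaches a preset value.

```python
from collections import Counter

def extract_keywords(words, min_frequency):
    """Return the words whose term frequency in the text reaches min_frequency."""
    total = len(words)
    if total == 0:
        return []
    counts = Counter(words)
    return sorted(w for w, c in counts.items() if c / total >= min_frequency)

# Preprocessed words of the whole-site text D0 (hypothetical data).
site_words = ["product", "news", "price", "product", "support", "product", "price"]
feature_set = extract_keywords(site_words, min_frequency=0.25)
```

The resulting list plays the role of D(T1, T2, ..., Tm) and is reused unchanged for every page of the site, so that page and site vectors share the same dimensions.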
Further, with continued reference to Fig. 7, the computing module 20 is also configured to: calculate the cosine of the angle between the text feature vector of the web page to be detected and the text feature vector of the website; and use the result as the text similarity between the web page to be detected and the website.
In this embodiment, the computing module 20 calculates the cosine of the angle between the text feature vector of the web page to be detected and the text feature vector of the website. Let the text feature vector of the website be D0(W01, W02, ..., W0m) and the text feature vector of the web page be Dk(Wk1, Wk2, ..., Wkm), where k denotes the k-th web page. The cosine of the angle θ between vector D0 and vector Dk is then given by the standard formula:

$$\cos\theta = \frac{\sum_{i=1}^{m} W_{0i} W_{ki}}{\sqrt{\sum_{i=1}^{m} W_{0i}^{2}}\,\sqrt{\sum_{i=1}^{m} W_{ki}^{2}}}$$
The above cosine value is used as the text similarity between the web page to be detected and the whole website. The larger the value, the smaller the angle between vector D0 and vector Dk, indicating a higher text similarity between the web page to be detected and the website; the smaller the value, the larger the angle between vector D0 and vector Dk, indicating a lower text similarity. Fig. 5 is a schematic diagram of the angular relationship between the text feature vector Dk of a web page and the text feature vector D0 of the website to which the web page belongs.
By calculating the cosine of the angle between the text feature vector of the web page and the text feature vector of the website to which it belongs, this embodiment enables quantitative analysis of the text similarity between the web page to be detected and the whole website, making the analysis more reasonable and reliable.
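The cosine similarity between D0 and Dk can be computed as in the following sketch. The function name and the decision to return 0 for an all-zero vector are assumptions; the source does not address that case.

```python
import math

def cosine_similarity(d0, dk):
    """Cosine of the angle between two feature vectors of equal length."""
    if len(d0) != len(dk):
        raise ValueError("feature vectors must share the same keywords")
    dot = sum(a * b for a, b in zip(d0, dk))
    norm = math.sqrt(sum(a * a for a in d0)) * math.sqrt(sum(b * b for b in dk))
    # Assumption: a page containing none of the keywords has similarity 0.
    return dot / norm if norm else 0.0

same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
no_keywords = cosine_similarity([0.3, 0.1], [0.0, 0.0])
```

Because IDF-weighted term frequencies are non-negative, the result lies between 0 and 1, so a single threshold in that range is enough to separate tampered pages from normal ones.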
Further, referring to Fig. 9, Fig. 9 is a schematic diagram of the functional modules of a second embodiment of the webpage tampering detection apparatus of the present invention. Based on the above embodiments, the apparatus may further include:
a crawling module 40, configured to crawl preset web pages to be detected at scheduled times by means of a crawler program;
the acquisition module 10 is further configured to, when a network access request is detected, take the web page corresponding to the network access request as the web page to be detected.
In this embodiment, the application firewall can perform active detection of webpage tampering. Specifically, a crawler program can be set up in the application firewall. According to the configured crawl targets, the crawler program periodically accesses web pages and related links on the World Wide Web and downloads the page content. The crawl targets may be web pages related to a particular topic, or the crawl scope may be expanded as needed; in a specific implementation, the targets can be configured in advance by the network administrator. The application firewall then takes the web pages crawled by the crawler program as web pages to be detected and judges one by one whether they have been tampered with.
In addition, the application firewall can also perform passive detection of webpage tampering. Specifically, when the acquisition module 10 detects a network access request, it takes the web page corresponding to the request as the web page to be detected. In this way, when the traffic of a user accessing the website passes through the application firewall, whether the web page the user is currently accessing has been tampered with can be detected in real time. In further embodiments, to improve the efficiency of passive detection, passive detection may also rely on the results of active detection: while performing active detection, the application firewall stores information such as the website's text feature set and text feature vector in a preset text feature database. When a user accesses the Web server, the HTTP (HyperText Transfer Protocol) traffic passes through the application firewall. The firewall records the URL (Uniform Resource Locator) and the corresponding HTTP response content, obtains the text feature vector of the web page corresponding to the HTTP response content, and compares it for text similarity with the text feature vector of the corresponding website in the text feature database, to judge whether the web page has been tampered with.
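The passive check against the text feature database can be sketched as follows. The database layout (a dictionary keyed by host name holding keywords, IDF weights and the site vector), the function names, the threshold and all numbers are hypothetical; the source states only that the site's feature set and feature vector are stored and compared against the vector of the page in the HTTP response.

```python
import math
from collections import Counter
from urllib.parse import urlsplit

def cosine(a, b):
    """Cosine of the angle between two equal-length vectors (0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def check_response(url, page_words, feature_db, threshold):
    """Return True if tampered, False if normal, None if the site is not in the database."""
    entry = feature_db.get(urlsplit(url).hostname)
    if entry is None:
        return None
    counts = Counter(page_words)
    total = len(page_words) or 1
    page_vector = [counts[k] / total * entry["idf"][k] for k in entry["keywords"]]
    return cosine(entry["site_vector"], page_vector) < threshold

# Hypothetical database entry, as it might be stored during active detection.
feature_db = {
    "www.example.com": {
        "keywords": ["product", "price", "casino"],
        "idf": {"product": 0.4, "price": 1.1, "casino": 1.1},
        "site_vector": [0.12, 0.10, 0.01],
    }
}
normal = check_response(
    "http://www.example.com/a", ["product", "price", "product"], feature_db, 0.5
)
defaced = check_response(
    "http://www.example.com/b", ["casino", "casino", "bonus"], feature_db, 0.5
)
unknown = check_response("http://other.example.org/", ["product"], feature_db, 0.5)
```

Reusing the stored site vector means each request costs only one page-vector computation and one cosine, which is the efficiency gain the paragraph above describes.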
In this embodiment, a crawler program is set up to crawl the configured web pages at scheduled times and thereby perform active detection of webpage tampering; no manual intervention is needed and remote, large-scale detection is possible, which improves the efficiency of webpage tampering detection. By taking the web page the user is currently accessing as the web page to be detected, real-time webpage tampering detection is achieved.
The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Claims (10)
1. A method for detecting webpage tampering, characterized in that the method comprises the following steps:
obtaining a text feature vector of a web page to be detected and a text feature vector of the website to which the web page to be detected belongs;
calculating the text similarity between the web page to be detected and the website according to the obtained text feature vector of the web page to be detected and the text feature vector of the website;
judging whether the text similarity is less than a preset threshold;
if so, determining that the web page to be detected is a tampered web page.
2. The method according to claim 1, characterized in that the step of obtaining the text feature vector of the web page to be detected and the text feature vector of the website to which the web page to be detected belongs comprises:
obtaining the text feature set of the web page to be detected and the text feature set of the website to which the web page to be detected belongs, wherein the text feature set of the web page to be detected and the text feature set of the website contain the same keywords;
performing a calculation based on the term frequency and weight of the keywords in the text feature set of the web page to be detected, to obtain the text feature vector of the web page to be detected;
performing a calculation based on the term frequency and weight of the keywords in the text feature set of the website, to obtain the text feature vector of the website.
3. The method according to claim 2, characterized in that the step of obtaining the text feature set of the web page to be detected and the text feature set of the website to which the web page to be detected belongs comprises:
obtaining the text of the website to which the web page to be detected belongs;
performing Chinese word segmentation and stop-word removal on the obtained text;
extracting a number of keywords from the processing result to obtain the text feature set of the website;
using the text feature set of the website as the text feature set of the web page to be detected.
4. The method according to any one of claims 1 to 3, characterized in that the step of calculating the text similarity between the web page to be detected and the website according to the obtained text feature vector of the web page to be detected and the text feature vector of the website comprises:
calculating the cosine of the angle between the text feature vector of the web page to be detected and the text feature vector of the website;
using the result of the calculation as the text similarity between the web page to be detected and the website.
5. The method according to claim 4, characterized in that, before the step of obtaining the text feature vector of the web page to be detected and the text feature vector of the website to which the web page to be detected belongs, the method further comprises:
crawling preset web pages to be detected at scheduled times by means of a crawler program;
or, when a network access request is detected, taking the web page corresponding to the network access request as the web page to be detected.
6. An apparatus for detecting webpage tampering, characterized in that the apparatus comprises:
an acquisition module, configured to obtain a text feature vector of a web page to be detected and a text feature vector of the website to which the web page to be detected belongs;
a computing module, configured to calculate the text similarity between the web page to be detected and the website according to the obtained text feature vector of the web page to be detected and the text feature vector of the website;
a judging module, configured to judge whether the text similarity is less than a preset threshold and, if so, to determine that the web page to be detected is a tampered web page.
7. The apparatus according to claim 6, characterized in that the acquisition module comprises:
an acquiring unit, configured to obtain the text feature set of the web page to be detected and the text feature set of the website to which the web page to be detected belongs, wherein the text feature set of the web page to be detected and the text feature set of the website contain the same keywords;
a first computing unit, configured to perform a calculation based on the term frequency and weight of the keywords in the text feature set of the web page to be detected, to obtain the text feature vector of the web page to be detected;
a second computing unit, configured to perform a calculation based on the term frequency and weight of the keywords in the text feature set of the website, to obtain the text feature vector of the website.
8. The apparatus according to claim 7, characterized in that the acquiring unit is further configured to:
obtain the text of the website to which the web page to be detected belongs;
perform Chinese word segmentation and stop-word removal on the obtained text;
extract a number of keywords from the processing result to obtain the text feature set of the website;
use the text feature set of the website as the text feature set of the web page to be detected.
9. The apparatus according to any one of claims 6 to 8, characterized in that the computing module is further configured to:
calculate the cosine of the angle between the text feature vector of the web page to be detected and the text feature vector of the website;
use the result of the calculation as the text similarity between the web page to be detected and the website.
10. The apparatus according to claim 9, characterized in that the apparatus further comprises:
a crawling module, configured to crawl preset web pages to be detected at scheduled times by means of a crawler program;
the acquisition module being further configured to, when a network access request is detected, take the web page corresponding to the network access request as the web page to be detected.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611158763.4A CN106685936B (en) | 2016-12-14 | 2016-12-14 | Webpage tampering detection method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106685936A true CN106685936A (en) | 2017-05-17 |
CN106685936B CN106685936B (en) | 2020-07-31 |
Family
ID=58868121
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611158763.4A Active CN106685936B (en) | 2016-12-14 | 2016-12-14 | Webpage tampering detection method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106685936B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100622129B1 (en) * | 2005-04-14 | 2006-09-19 | 한국전자통신연구원 | Dynamically changing web page defacement validation system and method |
CN102170446A (en) * | 2011-04-29 | 2011-08-31 | 南京邮电大学 | Fishing webpage detection method based on spatial layout and visual features |
CN102708186A (en) * | 2012-05-11 | 2012-10-03 | 上海交通大学 | Identification method of phishing sites |
CN103077348A (en) * | 2012-12-28 | 2013-05-01 | 华为技术有限公司 | Method and device for vulnerability scanning of Web site |
CN102999638A (en) * | 2013-01-05 | 2013-03-27 | 南京邮电大学 | Phishing website detection method excavated based on network group |
CN103927480A (en) * | 2013-01-14 | 2014-07-16 | 腾讯科技(深圳)有限公司 | Method, device and system for identifying malicious web page |
CN104166725A (en) * | 2014-08-26 | 2014-11-26 | 哈尔滨工业大学(威海) | Phishing website detection method |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107301355A (en) * | 2017-06-20 | 2017-10-27 | 深信服科技股份有限公司 | A kind of webpage tamper monitoring method and device |
CN107301355B (en) * | 2017-06-20 | 2021-07-02 | 深信服科技股份有限公司 | Webpage tampering monitoring method and device |
CN107566415A (en) * | 2017-10-25 | 2018-01-09 | 国家电网公司 | Homepage method for pushing and device |
CN107580075A (en) * | 2017-10-25 | 2018-01-12 | 国家电网公司 | Homepage method for pushing and system |
CN107580075B (en) * | 2017-10-25 | 2021-07-20 | 国家电网公司 | Homepage pushing method and system |
EP3703329B1 (en) * | 2017-10-26 | 2024-03-20 | New H3C Security Technologies Co., Ltd. | Webpage request identification |
CN109981555A (en) * | 2017-12-28 | 2019-07-05 | 腾讯科技(深圳)有限公司 | To the processing method of web data, device, equipment, terminal and storage medium |
CN108306878A (en) * | 2018-01-30 | 2018-07-20 | 平安科技(深圳)有限公司 | Detection method for phishing site, device, computer equipment and storage medium |
CN108520185A (en) * | 2018-04-16 | 2018-09-11 | 深信服科技股份有限公司 | Detect method, apparatus, equipment and the computer readable storage medium of webpage tamper |
CN109165529A (en) * | 2018-08-14 | 2019-01-08 | 杭州安恒信息技术股份有限公司 | A kind of dark chain altering detecting method, device and computer readable storage medium |
CN111563276A (en) * | 2019-01-25 | 2020-08-21 | 深信服科技股份有限公司 | Webpage tampering detection method, detection system and related equipment |
CN111563276B (en) * | 2019-01-25 | 2024-04-09 | 深信服科技股份有限公司 | Webpage tampering detection method, detection system and related equipment |
CN110134901A (en) * | 2019-04-30 | 2019-08-16 | 哈尔滨英赛克信息技术有限公司 | A kind of multilink webpage tamper determination method based on flow analysis |
CN110134901B (en) * | 2019-04-30 | 2023-06-16 | 哈尔滨英赛克信息技术有限公司 | Multilink webpage tampering judging method based on flow analysis |
CN110532784A (en) * | 2019-09-04 | 2019-12-03 | 杭州安恒信息技术股份有限公司 | A kind of dark chain detection method, device, equipment and computer readable storage medium |
CN113806732B (en) * | 2020-06-16 | 2023-11-03 | 深信服科技股份有限公司 | Webpage tampering detection method, device, equipment and storage medium |
CN113806732A (en) * | 2020-06-16 | 2021-12-17 | 深信服科技股份有限公司 | Webpage tampering detection method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN106685936B (en) | 2020-07-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106685936A (en) | Webpage defacement detection method and apparatus | |
CN104572977B (en) | A kind of agricultural product quality and safety event online test method | |
CN104077396A (en) | Method and device for detecting phishing website | |
CN103544436B (en) | System and method for distinguishing phishing websites | |
CN104899508B (en) | A kind of multistage detection method for phishing site and system | |
CN103685174B (en) | A kind of detection method for phishing site of independent of sample | |
Hara et al. | Visual similarity-based phishing detection without victim site information | |
CN102436563B (en) | Method and device for detecting page tampering | |
US8561185B1 (en) | Personally identifiable information detection | |
CN107241352A (en) | A kind of net security accident classificaiton and Forecasting Methodology and system | |
CN102591965B (en) | Method and device for detecting black chain | |
CN102446255B (en) | Method and device for detecting page tamper | |
CN110233849A (en) | The method and system of network safety situation analysis | |
CN104156490A (en) | Method and device for detecting suspicious fishing webpage based on character recognition | |
CN102833270A (en) | Method and device for detecting SQL (structured query language) injection attacks and firewall with device | |
CN103679053B (en) | A kind of detection method of webpage tamper and device | |
CN110727766A (en) | Method for detecting sensitive words | |
CN108337269A (en) | A kind of WebShell detection methods | |
CN109918621A (en) | Newsletter archive infringement detection method and device based on digital finger-print and semantic feature | |
Katragadda et al. | Framework for real-time event detection using multiple social media sources | |
CN104158828A (en) | Method and system for identifying doubtful phishing webpage on basis of cloud content rule base | |
CN104036190A (en) | Method and device for detecting page tampering | |
Liu et al. | Multi-scale semantic deep fusion models for phishing website detection | |
Mythreya et al. | Prediction and prevention of malicious URL using ML and LR techniques for network security: machine learning | |
CN101471781A (en) | Method and system for processing script injection event |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | Address after: Nanshan District Xueyuan Road in Shenzhen city of Guangdong province 518052 No. 1001 Nanshan Chi Park building A1 layer; Applicant after: SANGFOR TECHNOLOGIES Inc.; Address before: Nanshan District Xueyuan Road in Shenzhen city of Guangdong province 518052 No. 1001 Nanshan Chi Park building A1 layer; Applicant before: Sangfor Technologies Co.,Ltd. |
CB02 | Change of applicant information | ||
GR01 | Patent grant | ||
GR01 | Patent grant |