CN110134901A - Multi-link webpage tampering determination method based on traffic analysis - Google Patents

Multi-link webpage tampering determination method based on traffic analysis

Info

Publication number
CN110134901A
Authority
CN
China
Prior art keywords
webpage
node
distorted
text
conclusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910364169.8A
Other languages
Chinese (zh)
Other versions
CN110134901B (en)
Inventor
杨武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Talent Information Technology Co Ltd
Original Assignee
Harbin Talent Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Talent Information Technology Co Ltd
Priority to CN201910364169.8A
Publication of CN110134901A
Application granted
Publication of CN110134901B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/986 Document structures and storage, e.g. HTML extensions
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a multi-link webpage tampering determination method based on traffic analysis. The method comprises the following steps: Step 1, configure the website rules; Step 2, capture the webpage at nodes on multiple links and compare the historical page with the current page using a similarity comparison algorithm to conclude whether the page has been tampered with; Step 3, collect the conclusions of the multiple link nodes and analyze them together to determine whether the page was tampered with in transit (traffic tampering) or at the source (source tampering). The invention takes full account of the characteristics of webpage structure and proposes the concept of layer weight. Combining webpage structure with the characteristics of webpage updates and of tampering, it proposes the concepts of element classification, element influence factor and important attribute. Combining webpage content with the characteristics of webpage updates and of tampering, it proposes an anomaly criterion for content sets. The false alarm rate of the method is much lower than that of other webpage tampering determination techniques, and the method can detect whether a page has been subjected to traffic hijacking.

Description

Multi-link webpage tampering determination method based on traffic analysis
Technical field
The present invention relates to webpage tampering determination techniques, and in particular to a multi-link webpage tampering determination method based on traffic analysis.
Background
According to where they are deployed, existing webpage tampering determination techniques fall into two broad classes: local determination and remote determination. Current webpage tamper-proofing software mostly uses local determination. Among existing products, WebGuard from Tianjin StarNet, InforGuard from the middleware company CVIC SE, iGuard from Shanghai Tiancun, barracuda from Barracuda, and E-lock's Tripwire webalarm all use local determination, while few products use remote determination and those that do are not yet mature.
Among tampering determination techniques based on webpage similarity, the edit-distance-based comparison method can obtain the similarity of static pages simply and quickly and judge from it whether a page has been tampered with. For dynamic pages the method still yields a similarity value, but the value carries no reference value: it cannot reliably distinguish a normal page update from tampering. The comparison method based on webpage structure can judge changes in page structure accurately. Its similarity algorithm targets structure only, that is, it judges from the similarity to the historical page whether the two pages are structurally identical and hence whether the page has been tampered with. Applied to tampering determination, this method covers tampering behavior incompletely: if a malicious program alters only the text displayed on the page, the method will not reach the correct conclusion. The comparison method based on semantic analysis suits pages with a clear topic. Applied to tampering determination, it will not reach the correct conclusion for sites whose topic is not clear enough, such as news or announcement sites. Even for a page with a clear topic, if the tampered content is on the same topic as the original content, the tampering will not be detected.
Summary of the invention
To address the above problems of existing webpage tampering determination methods, the present invention provides a multi-link webpage tampering determination method based on traffic analysis that has a low false alarm rate and can detect whether a page has been subjected to traffic hijacking.
The object of the invention is achieved through the following technical solution:
A multi-link webpage tampering determination method based on traffic analysis, comprising the following steps:
Step 1: configure the website rules;
Step 2: capture the webpage at nodes on multiple links, and compare the historical page with the current page using a similarity comparison algorithm to conclude whether the page has been tampered with;
Step 3: collect the conclusions of the multiple link nodes and analyze them together to determine whether the page was tampered with in transit (traffic tampering) or at the source (source tampering).
Compared with the prior art, the present invention has the following advantages:
1. It takes full account of the characteristics of webpage structure. A webpage is structured as an inverted tree: the closer an element is to the root node, the greater its influence on the whole page structure. If a parent node changes, its child nodes are very likely to change as well; in other words, child nodes change along with their parent. Different layers of the webpage tree are therefore assigned different weights, which gives the concept of layer weight: the degree to which a layer influences the whole page structure.
2. Combining webpage structure with the characteristics of webpage updates and of tampering, it proposes the concept of element classification, meaning that certain elements belong to the same class and express the same meaning; the concept of element influence factor, meaning the degree to which a change in an element affects the structure; and the concept of important attribute, meaning that certain attributes of an element are frequently tampered with and are therefore significant for tampering determination.
3. Combining webpage content with the characteristics of webpage updates and of tampering, it proposes an anomaly criterion for content sets, which specifies which changes to a content set belong to normal site updates and which belong to tampering. The false alarm rate is much lower than that of other webpage tampering determination techniques.
Brief description of the drawings
Fig. 1 shows the webpage tampering determination model.
Fig. 2 is a flowchart of the webpage similarity comparison algorithm.
Fig. 3 shows the classification of text set update cases.
Fig. 4 shows the network topology of the webpage tampering determination model based on multi-point similarity comparison.
Detailed description of the embodiments
The technical solution of the invention is further described below with reference to the drawings, but is not limited thereto. Any modification or equivalent replacement of the technical solution that does not depart from its spirit and scope falls within the scope of protection of the invention.
The present invention provides a multi-link webpage tampering determination method based on traffic analysis. As shown in Fig. 1, the method comprises the following steps:
Step 1: add the target website and configure the website rules.
The webpage is divided into dynamic regions and fixed regions. As the page is updated, the content of the dynamic regions changes while the content of the fixed regions stays essentially unchanged. The main purpose of configuring the website rules is to determine these regions. There are two configuration modes:
1) Manual mode: an operator specifies which parts of the target page are fixed regions and which are dynamic regions. The content to configure is: which content sets in the page's DOM tree belong to the fixed regions, and which content sets belong to the dynamic regions.
2) Automatic mode (no manual specification): the first M versions of the page are crawled (each crawled version differs from the previous one), and consecutive versions are compared pairwise to derive the dynamic and fixed regions. If M is too small, the fixed regions may be misjudged, which affects the final conclusion. The content to configure is the value of M.
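The automatic mode can be sketched as follows. This is a minimal illustration rather than the patent's implementation: it assumes each crawled version has already been flattened into a dict mapping a DOM path to the text found there, and `split_regions` and the path strings are hypothetical names.

```python
def split_regions(snapshots):
    """Classify each DOM path as fixed or dynamic from M crawled versions.

    snapshots: list of dicts mapping a DOM path (hypothetical key format)
    to the text found at that path in one crawled version.
    """
    if len(snapshots) < 2:
        raise ValueError("need at least two snapshots (M >= 2)")
    fixed, dynamic = set(), set()
    all_paths = set().union(*(s.keys() for s in snapshots))
    for path in all_paths:
        values = [s.get(path) for s in snapshots]
        # A path is fixed only if it is present and identical in every version.
        if all(v is not None and v == values[0] for v in values):
            fixed.add(path)
        else:
            dynamic.add(path)
    return fixed, dynamic


versions = [
    {"/html/body/h1": "Site name", "/html/body/ul": "news A"},
    {"/html/body/h1": "Site name", "/html/body/ul": "news B"},
    {"/html/body/h1": "Site name", "/html/body/ul": "news C"},
]
fixed, dynamic = split_regions(versions)
```

With a small M, a region that merely happened not to change during the sampling window is classified as fixed, which is the misjudgment the text warns about.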
Step 2: crawl the configured website and compare pages using the webpage similarity comparison algorithm.
After a page is crawled, its related data is extracted first. The page's URL is then looked up to check whether historical information exists for the page. If it does, the related data of the current page is compared with the historical information to obtain a similarity value, which is compared with the page's similarity baseline to reach a conclusion. The conclusions are collected so that the next step can analyze them together. If the conclusion is that the page was updated, the historical information is replaced with the current related data. The related data of a page comprises: the virtual DOM, the position information of the fixed regions, the position information of the dynamic regions, and the similarity baseline. As shown in Fig. 2, the steps of the webpage similarity comparison algorithm are as follows:
First step, initialization: initialize two sufficiently large queues q1 and q2 for traversing the nodes of the virtual DOMs; initialize a map<string, int> map_tag_affectoi that stores the elements whose influence factor is 2, i.e. the elements of set β; initialize a map<string, int> map_tag_classify that stores the element classification; initialize a sufficiently large integer array array1, each element of which stores the change ratio of one layer, i.e. the inner sum in formula (4); initialize a sufficiently large two-dimensional integer array arrayb that records whether each text set belongs to a fixed region or a dynamic region; initialize two vector<text> v1 and v2 for storing text sets; initialize an integer array array2, each element of which stores the change ratio of one set, i.e. the inner sum in the formula; initialize a double-precision float nu that records the accumulated sum over the changed elements of a layer, i.e. the sum of α_{i,j}·X_{i,j} in the formula; initialize a double-precision float de that records the accumulated sum over all elements of a layer, i.e. the sum of α_{i,j} in the formula. Go to the second step.
Second step: traverse the two virtual DOMs simultaneously, layer by layer. During traversal, add a layer number attribute and a parent node number attribute to each node. Enqueue the two root nodes (when enqueuing, only elements, text, and the src and href attributes are considered). Go to the third step.
Third step, dequeue: if one of q1 and q2 is empty and the other is not, go to the ninth step; if both queues are empty, go to the tenth step; otherwise, dequeue from q1 and q2 to obtain nodes N1 and N2, enqueue the child nodes of N1 and of N2 in order, and go to the fourth step.
Fourth step, compare the two nodes N1 and N2: compare the historical parent node number with the current parent node number (skip the comparison if there is no historical parent node number). If they differ, go to the fifth step; otherwise, go to the sixth step.
Fifth step, compare the two text sets v1 and v2: record the set change ratio in array2 and go to the sixth step. The text set comparison algorithm used in this step is shown in Table 1:
The steps of the text set comparison algorithm are as follows:
Step 1: take the total entry counts textN1 and textN2 of the two text sets. If textN1 == textN2, initialize the updated entry count textU to textN1 and go to step 2; otherwise go to step 10.
Step 2: take the head pointers (forward iterators may be used) and tail pointers (reverse iterators may be used) of v1 and v2, namely oldhead, newhead, oldrear and newrear, and go to step 3.
Step 3: if v1 contains only one entry and the length of that entry is not less than static_len (the static page text length), go to step 9; if oldhead < oldrear and newhead < newrear, go to step 4; otherwise, go to step 11.
Step 4: compare the entries pointed to by oldrear and newrear. If they are identical, mark both entries, decrement oldrear and newrear by one, decrement textU by one, and go to step 3; otherwise, go to step 5.
Step 5: if oldhead < oldrear and newhead < newrear, go to step 6; if oldhead == oldrear and newhead == newrear, go to step 11.
Step 6: compare the entries pointed to by oldhead and newhead. If they are identical, mark both entries, increment oldhead and newhead by one, decrement textU by one, and go to step 5; otherwise, go to step 7.
Step 7: if all entries in v1 are marked or the one-by-one comparison has finished, go to step 11; otherwise, compare the unmarked entries in v1 one by one with the unmarked entries in v2. If two entries are identical, mark both, decrement textU by one, and go to step 7; if they differ, go to step 8.
Step 8: compare the two entries using simple edit distance. If ldmatch(text1, text2) = 1, mark both entries. Go to step 7.
Step 9: use the edit distance algorithm to compute the similarity of the whole page; the whole comparison algorithm ends.
Step 10: determine that the page has been tampered with; the text set comparison algorithm ends.
Step 11: determine that the page has not been tampered with; the text set comparison algorithm ends.
Sixth step, record the layer change ratio: store the texts text1 and text2 of the two nodes in v1 and v2 respectively, and compare the historical layer number with the current layer number (skip the comparison if there is no historical layer number). If they differ, record the structure change ratio of the layer, computed from nu and de, in array1. Go to the seventh step.
Seventh step, compare the important attributes propl1 and propl2: if propl1 and propl2 both exist, are both src or both href, and have identical values, or if neither propl1 nor propl2 exists, go to the eighth step; otherwise go to the ninth step.
Eighth step, compare the two elements tag1 and tag2: if tag1 and tag2 are both non-empty and do not belong to the same element class (according to map_tag_classify), go to the ninth step; otherwise, accumulate nu and de (according to map_tag_affectoi) and go to the third step.
Ninth step: determine that the page has been tampered with; the algorithm ends.
Tenth step: determine that the page has not been tampered with; the algorithm ends.
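The control flow of the third, seventh, eighth, ninth and tenth steps can be sketched as below. This is a deliberately reduced illustration under stated assumptions: it omits the layer change ratios and the text set comparison (fourth to sixth steps), and `Node`, `TAG_CLASS`, `same_class` and `structure_tampered` are hypothetical names. The classification table is an invented example, since the patent leaves the concrete classes to manual configuration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


IMPORTANT = ("src", "href")  # important attribute set Y from the text
# Hypothetical element classification (stands in for map_tag_classify).
TAG_CLASS = {"b": "emphasis", "strong": "emphasis", "i": "emphasis", "em": "emphasis"}


def same_class(t1, t2):
    return t1 == t2 or (t1 in TAG_CLASS and TAG_CLASS.get(t1) == TAG_CLASS.get(t2))


def structure_tampered(old_root, new_root):
    """Walk both trees level by level; return True when tampering is judged."""
    q1, q2 = deque([old_root]), deque([new_root])
    while q1 and q2:
        n1, n2 = q1.popleft(), q2.popleft()
        q1.extend(n1.children)
        q2.extend(n2.children)
        # Seventh step: src/href must both be absent or both present and equal.
        if any(n1.attrs.get(a) != n2.attrs.get(a) for a in IMPORTANT):
            return True
        # Eighth step: elements from different classes mean tampering.
        if not same_class(n1.tag, n2.tag):
            return True
    # Third step: one queue emptying before the other means tampering.
    return bool(q1) or bool(q2)


old = Node("div", children=[Node("a", {"href": "/x"}), Node("b")])
updated = Node("div", children=[Node("a", {"href": "/x"}), Node("strong")])
hijacked = Node("div", children=[Node("a", {"href": "http://evil.example/"}), Node("b")])
```

Here `structure_tampered(old, updated)` is false because `b` and `strong` share a class, while the changed `href` in `hijacked` is reported as tampering.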
The algorithm uses the following parameters and definitions: node, element classification, element variation degree, element influence factor, layer weight, structural similarity, text set, text set variation degree, important attribute, content similarity, and similarity baseline, where:
Node: given a tree T of height H, a node N comprises an element tag, text text and attributes prop. A node N contains at most one element tag, at most one text text, and any number of attributes prop, but cannot be empty. Within the tree, node N is written N_{i,j}, where i is the layer number, with minimum 1 and maximum H, and j is the ordinal position of the node within layer i, with minimum 1 and maximum numN_i, where numN_i is the number of nodes in layer i.
Element classification: the elements commonly found in webpages are classified manually. The classification rule is: elements whose meanings are similar are placed in the same class.
Element variation degree: given two elements tag1 and tag2, the variation degree X is defined as follows. If the two elements are identical, the variation degree is 0. If the two elements differ but belong to the same class, the variation degree is 0.5. If only one of tag1 and tag2 exists, the variation degree is 1. If the two elements differ and do not belong to the same class, the page is directly judged as tampered. The variation degree X can be expressed as:
$$X = \begin{cases} 0, & tag_1 = tag_2 \\ 0.5, & tag_1 \neq tag_2 \text{ and both belong to the same class} \\ 1, & \text{exactly one of } tag_1, tag_2 \text{ exists} \end{cases}$$
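The element variation degree can be written as a small function. This is a sketch under assumptions: `element_variation` is a hypothetical name, a missing element is represented by `None`, the class table is an invented example, and the "directly judged as tampered" case is signalled by returning `None`.

```python
def element_variation(tag1, tag2, tag_class):
    """Return X (0, 0.5 or 1), or None when the change counts as tampering.

    tag_class: dict mapping an element name to its class label
    (a stand-in for the manual element classification).
    """
    if tag1 == tag2:
        return 0.0  # identical elements
    if tag1 is None or tag2 is None:
        return 1.0  # only one of the two elements exists
    c1, c2 = tag_class.get(tag1), tag_class.get(tag2)
    if c1 is not None and c1 == c2:
        return 0.5  # different elements of the same class
    return None  # different classes: judged as tampered


example_classes = {"b": "emphasis", "strong": "emphasis", "p": "block"}
```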
Element influence factor: given an element tag, if tag belongs to set β, the influence factor α of the element is 2; otherwise it is 1. The element influence factor α is expressed as follows:
$$\alpha_{i,j} = \begin{cases} 2, & tag_{i,j} \in \beta \\ 1, & \text{otherwise} \end{cases}$$
where i is the layer number, with minimum 1 and maximum H; j is the ordinal position of the node within layer i, with minimum 1 and maximum numN_i; and set β contains {div, table, form, tr, td} together with the elements that belong to the same class as these elements.
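A short sketch of the influence factor and of the per-layer accumulators nu and de described in the first step. Two assumptions are made: set β is reduced to the five named elements (elements of the same class are left out), and the layer change ratio is taken to be nu / de, which the text implies but does not state explicitly. `BETA_TAGS`, `influence_factor` and `layer_change_ratio` are hypothetical names.

```python
BETA_TAGS = {"div", "table", "form", "tr", "td"}  # same-class elements omitted here


def influence_factor(tag):
    """alpha = 2 for elements in set beta, otherwise 1."""
    return 2 if tag in BETA_TAGS else 1


def layer_change_ratio(changes):
    """changes: list of (tag, X) pairs for the nodes of one layer.

    nu accumulates alpha * X over the layer, de accumulates alpha;
    returning nu / de is an assumption about how the ratio is formed.
    """
    nu = sum(influence_factor(tag) * x for tag, x in changes)
    de = sum(influence_factor(tag) for tag, _ in changes)
    return nu / de if de else 0.0


ratio = layer_change_ratio([("div", 1.0), ("span", 0.0), ("p", 0.5)])
```

In this example nu = 2·1 + 1·0 + 1·0.5 = 2.5 and de = 2 + 1 + 1 = 4, so a change to the `div` weighs twice as much as a change to the `span` or `p`.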
Layer weight: given a tree T of height H, the layer weight W can be expressed as:
where i is the layer number, with minimum 1 and maximum H.
Structural similarity: given two trees T1 and T2, their structural similarity Ss can be expressed as:
Text set: given a text text and a leaf node node that contains it, if node has sibling nodes and those siblings contain texts text1, text2, ..., textn, then text, text1, text2, ..., textn form one text set textS; otherwise text alone forms a text set textS.
Text set variation degree: given two text sets textS1 and textS2 with entry counts textN1 and textN2, let textU be the number of updated entries found by comparing the two sets, ld(text1, text2) the edit distance between two entries, and len(text) the length of a text. A partial modification of an entry is written ldmatch(text1, text2) = 1 and means that textS1 contains text1 and textS2 contains text2 such that ld(text1, text2) > 1/3 max(len(text1), len(text2)) and ld(text1, text2) < 1/2 max(len(text1), len(text2)). When the text set lies in a fixed region: if textN1 = textN2 and textU = 0, the text set variation degree β = 0; otherwise the page is directly judged as tampered. When the text set lies in a dynamic region: if textN1 = textN2 and textU = 0, then β = 0; if textN1 ≠ textN2, the page is directly judged as tampered; if textN1 = textN2, textU ≠ 0 and no pair of entries satisfies ldmatch(text1, text2) = 1, then β = 0.5; if textN1 = textN2, textU ≠ 0 and ldmatch(text1, text2) = 1, then β = 1. The text set variation degree β can be expressed as:
Fixed region:
$$\beta_{i,k} = 0, \quad textN_1 = textN_2 \text{ and } textU = 0$$
Dynamic region:
$$\beta_{i,k} = \begin{cases} 0, & textN_1 = textN_2,\ textU = 0 \\ 0.5, & textN_1 = textN_2,\ textU \neq 0,\ \text{no pair with } ldmatch(text_1, text_2) = 1 \\ 1, & textN_1 = textN_2,\ textU \neq 0,\ ldmatch(text_1, text_2) = 1 \end{cases}$$
where i is the layer number, with minimum 1 and maximum H; k is the ordinal position of the text set within layer i, with minimum 1 and maximum numT_i; and numT_i is the number of text sets in layer i.
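The text set definitions can be tied together in a short sketch. It is an illustration under assumptions, not the patented procedure: `count_updated_entries` is a simplified version of steps 1, 4, 6, 7 and 10 of the text set comparison algorithm (the single-entry branch of step 3 and the edit distance pass of step 8 are left out), the caller supplies `partial_match` because the text does not specify how changed entries are paired, and tampering is signalled by `None`. All function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def ldmatch(t1, t2):
    """True when the distance lies strictly between 1/3 and 1/2 of the longer length."""
    longest = max(len(t1), len(t2))
    return longest / 3 < edit_distance(t1, t2) < longest / 2


def count_updated_entries(old, new):
    """Return textU, or None when the entry counts differ (judged as tampered)."""
    if len(old) != len(new):
        return None
    lo, hi = 0, len(old)
    while hi > lo and old[hi - 1] == new[hi - 1]:  # trim identical tail entries
        hi -= 1
    while lo < hi and old[lo] == new[lo]:  # trim identical head entries
        lo += 1
    remaining = list(new[lo:hi])
    updated = 0
    for entry in old[lo:hi]:  # match leftovers regardless of position
        if entry in remaining:
            remaining.remove(entry)
        else:
            updated += 1
    return updated


def text_set_variation(old, new, fixed, partial_match=False):
    """Return beta (0, 0.5 or 1), or None when the page is judged tampered."""
    updated = count_updated_entries(old, new)
    if updated is None:
        return None
    if updated == 0:
        return 0.0
    if fixed:
        return None  # any change inside a fixed region counts as tampering
    return 1.0 if partial_match else 0.5
```

For a news list that scrolls by one item, `count_updated_entries(["n3", "n2", "n1"], ["n4", "n3", "n2"])` finds a single updated entry, so a dynamic region gets β = 0.5 while the same change in a fixed region is treated as tampering.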
Important attribute: attributes of an element that play an important role in tampering determination are called important attributes. If the text or image is unchanged but an important attribute changes, the page is directly judged as tampered. The important attribute set is Y = {src, href}.
Content similarity: given two trees T1 and T2, their content similarity Sc can be expressed as:
where i is the layer number, w is the ordinal number of a text set within layer i, textS_{i,w}U is the number of changed entries in the text set, textS_{i,w}N is the total number of entries in the text set, and numS_i is the number of text sets in layer i.
Similarity baseline: the structural similarity baseline and the content similarity baseline are given by the following formulas:
Step 3: analyze the conclusions obtained at the multiple nodes.
The reason for deploying at multiple nodes is as follows. If the whole system had only one node, then after the similarity comparison algorithm reached its conclusion we could not tell whether the page was tampered with at the source or on an intermediate link. With multiple nodes, the conclusions of the nodes can be compared to judge further whether the page was tampered with in transit on an intermediate link. Suppose there are n nodes, denoted k1, k2, k3, ..., kn, and the target website server is denoted s. The comparison scheme is:
1) If k1, k2, k3, ..., kn all conclude that the page is not tampered with, the final conclusion is that no traffic tampering has occurred.
2) If some nodes conclude tampering (for example k1 and k2) while the other nodes conclude no tampering, the final conclusion is that traffic tampering exists on the links from s to each of the tampered nodes (k1 and k2). As shown in Fig. 4, if nodes k2 and k3 conclude tampering, the links from s to k2 and k3 have been hijacked.
3) If k1, k2, k3, ..., kn all conclude tampering, the page may have been tampered with at the source, or traffic tampering may exist on every link from s to k1, k2, k3, ..., kn.
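The three cases above map directly onto a small aggregation function. In this sketch the function name, the result labels and the dict-based input are hypothetical choices made for illustration.

```python
def aggregate(conclusions):
    """Combine per-node conclusions into a final verdict.

    conclusions: dict mapping a node name to True if that node
    concluded the page was tampered with, else False.
    Returns (label, tampered_nodes).
    """
    tampered = sorted(node for node, flag in conclusions.items() if flag)
    if not tampered:
        return ("clean", [])  # case 1: no node reports tampering
    if len(tampered) < len(conclusions):
        return ("traffic", tampered)  # case 2: links from s to these nodes
    return ("source_or_traffic", tampered)  # case 3: ambiguous


verdict = aggregate({"k1": False, "k2": True, "k3": True})
```

For the Fig. 4 example, where k2 and k3 report tampering and k1 does not, the verdict is `("traffic", ["k2", "k3"])`. Case 3 stays ambiguous by design, since the text does not give a way to separate source tampering from hijacking of every link.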

Claims (6)

1. A multi-link webpage tampering determination method based on traffic analysis, characterized in that the method comprises the following steps:
Step 1: configure the website rules;
Step 2: capture the webpage at nodes on multiple links, and compare the historical page with the current page using a similarity comparison algorithm to conclude whether the page has been tampered with;
Step 3: collect the conclusions of the multiple link nodes and analyze them together to determine whether the page was tampered with in transit (traffic tampering) or at the source (source tampering).
2. The multi-link webpage tampering determination method based on traffic analysis according to claim 1, characterized in that in Step 1 there are two configuration modes for the website rules:
1) Manual mode: an operator specifies which parts of the target page are fixed regions and which are dynamic regions; the content to configure is: which content sets in the page's DOM tree belong to the fixed regions, and which content sets belong to the dynamic regions;
2) Automatic mode (no manual specification): the first M versions of the page are crawled and consecutive versions are compared pairwise to derive the dynamic and fixed regions; the content to configure is the value of M.
3. The multi-link webpage tampering determination method based on traffic analysis according to claim 1, characterized in that the specific steps of Step 2 are as follows:
After a page is crawled, its related data is extracted first; the page's URL is then looked up to check whether historical information exists for the page; if it does, the related data of the current page is compared with the historical information to obtain a similarity value, which is compared with the page's similarity baseline to reach a conclusion; if the conclusion is that the page was updated, the historical information is replaced with the current related data.
4. The multi-link webpage tampering determination method based on traffic analysis according to claim 3, characterized in that the related data of the page comprises: the virtual DOM, the position information of the fixed regions, the position information of the dynamic regions, and the similarity baseline.
5. The multi-link webpage tampering determination method based on traffic analysis according to claim 1, characterized in that the specific steps of the similarity comparison algorithm are as follows:
First step, initialization: initialize two sufficiently large queues q1 and q2 for traversing the nodes of the virtual DOMs; initialize a map<string, int> map_tag_affectoi that stores the elements whose influence factor is 2; initialize a map<string, int> map_tag_classify that stores the element classification; initialize a sufficiently large integer array array1, each element of which stores the change ratio of one layer; initialize a sufficiently large two-dimensional integer array arrayb that records whether each text set belongs to a fixed region or a dynamic region; initialize two vector<text> v1 and v2 for storing text sets; initialize an integer array array2, each element of which stores the change ratio of one set; initialize a double-precision float nu that records the accumulated sum over the changed elements of a layer; initialize a double-precision float de that records the accumulated sum over all elements of a layer; go to the second step;
Second step: traverse the two virtual DOMs simultaneously, layer by layer; during traversal, add a layer number attribute and a parent node number attribute to each node; enqueue the two root nodes; go to the third step;
Third step, dequeue: if one of q1 and q2 is empty and the other is not, go to the ninth step; if both queues are empty, go to the tenth step; otherwise, dequeue from q1 and q2 to obtain nodes N1 and N2, enqueue the child nodes of N1 and of N2 in order, and go to the fourth step;
Fourth step, compare the two nodes N1 and N2: compare the historical parent node number with the current parent node number; if they differ, go to the fifth step; otherwise, go to the sixth step;
Fifth step, compare the two text sets v1 and v2: record the set change ratio in array2 and go to the sixth step;
Sixth step, record the layer change ratio: store the texts text1 and text2 of the two nodes in v1 and v2 respectively, and compare the historical layer number with the current layer number; if they differ, record the structure change ratio of the layer, computed from nu and de, in array1; go to the seventh step;
Seventh step, compare the important attributes propl1 and propl2: if propl1 and propl2 both exist, are both src or both href, and have identical values, or if neither propl1 nor propl2 exists, go to the eighth step; otherwise go to the ninth step;
Eighth step, compare the two elements tag1 and tag2: if tag1 and tag2 are both non-empty and do not belong to the same element class, go to the ninth step; otherwise, accumulate nu and de and go to the third step;
Ninth step: determine that the page has been tampered with; the algorithm ends;
Tenth step: determine that the page has not been tampered with; the algorithm ends.
6. The multi-link webpage tampering determination method based on traffic analysis according to claim 1, characterized in that in Step 3 the method for judging whether the page was tampered with in transit or at the source is as follows:
Suppose there are n nodes, denoted k1, k2, k3, ..., kn, and the target website server is denoted s; the comparison scheme is:
1) If k1, k2, k3, ..., kn all conclude that the page is not tampered with, the final conclusion is that no traffic tampering has occurred;
2) If some nodes conclude tampering while the other nodes conclude no tampering, the final conclusion is that traffic tampering exists on the links from s to each of the tampered nodes;
3) If k1, k2, k3, ..., kn all conclude tampering, the page may have been tampered with at the source, or traffic tampering may exist on every link from s to k1, k2, k3, ..., kn.
CN201910364169.8A 2019-04-30 2019-04-30 Multilink webpage tampering judging method based on flow analysis Active CN110134901B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910364169.8A CN110134901B (en) 2019-04-30 2019-04-30 Multilink webpage tampering judging method based on flow analysis

Publications (2)

Publication Number Publication Date
CN110134901A true CN110134901A (en) 2019-08-16
CN110134901B CN110134901B (en) 2023-06-16

Family

ID=67575909

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910364169.8A Active CN110134901B (en) 2019-04-30 2019-04-30 Multilink webpage tampering judging method based on flow analysis

Country Status (1)

Country Link
CN (1) CN110134901B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102436564A (en) * 2011-12-30 2012-05-02 奇智软件(北京)有限公司 Method and device for identifying falsified webpage
WO2013097742A1 (en) * 2011-12-30 2013-07-04 北京奇虎科技有限公司 Methods and devices for identifying tampered webpage and identifying hijacked website
US20140359760A1 (en) * 2013-05-31 2014-12-04 Adi Labs, Inc. System and method for detecting phishing webpages
CN104462142A (en) * 2013-09-24 2015-03-25 联想(北京)有限公司 Method and device for searching for content in webpage
CN105205820A (en) * 2015-09-21 2015-12-30 昆明理工大学 Improved characteristic similarity image quality evaluating method
CN106301934A (en) * 2016-08-23 2017-01-04 成都科来软件有限公司 A kind of based on multilink data bag excavation search method and device
CN106656991A (en) * 2016-10-28 2017-05-10 上海百太信息科技有限公司 Network threat detection system and detection method
CN106685936A (en) * 2016-12-14 2017-05-17 深圳市深信服电子科技有限公司 Webpage defacement detection method and apparatus
US20180121558A1 (en) * 2016-11-03 2018-05-03 Institute For Information Industry Webpage data extraction device and webpage data extraction method thereof
CN108021692A (en) * 2017-12-18 2018-05-11 北京天融信网络安全技术有限公司 A kind of method of web page monitored, server and computer-readable recording medium
CN108073828A (en) * 2016-11-16 2018-05-25 阿里巴巴集团控股有限公司 A kind of webpage integrity assurance, apparatus and system
CN109597972A (en) * 2018-12-10 2019-04-09 杭州全维技术股份有限公司 A kind of webpage dynamic change and altering detecting method based on web page frame

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
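Wang Tao et al.: "Detection method for webpage Trojan-implanting attacks based on HTTP session process tracking", Journal of Computer Research and Development (《计算机研究与发展》), 15 October 2012 (2012-10-15) *
Wei Wenhan; Deng Yigui: "Webpage tampering identification model and method based on local variability", Journal of Computer Applications (计算机应用), no. 02 *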

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110995732A (en) * 2019-12-12 2020-04-10 杭州安恒信息技术股份有限公司 Webpage tampering detection method and related device
CN111262842A (en) * 2020-01-10 2020-06-09 恒安嘉新(北京)科技股份公司 Webpage tamper-proofing method and device, electronic equipment and storage medium
CN111262842B (en) * 2020-01-10 2022-09-06 恒安嘉新(北京)科技股份公司 Webpage tamper-proofing method and device, electronic equipment and storage medium
CN114978710A (en) * 2022-05-25 2022-08-30 中国农业银行股份有限公司 Webpage data tamper-proof processing method and device and electronic equipment

Also Published As

Publication number Publication date
CN110134901B (en) 2023-06-16

Similar Documents

Publication Publication Date Title
CN108629633A (en) A kind of method and system for establishing user's portrait based on big data
KR101017016B1 (en) Method, system and computer-readable recording medium for providing information on goods based on image matching
CN112564988B (en) Alarm processing method and device and electronic equipment
CN107578292B (en) User portrait construction system
CN110134901A (en) A kind of multilink webpage tamper determination method based on flow analysis
CN107111625A (en) Realize the method and system of the efficient classification and exploration of data
CN107844533A (en) A kind of intelligent Answer System and analysis method
US9020879B2 (en) Intelligent data agent for a knowledge management system
CN105868366B (en) Based on concept related concept space air navigation aid
CN101251857B (en) System, device and method for information storage and research
CN103226609A (en) Searching method for WEB focus searching system
US20140114949A1 (en) Knowledge Management System
CN111597422A (en) Buried point mapping method and device, computer equipment and storage medium
Qian et al. Mining logical clones in software: Revealing high-level business and programming rules
US9305261B2 (en) Knowledge management engine for a knowledge management system
US9720984B2 (en) Visualization engine for a knowledge management system
Wu et al. Extracting knowledge from web tables based on DOM tree similarity
CN100449534C (en) Information storage and research
CN108549727A (en) User's profit information-pushing method based on web crawlers and big data analysis
Yang et al. Fastpm: An approach to pattern matching via distributed stream processing
CN115470489A (en) Detection model training method, detection method, device and computer readable medium
CN115878877A (en) Concept drift-based visual detection method for access crawler of aviation server
Annam et al. Entropy based informative content density approach for efficient web content extraction
US20130218893A1 (en) Executing in-database data mining processes
Ye et al. Detecting and Partitioning Data Objects in Complex Web Pages

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant