CN102624713A - Website tampering identification method and website tampering identification device - Google Patents

Website tampering identification method and website tampering identification device

Info

Publication number
CN102624713A
Authority
CN
China
Prior art keywords
website
frame structure
structure information
distorted
identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012100491292A
Other languages
Chinese (zh)
Other versions
CN102624713B (en)
Inventor
李艳坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sangfor Technologies Co Ltd
Original Assignee
Sangfor Network Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sangfor Network Technology Shenzhen Co Ltd
Priority to CN201210049129.2A
Publication of CN102624713A
Application granted
Publication of CN102624713B
Legal status: Active
Anticipated expiration

Landscapes

  • Image Analysis (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a website tampering identification method and a website tampering identification device. The method includes the following steps: acquiring the framework structure information of a website page; comparing the acquired framework structure information with server-mirrored framework structure information; and carrying out identification processing according to the comparison result. The invention compares the page framework structure information extracted from the response packets of a Web server with the stored server-mirrored framework structure information for similarity to judge whether the website page has been tampered with, thus improving the effect of tampering identification.

Description

Method and device for website tampering identification
Technical field
The present invention relates to website anti-tampering technology, and in particular to a method and a device for website tampering identification.
Background
Data recently published by the National Internet Emergency Center (CNCERT/CC) state that in September 2011, 2,227 websites within China were tampered with. Broken down by website type, commercial websites were tampered with most often, which poses a significant threat to the property security of Internet users.
Many general tampering identification schemes exist at present. Two of them are the main reliable ones: core-embedded technology, and static tampering identification based on a gateway or bridge.
In core-embedded technology, the tamper detection component runs inside the Web server. When a protected website is published, the tamper detection component computes a unique encrypted watermark for each page. Each time a page is browsed, the current watermark of the page is compared with the watermark computed at publication, so that the website is protected in real time. The drawbacks of this technique are that the tamper detection component must run inside the Web server, which adds work for the administrator, and that a watermark must be computed and compared for every outgoing page, which consumes substantial resources and places a heavy burden on the Web server.
Static tampering identification based on a gateway or bridge usually caches the whole web page and hashes the cached content to obtain its hash value. When the page is browsed, the hash value of the page is computed and compared with the hash value at publication, so that the Web server is protected. For a dynamic website, however, pages change very frequently, so their hash values also change often. Relying on hash comparison is therefore unreliable, increases false positives, and gives unsatisfactory identification results.
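The weakness of whole-page hashing on dynamic pages can be illustrated with a minimal sketch. The hash function (SHA-256), the function name `page_hash`, and the two sample pages are assumptions for illustration only; the text does not name a particular hash function.

```python
import hashlib


def page_hash(html: str) -> str:
    """Return the hash of the whole cached page (SHA-256 is an assumed choice)."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


# Hypothetical published page and a legitimately updated dynamic page.
published = "<html><body><div id='news'>Visitors today: 100</div></body></html>"
served = "<html><body><div id='news'>Visitors today: 101</div></body></html>"

# Only the dynamic text changed, yet the hashes no longer match,
# so a pure hash comparison would flag this page as tampered with.
hash_matches = page_hash(published) == page_hash(served)
```

Here `hash_matches` is false even though the page structure is untouched, which is the kind of false positive the paragraph above describes.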
Summary of the invention
The main purpose of the present invention is to provide a method for website tampering identification that improves the effect of website tampering identification.
The present invention proposes a method for website tampering identification, comprising the steps of:
Acquiring the framework structure information of a website page;
Comparing the acquired framework structure information with server-mirrored framework structure information;
Carrying out identification processing according to the comparison result.
Preferably, the step of acquiring the framework structure information of the website page specifically comprises:
Capturing the data packets with which the server responds to the client, and extracting and saving the web page information that conforms to the framework information library.
Preferably, the step of comparing the acquired framework structure information with the server-mirrored framework structure information specifically comprises:
Performing a similarity calculation on the acquired framework structure information and the server-mirrored framework structure information to obtain a similarity value.
Preferably, the step of carrying out identification processing according to the comparison result specifically comprises:
Comparing the similarity value with a predetermined threshold to judge whether the website page has been tampered with;
When the web page has been tampered with, redirecting the web page and raising an alarm;
When the web page has not been tampered with, releasing the data.
Preferably, the server-mirrored framework structure information is obtained by means of a web crawler.
The present invention also proposes a device for website tampering identification, comprising:
A framework structure extraction unit, used to acquire the framework structure information of a website page;
A similarity comparison unit, used to compare the acquired framework structure information with server-mirrored framework structure information;
An identification processing unit, used to carry out identification processing according to the comparison result.
Preferably, the framework structure extraction unit is specifically used for:
Capturing the data packets with which the server responds to the client, and extracting and saving the web page information that conforms to the framework information library.
Preferably, the similarity comparison unit is specifically used for:
Performing a similarity calculation on the acquired framework structure information and the server-mirrored framework structure information to obtain a similarity value.
Preferably, the identification processing unit specifically comprises:
A comparison and judgment module, used to compare the similarity value with a predetermined threshold and judge whether the website page has been tampered with;
A redirection and alarm module, used to redirect the web page and raise an alarm when the web page has been tampered with;
A release module, used to release the data when the web page has not been tampered with.
Preferably, the server-mirrored framework structure information is obtained by means of a web crawler.
The present invention compares the page framework structure information extracted from the Web server's response data packets with the saved server-mirrored framework structure information for similarity to judge whether the website page has been tampered with, which improves the effect of tampering identification.
Description of the drawings
Fig. 1 is a schematic flowchart of the steps in an embodiment of the website tampering identification method of the present invention;
Fig. 2 is a schematic flowchart of the identification processing step in an embodiment of the website tampering identification method of the present invention;
Fig. 3 is a schematic structural diagram of an embodiment of the website tampering identification device of the present invention;
Fig. 4 is a schematic structural diagram of the identification processing unit in an embodiment of the website tampering identification device of the present invention.
The realization of the purpose, the functional characteristics, and the advantages of the present invention are further described below in connection with the embodiments and with reference to the drawings.
Embodiments
It should be understood that the specific embodiments described here only explain the present invention and are not intended to limit it.
With reference to Fig. 1, an embodiment of the website tampering identification method of the present invention is proposed. The method can comprise:
Step S10, acquiring the framework structure information of a website page;
Step S11, comparing the acquired framework structure information with server-mirrored framework structure information;
Step S12, carrying out identification processing according to the comparison result.
The website anti-tampering system (Website Tamper-Preventing System, WTPS) of this embodiment is deployed between the Web server and the client, and can be configured in gateway mode, bridge mode, and so on.
The framework structure information can be acquired by having the website anti-tampering system capture each data packet with which the server responds to the client, then extract and save the web page information that conforms to the framework information library. The criterion by which the framework information library selects page information for extraction is that the information changes infrequently on the website's pages (for example, dynamic pages).
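One possible reading of "framework structure information" is the tag skeleton of the page with all text content dropped, since text is what changes most often on dynamic pages. The sketch below follows that reading using Python's standard `html.parser`. The class `FrameworkExtractor`, the function `extract_framework`, and the choice to keep only tag names and `id` attributes are assumptions, not the patent's definition of the framework information library.

```python
from html.parser import HTMLParser


class FrameworkExtractor(HTMLParser):
    """Collect a tag skeleton and ignore text (an assumed notion of 'framework')."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # Keep the tag name plus its id, which tends to be stable across page updates.
        attr_map = dict(attrs)
        token = tag
        if attr_map.get("id"):
            token += "#" + attr_map["id"]
        self.tokens.append(token)

    def handle_endtag(self, tag):
        self.tokens.append("/" + tag)


def extract_framework(html):
    """Return the page skeleton as a list of tokens."""
    parser = FrameworkExtractor()
    parser.feed(html)
    parser.close()
    return parser.tokens
```

With this sketch, two versions of a page that differ only in their text yield the same token list, so legitimate content updates do not change the extracted framework.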
Then the website anti-tampering system performs a similarity calculation on the acquired framework structure information and the server-mirrored framework structure information to obtain a similarity value. The server-mirrored framework structure information is obtained by means such as a web crawler. The algorithms used for the similarity calculation can include the Shingle algorithm, the Simhash algorithm, the Bloom filter algorithm, and so on.
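As an illustration of the similarity calculation, the following minimal sketch applies shingling with Jaccard similarity to two token lists. It assumes the framework structure information is already available as a list of string tokens; the function names, the shingle width `k=3`, and the use of Jaccard similarity as the score are illustrative choices, not details given in the text.

```python
def shingles(tokens, k=3):
    """Return the set of k-token windows (shingles) over a token list."""
    if len(tokens) < k:
        return {tuple(tokens)}
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def shingle_similarity(captured, mirror, k=3):
    """Jaccard similarity of the two shingle sets, in the range 0.0 to 1.0."""
    a, b = shingles(captured, k), shingles(mirror, k)
    return len(a & b) / len(a | b)
```

Identical frameworks score 1.0, and the score drops as structural elements are inserted, removed, or reordered, which gives a value that can be compared against a threshold.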
With reference to Fig. 2, step S12 above can specifically comprise:
Step S121, comparing the similarity value with a predetermined threshold to judge whether the website page has been tampered with; when the web page has been tampered with, carrying out step S122; when the web page has not been tampered with, carrying out step S123;
Step S122, redirecting the web page and raising an alarm;
Step S123, releasing the data.
The predetermined threshold can be set according to the specific situation. It can be specified that the website page is judged to have been tampered with when the similarity value is lower than the predetermined threshold. When the web page is judged to have been tampered with, it is redirected and an alarm is raised; when the web page is judged not to have been tampered with, the website page is safe to access and the data can be released.
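Steps S121 to S123 amount to a single threshold comparison. A minimal sketch follows; the `Action` enum, the function name `identify`, and the default threshold of 0.8 are hypothetical, since the text leaves the threshold to be set according to the specific situation.

```python
from enum import Enum


class Action(Enum):
    REDIRECT_AND_ALARM = "redirect_and_alarm"  # step S122
    RELEASE = "release"                        # step S123


def identify(similarity, threshold=0.8):
    """Step S121: a similarity below the threshold is judged as tampering."""
    if similarity < threshold:
        return Action.REDIRECT_AND_ALARM
    return Action.RELEASE
```

A similarity exactly equal to the threshold is released here, because the text only treats values lower than the threshold as tampering.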
The website tampering identification method above addresses the poor performance and poor identification effect of existing identification techniques by proposing a gateway- or bridge-based way of identifying tampering with dynamic and static Web pages. The website anti-tampering system can be deployed between the client and the Web server. The Web server's response data packets are delivered to the client through the website anti-tampering system. The page framework structure information is extracted from the Web server's response data packets, and the corresponding framework structure information is extracted from the saved server mirror (the backup). The two sets of framework structure information are compared for similarity to judge whether the website page has been tampered with, which improves the performance and effect of tampering identification. Because the similarity identification targets the web page framework, it supports tampering identification for any kind of Web page, whether on a static or a dynamic website, and identifies tampering well.
With reference to Fig. 3, an embodiment of a website tampering identification device 20 of the present invention is proposed. The device 20 can comprise a framework structure extraction unit 21, a similarity comparison unit 22, and an identification processing unit 23. The framework structure extraction unit 21 is used to acquire the framework structure information of a website page. The similarity comparison unit 22 is used to compare the acquired framework structure information with server-mirrored framework structure information. The identification processing unit 23 is used to carry out identification processing according to the comparison result.
The framework structure extraction unit 21 is specifically used for capturing each data packet with which the server responds to the client, then extracting and saving the web page information that conforms to the framework information library. The criterion by which the framework information library selects page information for extraction is that the information changes infrequently on the website's pages (for example, dynamic pages).
The similarity comparison unit 22 is specifically used for performing a similarity calculation on the acquired framework structure information and the server-mirrored framework structure information to obtain a similarity value. The server-mirrored framework structure information is obtained by means such as a web crawler. The algorithms used for the similarity calculation can include the Shingle algorithm, the Simhash algorithm, the Bloom filter algorithm, and so on.
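Simhash is one of the algorithms named for the similarity calculation. The sketch below builds a 64-bit Simhash fingerprint over framework tokens and converts the Hamming distance between two fingerprints into a similarity value. The use of MD5 as the per-token hash, equal token weights, and the mapping from distance to similarity are assumptions made for illustration.

```python
import hashlib


def simhash(tokens, bits=64):
    """Compute a Simhash fingerprint (bits must be a multiple of 8, at most 128)."""
    weights = [0] * bits
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:bits // 8], "big")
        for i in range(bits):
            # Each token votes +1 or -1 on every bit position.
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def simhash_similarity(a, b, bits=64):
    """Map the Hamming distance between two fingerprints to a value in 0.0 to 1.0."""
    distance = bin(simhash(a, bits) ^ simhash(b, bits)).count("1")
    return 1.0 - distance / bits
```

Because the fingerprint has a fixed size, the mirror's fingerprint could be stored once and compared cheaply against each response, which may suit an inline gateway or bridge better than storing full shingle sets.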
With reference to Fig. 4, the identification processing unit 23 specifically comprises a comparison and judgment module 231, a redirection and alarm module 232, and a release module 233. The comparison and judgment module 231 is used to compare the similarity value with a predetermined threshold and judge whether the website page has been tampered with. The redirection and alarm module 232 is used to redirect the web page and raise an alarm when the web page has been tampered with. The release module 233 is used to release the data when the web page has not been tampered with.
The predetermined threshold can be set according to the specific situation. It can be specified that the website page is judged to have been tampered with when the similarity value is lower than the predetermined threshold. When the web page is judged to have been tampered with, it is redirected and an alarm is raised; when the web page is judged not to have been tampered with, the website page is safe to access and the data can be released.
The website tampering identification device 20 above addresses the poor performance and poor identification effect of existing identification techniques by proposing a gateway- or bridge-based way of identifying tampering with dynamic and static Web pages. The device 20 can be deployed between the client and the Web server. The Web server's response data packets are delivered to the client through the device 20. The page framework structure information is extracted from the Web server's response data packets, and the corresponding framework structure information is extracted from the saved server mirror (the backup). The two sets of framework structure information are compared for similarity to judge whether the website page has been tampered with, which improves the performance and effect of tampering identification. Because the similarity identification targets the web page framework, it supports tampering identification for any kind of Web page, whether on a static or a dynamic website, and identifies tampering well.
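The way the three units of device 20 fit together can be sketched as a small class that takes the extraction and comparison logic as pluggable functions. Everything named here (`TamperIdentificationDevice`, `tag_names`, `jaccard`, the returned strings, and the 0.8 threshold) is hypothetical, and the regex-based extractor is a deliberately crude stand-in for a real framework extraction unit.

```python
import re
from dataclasses import dataclass
from typing import Callable, Sequence


def tag_names(html: str) -> Sequence[str]:
    """Stand-in for unit 21: keep only opening and closing tag names."""
    return re.findall(r"</?[a-zA-Z][a-zA-Z0-9]*", html)


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """Stand-in for unit 22: Jaccard similarity of the two token sets."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


@dataclass
class TamperIdentificationDevice:
    extract: Callable[[str], Sequence[str]]                    # unit 21
    compare: Callable[[Sequence[str], Sequence[str]], float]   # unit 22
    threshold: float                                           # used by unit 23

    def process(self, response_html: str, mirror_html: str) -> str:
        """Run extraction, comparison, and the identification decision in order."""
        captured = self.extract(response_html)
        mirror = self.extract(mirror_html)
        similarity = self.compare(captured, mirror)
        # Unit 23: release the data, or redirect and raise an alarm.
        return "release" if similarity >= self.threshold else "redirect_and_alarm"
```

For example, `TamperIdentificationDevice(tag_names, jaccard, 0.8).process(response, mirror)` releases a response whose text changed but whose tags did not, and flags one whose body was replaced by injected `iframe` and `script` elements.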
The above are merely preferred embodiments of the present invention and do not thereby limit the scope of its claims. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. the method for identification is distorted in a website, it is characterized in that, comprises step:
Obtain the frame structure information of Website page;
Frame structure information of obtaining and server mirroring frame structure information are compared;
Discern processing according to comparative result.
2. the method for identification is distorted in website according to claim 1, it is characterized in that, the said step of obtaining the frame structure information of Website page specifically comprises:
Grasp server customer in response end data bag, extract the info web and the preservation that meet the frame information storehouse.
3. the method for identification is distorted in website according to claim 1, it is characterized in that, the said step that the frame structure information of obtaining and server mirroring frame structure information are compared specifically comprises:
According to frame structure information of obtaining and server mirroring frame structure information, carry out similarity and calculate, obtain similarity numerical value.
4. the method for identification is distorted in website according to claim 3, it is characterized in that, saidly discerns processed steps according to comparative result and specifically comprises:
Said similarity numerical value and predetermined threshold value are compared, judge whether Website page is distorted;
When website and webpage are distorted, website and webpage are redirected and alarm;
When website and webpage are not distorted, the clearance data.
5. distort the method for identification according to claim 3 or 4 described websites, it is characterized in that said server mirroring frame structure information obtains through the web crawlers mode.
6. the device of identification is distorted in a website, it is characterized in that, comprising:
The frame structure extraction unit is used to obtain the frame structure information of Website page;
The similarity comparing unit is used for frame structure information of obtaining and server mirroring frame structure information are compared;
The identification processing unit is used for discerning processing according to comparative result.
7. the device of identification is distorted in website according to claim 6, it is characterized in that, said frame structure extraction unit specifically is used for:
Grasp server customer in response end data bag, extract the info web and the preservation that meet the frame information storehouse.
8. the device of identification is distorted in website according to claim 6, it is characterized in that, said similarity comparing unit specifically is used for:
According to frame structure information of obtaining and server mirroring frame structure information, carry out similarity and calculate, obtain similarity numerical value.
9. the device of identification is distorted in website according to claim 8, it is characterized in that, said identification processing unit specifically comprises:
Relatively judge module is used for said similarity numerical value and predetermined threshold value are compared, and judges whether Website page is distorted;
Be redirected and alarm module, be used for when website and webpage are distorted, website and webpage being redirected and alarming;
The clearance module is used for when website and webpage are not distorted the clearance data.
According to Claim 8 or 9 described websites distort the device of identification, it is characterized in that said server mirroring frame structure information obtains through the web crawlers mode.
CN201210049129.2A 2012-02-29 2012-02-29 Method and device for website tampering detection Active CN102624713B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210049129.2A CN102624713B (en) 2012-02-29 2012-02-29 The method of website tamper Detection and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210049129.2A CN102624713B (en) 2012-02-29 2012-02-29 The method of website tamper Detection and device

Publications (2)

Publication Number Publication Date
CN102624713A true CN102624713A (en) 2012-08-01
CN102624713B CN102624713B (en) 2016-01-06

Family

ID=46564398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210049129.2A Active CN102624713B (en) 2012-02-29 2012-02-29 The method of website tamper Detection and device

Country Status (1)

Country Link
CN (1) CN102624713B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110167108A1 (en) * 2008-07-11 2011-07-07 Xueli Chen Web page tamper-froof device, method and system
CN102129528A (en) * 2010-01-19 2011-07-20 北京启明星辰信息技术股份有限公司 WEB page tampering identification method and system
CN102316081A (en) * 2010-06-30 2012-01-11 北京启明星辰信息技术股份有限公司 Method and device for identifying similar webpage
CN102176722A (en) * 2011-03-16 2011-09-07 中国科学院软件研究所 Method and system for preventing page tampering based on front-end gateway

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
姚滢: "网页防篡改系统的研究与设计方案", 《计算机安全》, no. 6, 30 June 2010 (2010-06-30) *
阮宏伟 等: "基于快照轮询和文本检测的批量网页防篡改系统", 《广西大学学报(自然科学版)》, vol. 36, no. 1, 31 October 2011 (2011-10-31) *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103812673A (en) * 2012-11-07 2014-05-21 江苏仕德伟网络科技股份有限公司 Method for automatically recognizing multiple IP changes in website
CN103577526A (en) * 2013-08-01 2014-02-12 星云融创(北京)信息技术有限公司 Method and system as well as browser for verifying page modification
CN104008131A (en) * 2014-04-30 2014-08-27 广州市动景计算机科技有限公司 Processing method and device for web page data
CN107204960A (en) * 2016-03-16 2017-09-26 阿里巴巴集团控股有限公司 Web page identification method and device, server
CN105975395A (en) * 2016-05-30 2016-09-28 深圳市华傲数据技术有限公司 Website state reconnaissance method and device
CN107301355A (en) * 2017-06-20 2017-10-27 深信服科技股份有限公司 A kind of webpage tamper monitoring method and device
CN107301355B (en) * 2017-06-20 2021-07-02 深信服科技股份有限公司 Webpage tampering monitoring method and device
CN107566354A (en) * 2017-08-22 2018-01-09 北京小米移动软件有限公司 Web page contents detection method, device and storage medium
CN107835191A (en) * 2017-11-29 2018-03-23 中科信息安全共性技术国家工程研究中心有限公司 A kind of method and apparatus for detecting webpage malicious and distorting
CN108171082B (en) * 2017-12-06 2021-04-30 新华三信息安全技术有限公司 Webpage detection method and device
CN108171082A (en) * 2017-12-06 2018-06-15 新华三信息安全技术有限公司 A kind of webpage detection method and device
CN108021692A (en) * 2017-12-18 2018-05-11 北京天融信网络安全技术有限公司 A kind of method of web page monitored, server and computer-readable recording medium
CN108021692B (en) * 2017-12-18 2022-03-11 北京天融信网络安全技术有限公司 Method for monitoring webpage, server and computer readable storage medium
CN113348655A (en) * 2019-04-11 2021-09-03 深圳市欢太科技有限公司 Anti-hijacking method and device for browser, electronic equipment and storage medium
CN110708292A (en) * 2019-09-11 2020-01-17 光通天下网络科技股份有限公司 IP processing method, device, medium and electronic equipment
CN111159517A (en) * 2019-12-12 2020-05-15 深信服科技股份有限公司 Information processing method, device, system and computer storage medium

Also Published As

Publication number Publication date
CN102624713B (en) 2016-01-06

Similar Documents

Publication Publication Date Title
CN102624713A (en) Website tampering identification method and website tampering identification device
US11126723B2 (en) Systems and methods for remote detection of software through browser webinjects
US9935967B2 (en) Method and device for detecting malicious URL
CN101901221B (en) Method and device for detecting cross site scripting
CN102737183B (en) Method and device for webpage safety access
CN102467633A (en) Method and system for safely browsing webpage
CN103607413B (en) Method and device for detecting website backdoor program
CN110035075A (en) Detection method, device, computer equipment and the storage medium of fishing website
CN104462152A (en) Webpage recognition method and device
US8548917B1 (en) Detection of child frames in web pages
CN103209177A (en) Detection method and device for network phishing attacks
CN107016298B (en) Webpage tampering monitoring method and device
WO2014131306A1 (en) Method and system for detecting network link
CN101539936A (en) Detecting method for sham websites and device thereof
CN107835191A (en) A kind of method and apparatus for detecting webpage malicious and distorting
CN110474889A (en) One kind being based on the recognition methods of web graph target fishing website and device
CN102891861A (en) Client-based phishing website detecting method and device
CN107180194B (en) Method and device for vulnerability detection based on visual analysis system
CN106911635A (en) A kind of method and device of detection website with the presence or absence of backdoor programs
CN101741645A (en) Method, device and system for detecting storage-type cross-site scripting attack and attack detector
CN101901307B (en) Method and device for detecting whether database is attacked by cross-site script
CN111125704B (en) Webpage Trojan horse recognition method and system
CN107995167B (en) Equipment identification method and server
CN108021951A (en) A kind of method of document detection, server and computer-readable recording medium
CN113722641A (en) AI-based injection request protection method, device, terminal equipment and medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20200616

Address after: Nanshan District Xueyuan Road in Shenzhen city of Guangdong province 518000 No. 1001 Nanshan Chi Park building A1 layer

Patentee after: SANGFOR TECHNOLOGIES Inc.

Address before: 518000 Nanshan Science and Technology Pioneering service center, No. 1 Qilin Road, Guangdong, Shenzhen 418, 419,

Patentee before: Shenxin network technology (Shenzhen) Co.,Ltd.