CN107835191A - Method and apparatus for detecting malicious webpage tampering - Google Patents

Method and apparatus for detecting malicious webpage tampering

Info

Publication number
CN107835191A
Authority
CN
China
Prior art keywords
webpage
hash
hash value
modified
similar
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711220764.1A
Other languages
Chinese (zh)
Inventor
方杨森
王彦杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZHONGKE INFORMATION SECURITY COMMON TECHNOLOGY NATIONAL ENGINEERING RESEARCH CENTER Co Ltd
Original Assignee
ZHONGKE INFORMATION SECURITY COMMON TECHNOLOGY NATIONAL ENGINEERING RESEARCH CENTER Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZHONGKE INFORMATION SECURITY COMMON TECHNOLOGY NATIONAL ENGINEERING RESEARCH CENTER Co Ltd
Priority to CN201711220764.1A
Publication of CN107835191A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1466 Active attacks involving interception, injection, modification, spoofing of data unit addresses, e.g. hijacking, packet injection or TCP sequence number attacks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection

Abstract

A method for detecting webpage tampering comprises: scanning the root directory of a website, computing a hash value for each webpage with a similarity hash algorithm, and collecting the generated hash values into a baseline hash library; monitoring write operations on the website directory and, for each modified page, recomputing its hash value with the similarity hash algorithm and retrieving the hash value of the corresponding file from the baseline hash library; comparing the two hash values generated before and after the change and, if their similarity is below a threshold, treating the webpage as modified; and performing feature detection on the modified webpage to determine whether it has been maliciously tampered with. Beneficial effect: the proposed detection method does not need to periodically recompute the fingerprints of the webpages under the website directory, and can detect changes in real time when a modification occurs on the website, which simplifies the procedure and improves the efficiency of webpage tamper detection.

Description

Method and apparatus for detecting malicious webpage tampering
Technical field
The present invention relates to the field of webpage security, and in particular to a method for detecting malicious webpage tampering.
Background
Webpage tampering is a common form of attack. After compromising a website, an attacker often modifies its existing webpages, writing malicious code, spam, or similar content into them. A tampered webpage not only disrupts the normal operation of the website but also spreads malicious code and illegal information to the users who browse it, so the harm is severe.
The method currently in common use for detecting webpage tampering is webpage fingerprint comparison. This method uses a hash function to compute in advance a digital fingerprint of each webpage on the website and collects the fingerprints into a fingerprint library; after a certain interval, it recomputes the fingerprint of each webpage and compares it with the one stored in the library. If the digital fingerprints of the same webpage differ, the webpage is deemed to have been tampered with. However, this method requires the fingerprint library to be built before the website is tampered with, and the library must be updated every time a webpage is created or modified, which is cumbersome and inefficient.
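For comparison with the approach proposed later, the conventional fingerprint method described above can be sketched roughly as follows. This is only an illustration: the patent does not name a hash function, so SHA-256 is an assumption, and the page paths and contents are made up.

```python
import hashlib


def fingerprint(page_bytes: bytes) -> str:
    """Exact digital fingerprint of a page (SHA-256 is an assumed choice)."""
    return hashlib.sha256(page_bytes).hexdigest()


def changed_pages(fingerprint_db: dict[str, str],
                  current_pages: dict[str, bytes]) -> list[str]:
    """Return the paths whose current fingerprint differs from the stored one."""
    return sorted(
        path
        for path, content in current_pages.items()
        if fingerprint_db.get(path) != fingerprint(content)
    )


# Hypothetical site content captured while the site is known to be clean.
baseline = {"index.html": b"<h1>Welcome</h1>", "about.html": b"<p>About us</p>"}
fingerprint_db = {path: fingerprint(body) for path, body in baseline.items()}

# Later scan: a single character changed in one page.
later = {**baseline, "index.html": b"<h1>Welcome!</h1>"}
print(changed_pages(fingerprint_db, later))
```

Because an exact hash changes completely on any edit, this scheme cannot tell a harmless one-character correction from a wholesale replacement, and every legitimate edit forces a library update.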
Summary of the invention
To address the deficiencies of the prior art, the present invention proposes a method for detecting malicious webpage tampering that can quickly and effectively detect whether a webpage has been modified and offers higher security.
A method for detecting webpage tampering, comprising:
scanning the root directory of the website, computing the hash value of each webpage with a similarity hash algorithm, and collecting the generated hash values into a baseline hash library;
monitoring write operations on the website directory and, for each modified page, recomputing the hash value of the modified page with the similarity hash algorithm and retrieving the hash value of the corresponding file from the baseline hash library;
comparing the hash values generated before and after the change and, if their similarity is below a threshold, treating the webpage as modified;
performing feature detection on the modified webpage and determining whether the webpage has been maliciously tampered with.
The similarity hash algorithm is as follows: like other hash algorithms, it generates a unique, fixed-length hash value for a given object; it differs in that, for two objects, the more similar the objects are, the smaller the difference between their generated hash values.
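The patent does not say which similarity-preserving hash it uses. As one possible reading, the sketch below uses SimHash over word tokens with a 64-bit fingerprint and derives a similarity score from the Hamming distance. The tokenizer, the MD5-based token hash, and the 64-bit length are all assumptions made for illustration.

```python
import hashlib
import re

HASH_BITS = 64  # assumed fingerprint length


def _token_hash(token: str) -> int:
    """Stable 64-bit hash of one token (first 8 bytes of its MD5 digest)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def simhash(text: str) -> int:
    """Fixed-length fingerprint; similar texts tend to yield nearby values."""
    weights = [0] * HASH_BITS
    for token in re.findall(r"\w+", text.lower()):
        h = _token_hash(token)
        for bit in range(HASH_BITS):
            # Each token votes +1 or -1 on every bit position.
            weights[bit] += 1 if (h >> bit) & 1 else -1
    value = 0
    for bit, weight in enumerate(weights):
        if weight > 0:
            value |= 1 << bit
    return value


def similarity(a: int, b: int) -> float:
    """1.0 for identical fingerprints, lower as more bits differ."""
    hamming_distance = bin(a ^ b).count("1")
    return 1.0 - hamming_distance / HASH_BITS
```

With this construction a small edit usually flips only a few bits of the fingerprint, while unrelated text differs in roughly half of them, which is what makes a similarity threshold meaningful.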
Meanwhile the invention also provides a kind of device for detecting webpage tamper, the device includes web page crawl unit, calculated Unit and detection unit;
The web page crawl unit, for traveling through site listing, obtain all webpages under website, while monitoring station catalogue Write operation, record the webpage changed;
The computing unit, calculate the cryptographic Hash for the webpage that web page crawl unit obtains using similar hash algorithm and store to base In plinth Hash storehouse;The webpage changed monitored simultaneously for web page crawl unit, is recalculated by the Hash of modification webpage Value, and be compared with the corresponding cryptographic Hash in basic Hash storehouse, calculate the similarity of webpage before and after modification;
The detection unit, obtain and the webpage changed that similarity is less than given threshold is calculated in computing unit, using spy Whether the webpage that sign detection method detection is changed contains malicious code or harmful information.
Further, the web page crawl unit can remove the html tag that webpage includes when crawling webpage, to obtain The content of text of webpage.
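The tag-stripping step can be illustrated with Python's standard `html.parser` module. This is a minimal sketch rather than the patent's implementation; the class and function names are hypothetical. Note that the text inside `<script>` and `<style>` elements is deliberately kept here, on the assumption that injected script content should still influence the page's hash.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of a page and discards the tags themselves."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        stripped = data.strip()
        if stripped:
            self.parts.append(stripped)


def page_text(html: str) -> str:
    """Return the text content of an HTML page with all tags removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


print(page_text("<html><body><h1>Hello</h1><p>World &amp; more</p></body></html>"))
# prints: Hello World & more
```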
The beneficial effects of the technical solution are as follows: the proposed detection method uses a similarity hash algorithm to compute the similarity of webpages and uses that similarity to judge whether a webpage has been tampered with. Compared with the existing webpage fingerprint comparison method, the proposed method does not need to periodically recompute the fingerprints of the webpages under the website directory and can detect changes in real time when a modification occurs on the website, which simplifies the procedure and improves the efficiency of webpage tamper detection.
Detailed description of the embodiments
To help those skilled in the art better understand the technical solution of the present invention, the invention is described in further detail below with reference to a specific embodiment.
A method for detecting webpage tampering, comprising:
scanning the root directory of the website, computing the hash value of each webpage with a similarity hash algorithm, and collecting the generated hash values into a baseline hash library;
monitoring write operations on the website directory and, for each modified page, recomputing the hash value of the modified page with the similarity hash algorithm and retrieving the hash value of the corresponding file from the baseline hash library;
comparing the hash values generated before and after the change and, if their similarity is below a threshold, treating the webpage as modified;
performing feature detection on the modified webpage and determining whether the webpage has been maliciously tampered with.
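The first three steps above can be sketched as two small functions. This is an illustrative outline under several assumptions: the fingerprint and similarity functions are passed in as parameters (any similarity-preserving hash could be plugged in), the file extensions and the 0.9 threshold are arbitrary, and the path of a written file is assumed to come from some file-system watcher that is not shown.

```python
import os
from typing import Callable, Dict, Tuple

HashFn = Callable[[str], int]
SimilarityFn = Callable[[int, int], float]


def _read_page(path: str) -> str:
    with open(path, encoding="utf-8", errors="replace") as fh:
        return fh.read()


def build_baseline(root: str, hash_fn: HashFn,
                   extensions: Tuple[str, ...] = (".html", ".htm")) -> Dict[str, int]:
    """Step 1: walk the site root and store one fingerprint per page."""
    baseline: Dict[str, int] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(extensions):
                path = os.path.join(dirpath, name)
                baseline[path] = hash_fn(_read_page(path))
    return baseline


def is_modified(path: str, baseline: Dict[str, int], hash_fn: HashFn,
                similarity_fn: SimilarityFn, threshold: float = 0.9) -> bool:
    """Steps 2-3: rehash a page that was just written and compare it
    with the stored fingerprint; below the threshold counts as modified."""
    previous = baseline.get(path)
    if previous is None:
        # Assumption: a page with no baseline entry is treated as modified.
        return True
    current = hash_fn(_read_page(path))
    return similarity_fn(previous, current) < threshold
```

A page for which `is_modified` returns `True` would then be handed to the feature-detection step, which decides whether the change is malicious.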
The similarity hash algorithm is as follows: like other hash algorithms, it generates a unique, fixed-length hash value for a given object; it differs in that, for two objects, the more similar the objects are, the smaller the difference between their generated hash values. The algorithm can therefore be used to compare the similarity of two objects quickly.
The invention also provides an apparatus for detecting webpage tampering, comprising a webpage crawling unit, a computing unit, and a detection unit;
the webpage crawling unit traverses the website directory to obtain all webpages under the website, and at the same time monitors write operations on the website directory and records the webpages that are modified;
the computing unit uses the similarity hash algorithm to compute the hash values of the webpages obtained by the webpage crawling unit and stores them in the baseline hash library; for each modified webpage detected by the webpage crawling unit, it recomputes the hash value of the modified webpage, compares it with the corresponding hash value in the baseline hash library, and computes the similarity of the webpage before and after modification;
the detection unit obtains the modified webpages whose similarity, as computed by the computing unit, is below the set threshold, and uses a feature detection method to check whether each modified webpage contains malicious code or harmful information.
Further, the webpage crawling unit may strip the HTML tags contained in a webpage when crawling it, so as to obtain the webpage's text content.
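The patent leaves the detection unit's feature detection method unspecified. One simple reading is signature matching against the raw page source, sketched below. The two signatures are hypothetical examples chosen for illustration and are nowhere near a complete rule set.

```python
import re

# Hypothetical signatures for illustration only; a real deployment would
# maintain a much larger, curated rule set.
SIGNATURES = {
    # An iframe whose width or height attribute starts with 0.
    "hidden_iframe": re.compile(
        r"<iframe[^>]*(?:width|height)\s*=\s*[\"']?0", re.IGNORECASE),
    # eval() applied directly to a decoded string, a common obfuscation.
    "obfuscated_eval": re.compile(
        r"eval\s*\(\s*(?:unescape|atob)\s*\(", re.IGNORECASE),
}


def detect_features(html: str) -> list[str]:
    """Return the names of all signatures found in the page source."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(html)]


def is_maliciously_tampered(html: str) -> bool:
    """A modified page is flagged as malicious if any signature matches."""
    return bool(detect_features(html))
```

Running this only on pages already flagged as modified keeps the more expensive content inspection off the common path, which appears to be the intent of placing the similarity check first.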
The method for detecting webpage tampering provided by the present invention has been described in detail above. Specific examples are used herein to explain the principle and implementation of the application, and the description of the above embodiment is intended only to help in understanding the method of the application and its core idea. Meanwhile, those of ordinary skill in the art may, following the idea of the application, make changes to the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting the application.

Claims (4)

  1. A method for detecting webpage tampering, characterized by comprising:
    scanning the root directory of the website, computing the hash value of each webpage with a similarity hash algorithm, and collecting the generated hash values into a baseline hash library;
    monitoring write operations on the website directory and, for each modified page, recomputing the hash value of the modified page with the similarity hash algorithm and retrieving the hash value of the corresponding file from the baseline hash library;
    comparing the hash values generated before and after the change and, if their similarity is below a threshold, treating the webpage as modified;
    performing feature detection on the modified webpage and determining whether the webpage has been maliciously tampered with.
  2. The method for detecting webpage tampering according to claim 1, characterized in that the similarity hash algorithm is as follows: like other hash algorithms, it generates a unique, fixed-length hash value for a given object; it differs in that, for two objects, the more similar the objects are, the smaller the difference between their generated hash values.
  3. An apparatus for detecting webpage tampering, characterized in that the apparatus comprises a webpage crawling unit, a computing unit, and a detection unit;
    the webpage crawling unit traverses the website directory to obtain all webpages under the website, and at the same time monitors write operations on the website directory and records the webpages that are modified;
    the computing unit uses the similarity hash algorithm to compute the hash values of the webpages obtained by the webpage crawling unit and stores them in the baseline hash library; for each modified webpage detected by the webpage crawling unit, it recomputes the hash value of the modified webpage, compares it with the corresponding hash value in the baseline hash library, and computes the similarity of the webpage before and after modification;
    the detection unit obtains the modified webpages whose similarity, as computed by the computing unit, is below the set threshold, and uses a feature detection method to check whether each modified webpage contains malicious code or harmful information.
  4. The apparatus for detecting webpage tampering according to claim 3, characterized in that the webpage crawling unit may strip the HTML tags contained in a webpage when crawling it, so as to obtain the webpage's text content.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711220764.1A CN107835191A (en) 2017-11-29 2017-11-29 A kind of method and apparatus for detecting webpage malicious and distorting

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711220764.1A CN107835191A (en) 2017-11-29 2017-11-29 A kind of method and apparatus for detecting webpage malicious and distorting

Publications (1)

Publication Number Publication Date
CN107835191A (en) 2018-03-23

Family

ID=61646360

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711220764.1A Pending CN107835191A (en) Method and apparatus for detecting malicious webpage tampering 2017-11-29 2017-11-29

Country Status (1)

Country Link
CN (1) CN107835191A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180323