WO2013026320A1 - Method and system for detecting webpage Trojan embedding - Google Patents

Method and system for detecting webpage Trojan embedding

Info

Publication number
WO2013026320A1
Authority
WO
WIPO (PCT)
Prior art keywords
execution
script
content
webpage
engine
Prior art date
Application number
PCT/CN2012/077469
Other languages
English (en)
French (fr)
Inventor
刘松
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2011-08-25
Filing date
2012-06-25
Publication date
2013-02-28
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2013026320A1
Priority to US14/187,891 (US20140173736A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566 Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/145 Countermeasures against malicious traffic the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms


Abstract

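The present invention is applicable to the field of computer security technology and provides a method and system for detecting webpage Trojan embedding. The method includes: obtaining webpage content; parsing the obtained webpage content and extracting script objects; constructing an object execution engine to simulate execution of the object content of the script objects; and monitoring the simulated execution of the object content and, when abnormal behavior occurs, determining that the object content contains dangerous data. The invention can effectively improve the efficiency of webpage Trojan embedding detection and reduce its missed-detection and false-detection rates.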

Description

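Method and System for Detecting Webpage Trojan Embedding

Priority Statement
This patent application claims priority to Chinese patent application No. 2011102455648, filed on August 25, 2011 by 腾讯科技(深圳)有限公司 and entitled "一种网页挂马检测方法及系统" (Method and System for Detecting Webpage Trojan Embedding), the entire content of which is incorporated into this application by reference.

Technical Field
The present invention belongs to the field of computer security technology, and in particular relates to a method and system for detecting webpage Trojan embedding.

Background
Webpage Trojan embedding refers to an attacker exploiting vulnerabilities, such as those in third-party controls or browsers, to tamper with a webpage and deploy on it dangerous data capable of triggering a vulnerability. When a user browses a Trojan-embedded webpage and the corresponding vulnerability exists in the user's system, the dangerous data contained in the webpage downloads and installs malware on the user's system, gains control of the system, steals user information, and so on, which seriously threatens the security of the user's system. Detecting webpage Trojan embedding is therefore essential.
Existing detection methods mainly build a huge feature database of Trojan-embedded webpages and decide whether a webpage is Trojan-embedded by matching the webpage under test against the features one by one. However, because webpage scripts are easy to transform and encryption methods are many and varied, detection by feature matching is inefficient and has high missed-detection and false-detection rates.

Summary of the Invention
The purpose of the embodiments of the present invention is to provide a method and system for detecting webpage Trojan embedding that improve detection efficiency and reduce the missed-detection and false-detection rates.
The embodiments of the present invention are implemented as a method for detecting webpage Trojan embedding, the method including the following steps:
A. obtaining webpage content;
B. parsing the obtained webpage content and extracting script objects;
C. constructing an object execution engine to simulate execution of the object content of the script objects;
D. monitoring the simulated execution of the object content and, when abnormal behavior occurs, determining that the object content contains dangerous data.
Another purpose of the embodiments of the present invention is to provide a system for detecting webpage Trojan embedding, the system including:
a first obtaining unit, configured to obtain webpage content;
an information extraction unit, configured to parse the obtained webpage content and extract script objects;
an execution unit, configured to construct an object execution engine to simulate execution of the object content of the script objects;
a determining unit, configured to monitor the simulated execution of the object content and, when abnormal behavior occurs, determine that the object content contains dangerous data.
As can be seen from the above technical solution, the embodiments of the present invention can detect Trojan-embedded webpages without a huge feature database of such webpages, which avoids a large amount of feature matching and improves detection efficiency. Moreover, by constructing multiple object execution engines to dynamically simulate execution of the object content of script objects, a webpage can be determined to be Trojan-embedded when abnormal behavior occurs during simulated execution, which effectively reduces the missed-detection and false-detection rates.

Brief Description of the Drawings
FIG. 1 is an implementation flowchart of the webpage Trojan embedding detection method provided by Embodiment 1 of the present invention;
FIG. 2 is an implementation flowchart of the webpage Trojan embedding detection method provided by Embodiment 2 of the present invention;
FIG. 3 is a structural diagram of the webpage Trojan embedding detection system provided by Embodiment 3 of the present invention;
FIG. 4 is a structural diagram of the webpage Trojan embedding detection system provided by Embodiment 4 of the present invention.

Detailed Description
To make the purpose, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
The embodiments of the present invention obtain webpage content, parse the obtained webpage content, extract script objects, construct an object execution engine to simulate execution of the object content of the script objects, monitor the simulated execution of the object content and, when abnormal behavior occurs, determine that the object content contains dangerous data. The embodiments can detect Trojan-embedded webpages without a huge feature database of such webpages, which avoids a large amount of feature matching and improves detection efficiency. Moreover, by constructing multiple object execution engines to dynamically simulate execution of the object content of script objects, a webpage can be determined to be Trojan-embedded when abnormal behavior occurs during simulated execution, which effectively reduces the missed-detection and false-detection rates.
To illustrate the technical solutions of the present invention, specific embodiments are described below.

Embodiment 1: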
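FIG. 1 shows the implementation flow of the webpage Trojan embedding detection method provided by Embodiment 1 of the present invention. The method includes the following steps:
Step S101: obtain webpage content.
In this embodiment, webpage content may be obtained by an existing web crawler. To improve the efficiency of obtaining webpage content, filter conditions may be set in advance so that illegal data types and files exceeding a predetermined size are filtered out of the webpage content.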
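The filter conditions of step S101 can be pictured with a minimal sketch. Everything named here is an assumption made for illustration: the `ALLOWED_TYPES` list, the `MAX_BYTES` limit and the shape of the `resource` argument are hypothetical, since the text only states that disallowed data types and oversized files are filtered out.

```javascript
// Hypothetical filter settings; the text only says that disallowed data
// types and files over a predetermined size are filtered out.
const ALLOWED_TYPES = [
  "text/html",
  "application/xhtml+xml",
  "application/javascript",
  "text/javascript",
];
const MAX_BYTES = 2 * 1024 * 1024;

// resource: { contentType: string, sizeBytes: number } (assumed shape).
function passesFilter(resource) {
  // Strip parameters such as "; charset=utf-8" before comparing.
  const type = String(resource.contentType || "")
    .split(";")[0]
    .trim()
    .toLowerCase();
  if (!ALLOWED_TYPES.includes(type)) return false;
  // Reject resources with an unknown size or a size over the limit.
  return Number.isFinite(resource.sizeBytes) && resource.sizeBytes <= MAX_BYTES;
}
```

A crawler would call `passesFilter` on each response before handing the content to the parser, so that images, archives and very large files never reach the later steps.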
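Step S102: parse the obtained webpage content and extract script objects.
In this embodiment, the obtained webpage content is parsed by an existing webpage parser to extract information such as tags, text and script objects. Webpage content includes many script objects, such as table and title. Dangerous data, however, usually appears in specific script objects, for example: iframe, URL addresses that reference javascript scripts, ActiveX controls (object objects), and javascript code (script objects).
As a preferred embodiment of the present invention, an object feature library is provided that holds the object features of script objects that may contain dangerous data. The obtained webpage content is feature-matched against this library to extract the script objects that may contain dangerous data.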
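As a rough illustration of step S102, the sketch below pulls iframe, script and object elements out of an HTML string. It is not the parser the embodiment refers to: the `FEATURE_TAGS` list is a hypothetical stand-in for the object feature library, and regular expressions are used only to keep the example short (a real HTML parser would be more robust).

```javascript
// Hypothetical "object feature library": tag names that often carry dangerous data.
const FEATURE_TAGS = ["iframe", "script", "object"];

function extractScriptObjects(html) {
  const found = [];
  for (const tag of FEATURE_TAGS) {
    // Match an opening tag and, when present, its body up to the closing tag.
    const re = new RegExp(
      "<" + tag + "\\b([^>]*)>(?:([\\s\\S]*?)<\\/" + tag + "\\s*>)?",
      "gi"
    );
    let m;
    while ((m = re.exec(html)) !== null) {
      // Pull a quoted or unquoted src attribute out of the tag's attribute text.
      const src = /\bsrc\s*=\s*["']?([^"'\s>]+)/i.exec(m[1]);
      found.push({ tag, src: src ? src[1] : null, body: (m[2] || "").trim() });
    }
  }
  return found;
}
```

Results are grouped by tag name rather than by position in the page; elements such as table or title are ignored because they are not in the feature list.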
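Step S103: construct an object execution engine to simulate execution of the object content of the script objects.
In this embodiment, the constructed object execution engine is a virtual machine for script execution. The virtual machine defines a number of script objects and methods that Trojan-embedded webpages can exploit, for example javascript objects and iframe objects. The object content includes, but is not limited to, javascript scripts and ActiveX controls; the object execution engine includes, but is not limited to, a javascript script interpretation engine and an ActiveX control execution engine.
Preferably, constructing an object execution engine to simulate execution of the object content of the script objects includes the following three approaches:
a) Initializing browser objects.
To correctly simulate the process by which a browser executes scripts, basic browser objects must be defined, such as window, document, navigator, location, ... A javascript initialization script:
function CDocument()
{
    // Placeholder property value, kept from the original listing.
    this.elements = "Mozilla";
    // Stub lookup: the emulated method does nothing and returns undefined.
    this.getElementById = function (arg)
    {
    };
}
// Expose the emulated document object to the script under test.
this.document = new CDocument();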
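b) Simulating execution of ActiveX objects.
To detect an anomaly when a Trojan-embedded webpage reaches a script object containing dangerous data, some of the script objects and methods exploited by Trojan-embedded webpages must be redefined. When the webpage executes these redefined script objects and methods, the object execution engine takes over. The process is as follows:
1) Create an empty javascript object.
2) Add the corresponding properties and methods to the object according to its ID (for example, the height and width of a list).
3) When the object calls a vulnerability-triggering function, the javascript script interpretation engine takes over. The engine judges from the parameters in the object (though not only from parameters) whether the object contains dangerous data; if so, it obtains the object's download link.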
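The three steps above can be sketched as follows. The control ID `example.Downloader`, the method name `DownloadAndRun`, the `TRIGGERS` table and the `report` callback are all hypothetical names chosen for illustration; the judgment rule (any argument that looks like an http or https URL) is likewise an assumed simplification of "judging from the parameters".

```javascript
// Hypothetical table: control ID -> methods treated as vulnerability triggers.
const TRIGGERS = { "example.Downloader": ["DownloadAndRun"] };

function createEmulatedControl(id, report) {
  // 1) Create an empty javascript object.
  const control = {};
  // 2) Add properties according to the ID (placeholder values here).
  control.width = 0;
  control.height = 0;
  for (const name of TRIGGERS[id] || []) {
    // 3) The engine takes over when a trigger method is called and
    //    inspects the arguments for a download link.
    control[name] = function (...args) {
      const url = args.find(
        (a) => typeof a === "string" && /^https?:\/\//i.test(a)
      );
      if (url) report({ id, method: name, downloadLink: url });
    };
  }
  return control;
}
```

A script that calls `control.DownloadAndRun("http://example.invalid/x.exe")` never reaches a real control; the call is recorded through `report` together with the link it tried to fetch.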
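c) Capturing redirects: location, location.href, iframe.src, and so on.
To extract the various redirects in a webpage, objects such as location and iframe must be custom-defined and given property interceptors. When the webpage script contains a redirect statement (for example, an assignment to location.href or iframe.src), the interceptor obtains the target link of the redirect.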
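One way to picture the property interceptors is with accessor properties defined through `Object.defineProperty`. This is a sketch under assumptions: `createRedirectRecorder`, `targets` and `createIframe` are hypothetical names, and the source does not say which interception mechanism the engine actually uses.

```javascript
function createRedirectRecorder() {
  const targets = [];

  // Build an object whose listed properties report every assigned value.
  function intercepted(props) {
    const obj = {};
    for (const prop of props) {
      let value = "";
      Object.defineProperty(obj, prop, {
        get() {
          return value;
        },
        set(v) {
          value = String(v);
          targets.push(value); // record the redirect target
        },
        enumerable: true,
      });
    }
    return obj;
  }

  return {
    targets,
    location: intercepted(["href"]),
    createIframe: () => intercepted(["src"]),
  };
}
```

When the emulated script runs `location.href = "http://example.invalid/a"`, the setter stores the value and appends it to `targets`, so the engine ends up with a list of every link the page tried to navigate to or load in a frame.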
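Therefore, the object content that the object execution engine simulates includes both the script objects of the current webpage and the script objects referenced by that webpage, for example the http://***.com referenced by the iframe object in <iframe src=http://***.com width=0 height=0></iframe>.
When the object execution engine finds that a webpage is Trojan-embedded, its source URL can also be captured through the redirect relationships among webpages.
As an embodiment of the present invention, so that the object execution engine can correctly process each extracted script object, the object content of the script object needs to be converted into a language that the object execution engine can recognize.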
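Step S104: monitor the simulated execution of the object content and, when abnormal behavior occurs, determine that the object content contains dangerous data.
In this embodiment, dangerous data refers to data capable of triggering a vulnerability. Abnormal behavior includes, but is not limited to: the memory allocated by the javascript script during execution exceeding a preset threshold or overwriting a specific address, or the control calling a dangerous interface during execution.
As another embodiment of the present invention, after step S103 the method may further include: enumerating, by the object execution engine, all attributes in the text content of the webpage, and detecting whether the attributes have shellcode features.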
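Two of the abnormal behaviors named in step S104, the memory threshold and the dangerous-interface call, can be sketched as a small monitor. The class name `ExecutionMonitor`, its callbacks and its option names are hypothetical, and the check for overwriting a specific address is left out because it needs a lower-level memory model than plain javascript offers.

```javascript
class ExecutionMonitor {
  constructor({ maxAllocatedBytes, dangerousInterfaces }) {
    this.maxAllocatedBytes = maxAllocatedBytes;
    this.dangerous = new Set(dangerousInterfaces);
    this.allocatedBytes = 0;
    this.findings = [];
  }

  // Called by the (hypothetical) engine whenever the script allocates memory.
  onAllocate(bytes) {
    this.allocatedBytes += bytes;
    const alreadyReported = this.findings.some(
      (f) => f.kind === "memory-threshold"
    );
    if (this.allocatedBytes > this.maxAllocatedBytes && !alreadyReported) {
      this.findings.push({ kind: "memory-threshold", total: this.allocatedBytes });
    }
  }

  // Called whenever an emulated control method is invoked.
  onInterfaceCall(name) {
    if (this.dangerous.has(name)) {
      this.findings.push({ kind: "dangerous-interface", name });
    }
  }

  // The object content is treated as dangerous once any finding exists.
  containsDangerousData() {
    return this.findings.length > 0;
  }
}
```

The memory check is what would catch a heap-spray style script that keeps allocating large strings, while the interface check covers controls that call methods on a deny list.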
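In this embodiment, to further improve detection accuracy, after executing the script objects the object execution engine enumerates all attributes in the webpage text and performs shellcode detection on them using the x86 emulator and GetPC heuristic provided by the open-source library libemu.
For example, for <iframe src=http://***.com width=0 height=0>, the width and height attributes are checked with libemu's x86 emulator and GetPC heuristic. When the width and height attribute values are detected to be 0, the attribute is considered to have shellcode features, the webpage containing it may be Trojan-embedded, and a warning should be sent to the user promptly.
The added shellcode detection makes it possible to determine more accurately and quickly whether a webpage is Trojan-embedded.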
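The attribute enumeration can be illustrated with a deliberately simpler heuristic than the one the embodiment names. The sketch below does not use libemu or GetPC emulation; as a stated substitute it decodes `%uXXXX` escapes in each attribute value and flags long runs of 0x90 (x86 NOP) bytes. The function names and the `minRun` default are assumptions.

```javascript
// Decode "%uXXXX" escapes (as used by unescape-style payloads) into bytes.
function decodePercentU(text) {
  const bytes = [];
  const re = /%u([0-9a-fA-F]{4})/g;
  let m;
  while ((m = re.exec(text)) !== null) {
    const word = parseInt(m[1], 16);
    bytes.push(word & 0xff, word >> 8); // little-endian byte order
  }
  return bytes;
}

// Flag a run of at least minRun consecutive 0x90 (x86 NOP) bytes.
function hasNopSled(bytes, minRun = 16) {
  let run = 0;
  for (const b of bytes) {
    run = b === 0x90 ? run + 1 : 0;
    if (run >= minRun) return true;
  }
  return false;
}

// attributes: plain object mapping attribute names to values (assumed shape).
function suspiciousAttributes(attributes, minRun = 16) {
  return Object.keys(attributes).filter((name) =>
    hasNopSled(decodePercentU(String(attributes[name])), minRun)
  );
}
```

A NOP-sled check misses shellcode without padding and can fire on benign data, which is why an emulation-based detector is the stronger choice where one is available.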
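In the embodiments of the present invention, webpage content is obtained and parsed, script objects are extracted, an object execution engine is constructed to simulate execution of the object content of the script objects, the simulated execution is monitored and, when abnormal behavior occurs, the object content is determined to contain dangerous data. The embodiments can detect Trojan-embedded webpages without a huge feature database of such webpages, which avoids a large amount of feature matching and improves detection efficiency. Moreover, by constructing multiple object execution engines to dynamically simulate execution of the object content of script objects, together with shellcode detection on the webpage, whether a script object behaves abnormally is judged from several angles, for example: whether the memory allocated by the javascript script during execution exceeds a preset threshold or overwrites a specific address, whether the control calls a dangerous interface during execution, and whether the attribute values or parameter values of the object content are abnormal. This effectively reduces the missed-detection and false-detection rates for Trojan-embedded webpages.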
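Embodiment 2:
FIG. 2 shows the implementation flow of the webpage Trojan embedding detection method provided by Embodiment 2 of the present invention. This embodiment adds step S201 to Embodiment 1; the other steps S202 to S205 are identical to steps S101 to S104 of Embodiment 1.
In step S201, the URL links associated with the script objects in the webpage currently being detected are obtained.
In this embodiment, the aim is to further protect system security and make the detection more practical and effective. When URL links associated with script objects in the webpage being detected exist, all URL links associated with those script objects must be obtained, and the same steps as in Embodiment 1 are executed recursively on the associated URL links to determine whether they contain script objects with dangerous data.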
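The recursive step can be sketched as below. `fetchPage` and `detect` are hypothetical callbacks standing in for steps S101 to S104; the `visited` set and the `maxDepth` limit are additions assumed here to keep the recursion finite when pages link to each other, and are not described in the source.

```javascript
// fetchPage(url) -> page content; detect(content) -> { dangerous, links }.
// Both are hypothetical callbacks standing in for steps S101 to S104.
function scanRecursively(url, fetchPage, detect, maxDepth = 3, visited = new Set()) {
  // Stop on pages already scanned or when the depth budget is used up.
  if (visited.has(url) || maxDepth < 0) return [];
  visited.add(url);

  const result = detect(fetchPage(url));
  const flagged = result.dangerous ? [url] : [];

  // Apply the same detection to every URL associated with the script objects.
  for (const link of result.links) {
    flagged.push(
      ...scanRecursively(link, fetchPage, detect, maxDepth - 1, visited)
    );
  }
  return flagged;
}
```

The function returns the URLs found to contain dangerous data, in depth-first order, so a clean landing page that frames a malicious one is still reported through its associated link.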
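Embodiment 3:
FIG. 3 shows the structure of the webpage Trojan embedding detection system provided by Embodiment 3 of the present invention. For ease of description, only the parts relevant to the embodiment are shown.
The detection system may be a software unit, a hardware unit, or a combined software and hardware unit running within each application system.
The detection system includes a first obtaining unit 31, an information extraction unit 32, an execution unit 33 and a determining unit 34. The specific functions of the units are as follows:
The first obtaining unit 31 is configured to obtain webpage content.
The information extraction unit 32 is configured to parse the obtained webpage content and extract script objects. The information extraction unit 32 further includes an information extraction module 321, which is configured to perform feature matching on the obtained webpage content according to the object features of script objects that may contain dangerous data, and to extract the script objects that may contain dangerous data.
The execution unit 33 is configured to construct an object execution engine to simulate execution of the object content of the script objects.
The determining unit 34 is configured to monitor the simulated execution of the object content and, when abnormal behavior occurs, determine that the object content contains dangerous data.
In this embodiment, the object content includes javascript scripts and ActiveX controls; the object execution engine includes a javascript script interpretation engine and an ActiveX control execution engine; and the abnormal behavior includes the memory allocated by the javascript script during execution exceeding a preset threshold or overwriting a specific address, or the control calling a dangerous interface during execution.
As another embodiment of the present invention, to further improve detection accuracy, the system may further include a detection unit 35, configured to enumerate, through the object execution engine, all attributes in the text content of the webpage and detect whether the attributes have shellcode features.
The detection system provided by this embodiment can be used with the corresponding detection method described above. For details, see the description of Embodiment 1 of the method, which is not repeated here.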
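The way units 31 to 34 fit together can be sketched as a simple composition of functions. This is an assumed software-only reading of the structure in FIG. 3: the factory name `createDetectionSystem` and the four callback names are hypothetical, and each unit is reduced to a plain function supplied by the caller.

```javascript
// Each unit is a plain function supplied by the caller (all hypothetical).
function createDetectionSystem({ obtain, extract, execute, determine }) {
  return function detect(url) {
    const content = obtain(url); // first obtaining unit 31
    const scriptObjects = extract(content); // information extraction unit 32
    const traces = scriptObjects.map((o) => execute(o)); // execution unit 33
    // determining unit 34: dangerous if any trace shows abnormal behavior
    return traces.some((t) => determine(t));
  };
}
```

Keeping the units as separate callbacks mirrors the unit boundaries in the description and lets a detection unit such as unit 35 be added as one more stage without touching the others.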
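Embodiment 4:
FIG. 4 shows the structure of the webpage Trojan embedding detection system provided by Embodiment 4 of the present invention. For ease of description, only the parts relevant to the embodiment are shown.
The detection system may be a software unit, a hardware unit, or a combined software and hardware unit running within each application system.
To further protect system security and make the detection more practical and effective, this detection system adds a second obtaining unit 41 to the system of Embodiment 3. The second obtaining unit 41 is configured to obtain the URL links associated with the script objects in the webpage currently being detected, and the system described in Embodiment 3 is used to detect whether the webpage content pointed to by those URL links contains dangerous data.
The detection system provided by this embodiment can be used with the corresponding detection method described above. For details, see the description of Embodiment 2 of the method, which is not repeated here.
In the embodiments of the present invention, webpage content is obtained and parsed, script objects are extracted, an object execution engine is constructed to simulate execution of the object content of the script objects, the simulated execution is monitored and, when abnormal behavior occurs, the object content is determined to contain dangerous data. The embodiments can detect Trojan-embedded webpages without a huge feature database of such webpages, which avoids a large amount of feature matching and improves detection efficiency. Moreover, by constructing multiple object execution engines to dynamically simulate execution of the object content of script objects, together with shellcode detection on the webpage, whether a script object behaves abnormally is judged from several angles, for example: whether the memory allocated by the javascript script during execution exceeds a preset threshold or overwrites a specific address, whether the control calls a dangerous interface during execution, and whether the attribute values or parameter values of the object content are abnormal. This effectively reduces the missed-detection and false-detection rates for Trojan-embedded webpages. In addition, to further protect system security and make the detection more practical and effective, when URL links associated with the current script object exist, all URL links associated with the current script object must be obtained, and the same detection steps as in Embodiment 1 are executed recursively on the associated URL links to determine whether they contain script objects with dangerous data.
A person of ordinary skill in the art will understand that all or part of the flows in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the flows of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.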

Claims

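1. A method for detecting webpage Trojan embedding, characterized in that the method includes:
obtaining webpage content;
parsing the obtained webpage content and extracting script objects;
constructing an object execution engine to simulate execution of the object content of the script objects;
monitoring the simulated execution of the object content and, when abnormal behavior occurs, determining that the object content contains dangerous data.
2. The method of claim 1, characterized in that extracting script objects further includes:
performing feature matching on the obtained webpage content according to the object features of script objects that may contain dangerous data, and extracting the script objects that may contain dangerous data.
3. The method of claim 1, characterized in that constructing an object execution engine to simulate execution of the object content of the script objects is implemented in any one of the following three ways:
initializing browser objects;
simulating execution of ActiveX objects;
capturing redirects.
4. The method of claim 1 or 3, characterized in that the object content includes javascript scripts and ActiveX controls;
the object execution engine includes a javascript script interpretation engine and an ActiveX control execution engine;
the abnormal behavior includes the memory allocated by the javascript script during execution exceeding a preset threshold, or overwriting a specific address, or the control calling a dangerous interface during execution.
5. The method of claim 1, characterized in that the method further includes:
obtaining the URL links associated with the script objects, and detecting, by recursively executing the method of claim 1, whether the webpage content pointed to by the URL links contains dangerous data.
6. The method of claim 1 or 3, characterized in that, after constructing an object execution engine to simulate execution of the object content of the script objects, the method further includes:
enumerating, by the object execution engine, all attributes in the text content of the webpage, and detecting whether the attributes have shellcode features.
7. A system for detecting webpage Trojan embedding, characterized in that the system includes:
a first obtaining unit, configured to obtain webpage content;
an information extraction unit, configured to parse the obtained webpage content and extract script objects;
an execution unit, configured to construct an object execution engine to simulate execution of the object content of the script objects;
a determining unit, configured to monitor the simulated execution of the object content and, when abnormal behavior occurs, determine that the object content contains dangerous data.
8. The system of claim 7, characterized in that the information extraction unit further includes:
an information extraction module, configured to perform feature matching on the obtained webpage content according to the object features of script objects that may contain dangerous data, and to extract the script objects that may contain dangerous data.
9. The system of claim 7, characterized in that the execution unit is configured to construct an object execution engine to simulate execution of the object content of the script objects in any one of the following three ways:
initializing browser objects;
simulating execution of ActiveX objects;
capturing redirects.
10. The system of claim 7 or 9, characterized in that the object content includes javascript scripts and ActiveX controls, the object execution engine includes a javascript script interpretation engine and an ActiveX control execution engine, and the abnormal behavior includes the memory allocated by the javascript script during execution exceeding a preset threshold, or overwriting a specific address, or the control calling a dangerous interface during execution.
11. The system of claim 7 or 9, characterized in that the system further includes:
a second obtaining unit, configured to obtain the URL links associated with the script objects, wherein the first obtaining unit, the information extraction unit, the execution unit and the determining unit detect whether the webpage content pointed to by the URL links contains dangerous data.
12. The system of claim 7 or 9, characterized in that the system further includes:
a detection unit, configured to enumerate, through the object execution engine, all attributes in the text content of the webpage and detect whether the attributes have shellcode features.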
PCT/CN2012/077469 2011-08-25 2012-06-25 Method and system for detecting webpage Trojan embedding WO2013026320A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/187,891 US20140173736A1 (en) 2011-08-25 2014-02-24 Method and system for detecting webpage Trojan embedded

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2011102455648A CN102955913A (zh) 2011-08-25 2011-08-25 Method and system for detecting webpage Trojan embedding
CN201110245564.8 2011-08-25

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/187,891 Continuation US20140173736A1 (en) 2011-08-25 2014-02-24 Method and system for detecting webpage Trojan embedded

Publications (1)

Publication Number Publication Date
WO2013026320A1 (zh) 2013-02-28

Family

ID=47745909

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/077469 WO2013026320A1 (zh) 2011-08-25 2012-06-25 Method and system for detecting webpage Trojan embedding

Country Status (3)

Country Link
US (1) US20140173736A1 (zh)
CN (1) CN102955913A (zh)
WO (1) WO2013026320A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11122316B2 (en) 2009-07-15 2021-09-14 Time Warner Cable Enterprises Llc Methods and apparatus for targeted secondary content insertion
US11212593B2 (en) * 2016-09-27 2021-12-28 Time Warner Cable Enterprises Llc Apparatus and methods for automated secondary content management in a digital network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
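CN1924866A (zh) * 2006-09-28 2007-03-07 北京理工大学 Method for detecting malicious webpage scripts based on statistical features
CN101159000A (zh) * 2007-10-17 2008-04-09 深圳市迅雷网络技术有限公司 System and method for detecting webpage security information
CN101364988A (zh) * 2008-09-26 2009-02-11 深圳市迅雷网络技术有限公司 Method and device for determining webpage security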

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
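CN101562618B (zh) * 2009-04-08 2012-03-28 深圳市腾讯计算机系统有限公司 Method and device for detecting web Trojans
CN101964026A (zh) * 2009-07-23 2011-02-02 中联绿盟信息技术(北京)有限公司 Method and system for detecting webpage Trojan embedding
CN102043919B (zh) * 2010-12-27 2012-11-21 北京安天电子设备有限公司 Generic vulnerability detection method and system based on a script virtual machine
CN102088379B (zh) * 2011-01-24 2013-03-13 国家计算机网络与信息安全管理中心 Method and device for detecting malicious webpage code with a client honeypot based on sandbox technology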

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG, HUILIN ET AL.: "Detection of drive-by downloads based on dynamic page views", JOURNAL OF TSINGHUA UNIVERSITY (SCIENCE AND TECHNOLOGY), vol. 49, no. S2, December 2009 (2009-12-01), pages 2126 - 2132 *

Also Published As

Publication number Publication date
US20140173736A1 (en) 2014-06-19
CN102955913A (zh) 2013-03-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12825196

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 31/07/2014)

122 Ep: pct application non-entry in european phase

Ref document number: 12825196

Country of ref document: EP

Kind code of ref document: A1