CN115967559A - Webpage monitoring method and device and baseline data construction method and device - Google Patents
- Publication number
- CN115967559A (application number CN202211656344.9A)
- Authority
- CN
- China
- Prior art keywords
- operation log
- data
- webpage
- baseline
- baseline data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Debugging And Monitoring (AREA)
Abstract
The present application provides a webpage monitoring method and device, and a baseline data construction method and device. The webpage monitoring method includes: when a webpage operation log for a target website is obtained, judging, based on a constructed baseline data set, whether the webpage operation log satisfies an alarm condition; and, when the alarm condition is determined to be satisfied, generating alarm information for the target website. The application monitors the webpages of the target website without tracking changes in webpage content, which reduces the business traffic to the webpages.
Description
Technical Field
The present application relates to the technical field of computer network security, and in particular to a webpage monitoring method and device, and a baseline data construction method and device.
Background
With the spread and development of the Internet and network applications, hacker attacks have multiplied, especially attacks on websites. Tampering with webpage content is a common attack technique.
How to monitor whether a website's webpages have been tampered with has therefore become a technical problem that urgently needs solving. In the related art, tampering is detected by detecting changes in the website's webpage content. Detecting such changes requires periodically sending requests to the website to fetch the webpage content, which increases the business traffic to the webpages.
Summary of the Invention
To overcome the problems in the related art, the present application provides a webpage monitoring method and device, and a baseline data construction method and device, so that webpages can be monitored without increasing their business traffic.
A first aspect of the present application provides a webpage monitoring method, including:
when a webpage operation log for a target website is obtained, judging, based on a constructed baseline data set, whether the webpage operation log satisfies an alarm condition, wherein the baseline data set is constructed from webpage operation logs;
when the alarm condition is determined to be satisfied, generating alarm information for the target website.
Optionally, the method further includes:
judging whether the webpage operation log belongs to baseline data, and if so, writing the webpage operation log into the baseline data set in a preset writing manner;
if not, performing the step of judging, based on the constructed baseline data set, whether the webpage operation log satisfies the alarm condition, and all subsequent steps.
Optionally, judging whether the webpage operation log belongs to baseline data includes:
judging whether the webpage operation log contains index data that is in the baseline data set, and if not, determining that the webpage operation log belongs to baseline data;
if so, taking the index data in the baseline data set that the webpage operation log contains as target index data, and judging whether the operation time in the webpage operation log falls within the baseline time range corresponding to the target index data; if it falls within the baseline time range, determining that the webpage operation log belongs to baseline data; if it does not, determining that the webpage operation log does not belong to baseline data.
Optionally, judging, based on the baseline data set, whether the webpage operation log satisfies the alarm condition includes:
judging whether the feature data in the webpage operation log is consistent with the target feature data corresponding to the target index data in the baseline data set, and if not, determining that the alarm condition is satisfied;
if consistent, judging whether the feature data of the webpage operation log is periodic data in the baseline data set, where periodic data means that the target feature data repeats periodically within the operation time range corresponding to the target index data;
if it is not periodic data, determining that the alarm condition is satisfied.
Optionally, writing the webpage operation log into the baseline data set in the preset writing manner includes:
writing the webpage operation log into the baseline data set such that the index data serves as an index under which the corresponding operation time and feature data are recorded.
A second aspect of the present application provides a baseline data construction method, including:
when a webpage operation log for a target website is obtained, judging whether the webpage operation log contains index data that is in the baseline data set, and if not, determining that the webpage operation log belongs to baseline data;
if so, taking the index data in the baseline data set that the webpage operation log contains as target index data, and judging whether the operation time in the webpage operation log falls within the baseline time range corresponding to the target index data; if it falls within the baseline time range, determining that the webpage operation log belongs to baseline data; if it does not, determining that the webpage operation log does not belong to baseline data;
writing the webpage operation logs that belong to baseline data into the baseline data set in a preset writing manner.
A third aspect of the present application provides a webpage monitoring device, including:
an alarm judging unit, configured to judge, when a webpage operation log for a target website is obtained and based on a constructed baseline data set, whether the webpage operation log satisfies an alarm condition, wherein the baseline data set is constructed from webpage operation logs;
a first generating unit, configured to generate alarm information for the target website when the alarm condition is determined to be satisfied.
A fourth aspect of the present application provides a baseline data construction device, including:
a data judging unit, configured to judge, when a webpage operation log for a target website is obtained, whether the webpage operation log contains index data that is in the baseline data set;
a first determining unit, configured to determine that the webpage operation log belongs to baseline data when the webpage operation log does not contain index data in the baseline data set;
a second determining unit, configured to take the index data in the baseline data set that the webpage operation log contains as target index data when the webpage operation log contains index data in the baseline data set;
a range judging unit, configured to judge whether the operation time in the webpage operation log falls within the baseline time range corresponding to the target index data;
a third determining unit, configured to determine that the webpage operation log belongs to baseline data when the operation time in the webpage operation log falls within the baseline time range corresponding to the target index data;
a fourth determining unit, configured to determine that the webpage operation log does not belong to baseline data when the operation time in the webpage operation log does not fall within the baseline time range corresponding to the target index data;
a data writing unit, configured to write the webpage operation logs that belong to baseline data into the baseline data set in a preset writing manner.
A fifth aspect of the present application provides an electronic device, including:
a processor; and
a memory storing executable code which, when executed by the processor, causes the processor to perform the method described above.
A sixth aspect of the present application provides a non-transitory machine-readable storage medium storing executable code which, when executed by a processor of an electronic device, causes the processor to perform the method described above.
As can be seen, in the webpage monitoring method provided by the present application, when a webpage operation log for the target website is obtained, whether the log satisfies the alarm condition is judged based on the constructed baseline data set, and alarm information for the target website is generated when the alarm condition is determined to be satisfied. The application can therefore monitor the webpages of the target website using webpage operation logs and the baseline data set, without tracking changes in webpage content, which reduces the business traffic to the webpages.
Further, monitoring webpages through webpage operation logs makes it possible to identify tampering that is in progress or has already occurred, which improves the accuracy of determining tampering.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the present application.
Brief Description of the Drawings
The above and other objects, features and advantages of the present application will become more apparent from the more detailed description of its exemplary embodiments given with reference to the accompanying drawings, in which the same reference numerals generally denote the same components.
Fig. 1 is a schematic flowchart of a webpage monitoring method according to one method embodiment of the present application;
Fig. 2 is a schematic flowchart of a webpage monitoring method according to another method embodiment of the present application;
Fig. 3 is a schematic flowchart of a baseline data construction method according to one method embodiment of the present application;
Fig. 4 is a schematic structural diagram of a webpage monitoring device according to one device embodiment of the present application;
Fig. 5 is a schematic structural diagram of a webpage monitoring device according to another device embodiment of the present application;
Fig. 6 is a schematic structural diagram of a baseline data construction device according to one device embodiment of the present application;
Fig. 7 is a schematic structural diagram of an electronic device according to one device embodiment of the present application.
Detailed Description
Preferred embodiments of the present application are described in more detail below with reference to the accompanying drawings. Although the drawings show preferred embodiments, it should be understood that the present application may be implemented in various forms and should not be limited to the embodiments set forth here. Rather, these embodiments are provided so that the present application will be thorough and complete, and will fully convey its scope to those skilled in the art.
The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to limit the application. The singular forms "a", "said" and "the" used in the present application and the appended claims are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used here refers to and includes any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms "first", "second", "third" and so on may be used in the present application to describe various information, the information should not be limited by these terms. The terms are used only to distinguish information of the same type from one another. For example, without departing from the scope of the present application, first information may also be called second information, and similarly second information may also be called first information. A feature qualified by "first" or "second" may thus explicitly or implicitly include one or more such features. In the description of the present application, "a plurality of" means two or more, unless specifically defined otherwise.
One method embodiment of the present application provides a webpage monitoring method. The method may be applied in a terminal that communicates with the server of the target website, or in the server of the target website itself. The target website is a website whose webpages need to be monitored for tampering.
The method may include the following steps:
Step 101: when a webpage operation log for the target website is obtained, judge, based on the constructed baseline data set, whether the webpage operation log satisfies the alarm condition; if so, go to step 102; if not, end the monitoring of the current webpage operation log.
In the present application, one webpage operation log records one operation performed by a host on the target website, and contains data such as the file path, file operation mode, process path, operation time and host identifier.
The file path includes the file directory and the file name, and identifies the file that was operated on.
The file operation mode identifies the kind of operation performed on the file, for example delete, create, modify or rename.
The process path identifies the process used to perform the operation.
The operation time identifies when the operation took place.
The host identifier identifies the host ID and/or host IP of the host that performed the operation.
Optionally, the webpage operation log may also contain a server identifier. The server identifier may be a custom identifier that technicians assign to the server, and it indicates which server the webpage operation log came from.
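The log fields listed above can be pictured as a small record type. The following is a minimal sketch only: the class name `WebOperationLog`, its field names, and the sample values are assumptions made for illustration, since the text names the kinds of data in a log but not a concrete format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
import posixpath


@dataclass(frozen=True)
class WebOperationLog:
    """One operation performed by a host on the target website.

    Field names are hypothetical; the text only lists the kinds of data.
    """
    file_path: str            # file directory plus file name
    file_operation: str       # e.g. "delete", "create", "modify", "rename"
    process_path: str         # process used to perform the operation
    operation_time: datetime  # when the operation took place
    host_id: str              # host ID and/or host IP of the operating host
    server_id: Optional[str] = None  # optional custom server identifier

    @property
    def file_directory(self) -> str:
        # Directory part of the file path.
        return posixpath.dirname(self.file_path)

    @property
    def file_name(self) -> str:
        # File-name part of the file path.
        return posixpath.basename(self.file_path)


log = WebOperationLog(
    file_path="/var/www/site/index.html",
    file_operation="modify",
    process_path="/usr/sbin/nginx",
    operation_time=datetime(2022, 11, 15, 20, 0, 0),
    host_id="host-A",
)
print(log.file_directory, log.file_name)
```

Splitting the path with `posixpath` assumes POSIX-style paths; a deployment that logs Windows paths would need a different split.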
The present application constructs a baseline data set. The baseline data set is built from webpage operation logs and defines the type of the index data and the type of the feature data under the index data, so that baseline data can be generated from webpage operation logs and written into the baseline data set. Whether the currently obtained webpage operation log satisfies the alarm condition can therefore be judged based on the constructed baseline data set. If it does not, monitoring of the current webpage operation log ends, and the method is performed again on the next webpage operation log when one is obtained. If it does, the method proceeds to step 102.
Step 102: generate alarm information for the target website.
The alarm information may be directed at the host identifier in the webpage operation log, such as the host ID and/or host address, so that the host where tampering occurred can be located quickly and the response after tampering is faster.
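As a rough illustration of alarm information that points at the host recorded in the log, the sketch below builds a plain dictionary. The function `build_alarm`, the dictionary layout, and every field beyond the host ID and host address are hypothetical; the text does not specify an alarm format.

```python
from datetime import datetime


def build_alarm(log: dict, reason: str) -> dict:
    """Build alarm information keyed on the host recorded in the log.

    The layout is an assumption: the text only says the alarm may target
    the host ID and/or host address found in the webpage operation log.
    """
    return {
        "target_host_id": log.get("host_id"),
        "target_host_ip": log.get("host_ip"),
        "file_path": log.get("file_path"),
        "file_operation": log.get("file_operation"),
        "operation_time": log["operation_time"].isoformat(sep=" "),
        "reason": reason,
    }


alarm = build_alarm(
    {
        "host_id": "host-A",
        "host_ip": "192.0.2.10",  # documentation-only address
        "file_path": "/var/www/site/index.html",
        "file_operation": "modify",
        "operation_time": datetime(2022, 12, 1, 3, 15, 0),
    },
    reason="feature data inconsistent with baseline",
)
print(alarm["target_host_id"], alarm["operation_time"])
```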
In this embodiment, when a webpage operation log for the target website is obtained, whether the log satisfies the alarm condition is judged based on the constructed baseline data set, and alarm information for the target website is generated when the alarm condition is determined to be satisfied. The application can therefore monitor the webpages of the target website using webpage operation logs and the baseline data set, without tracking changes in webpage content, which reduces the business traffic to the webpages.
Further, monitoring webpages through webpage operation logs makes it possible to identify tampering that is in progress or has already occurred, which improves the accuracy of determining tampering.
Another method embodiment of the present application provides a webpage monitoring method. As shown in Fig. 2, the method includes the following process:
Step 201: when a webpage operation log for the target website is obtained, judge whether the webpage operation log belongs to baseline data; if so, go to step 202; if not, go to step 203.
In the present application, one webpage operation log records one operation performed by a host on the target website, and contains data such as the file path, file operation mode, process path, operation time and host identifier.
The file path includes the file directory and the file name, and identifies the file that was operated on.
The file operation mode identifies the kind of operation performed on the file, for example delete, create, modify or rename.
The process path identifies the process used to perform the operation.
The operation time identifies when the operation took place.
The host identifier identifies the host ID and/or host IP of the host that performed the operation.
Optionally, the webpage operation log may also contain a server identifier. The server identifier may be a custom identifier that technicians assign to the server, and it indicates which server the webpage operation log came from.
In the present application, judging whether the webpage operation log belongs to baseline data may include the following process:
(1.1) Judge whether the webpage operation log contains index data that is in the baseline data set; if not, determine that the webpage operation log belongs to baseline data.
The baseline data set in the present application is generated from webpage operation logs. The type of the index data and the type of the feature data under the index data can be set in advance, so that baseline data can be generated from webpage operation logs and written into the baseline data set.
The baseline data set is also used to judge whether a webpage operation log satisfies the alarm condition. When a webpage operation log is obtained, it is therefore necessary to judge first whether the log is one to be used for generating baseline data or one to be checked against the alarm condition.
If the webpage operation log does not contain index data in the baseline data set, the baseline data set holds no index data for this log, so the log belongs to baseline data. As described in the later steps of this embodiment, it is then written into the baseline data set in the preset writing manner.
It can be understood that if the baseline data set records no baseline data at all, the webpage operation log must first be written into the baseline data set; that is, the log is determined to belong to baseline data. Alternatively, if the baseline data set does record baseline data but its index data contains nothing matching the webpage operation log, the log likewise needs to be written into the baseline data set and is determined to belong to baseline data. For example, if the index data type is the host identifier, the host identifier in the baseline data set is host A, and the host identifier in the webpage operation log is host B, then the webpage operation log does not contain index data in the baseline data set and is determined to belong to baseline data.
(1.2) If the webpage operation log contains index data in the baseline data set, take the index data in the baseline data set that the webpage operation log contains as the target index data, and judge whether the operation time in the webpage operation log falls within the baseline time range corresponding to the target index data.
In the baseline data set, each index data corresponds to a baseline time range. The baseline time range may start at the operation time of the first log written for that index data and extend for a specified length of time. For example, if the index data is host A and the operation time of the first log written is 2022-11-15 20:00:00, the two weeks following that time form the baseline time range, that is, 2022-11-15 20:00:00 to 2022-11-29 20:00:00.
It can be understood that two weeks is only one specific example of the specified length of time and is not a limitation. The specified length can be set according to actual conditions.
(1.3) If the operation time falls within the baseline time range, determine that the webpage operation log belongs to baseline data; if it does not, determine that the webpage operation log does not belong to baseline data.
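Steps (1.1) to (1.3) can be sketched as a short membership test. This is an illustration under stated assumptions, not the patent's implementation: the names `BASELINE_WINDOW`, `baseline_range`, `is_baseline_log` and the `first_seen` mapping are hypothetical, the host identifier is assumed to be the index data, and treating both ends of the range as inclusive is a choice the text does not settle.

```python
from datetime import datetime, timedelta

# Example length taken from the text (two weeks); meant to be configurable.
BASELINE_WINDOW = timedelta(weeks=2)


def baseline_range(first_time: datetime) -> tuple:
    """Baseline time range: starts at the first written log's operation time."""
    return first_time, first_time + BASELINE_WINDOW


def is_baseline_log(index_key: str, operation_time: datetime, first_seen: dict) -> bool:
    """Return True if the log belongs to baseline data.

    first_seen maps each index value (here a host identifier) to the
    operation time of the first log written for it.
    """
    if index_key not in first_seen:
        return True  # (1.1) index data not yet in the baseline data set
    start, end = baseline_range(first_seen[index_key])
    # (1.2)/(1.3) inside the range -> baseline data; outside -> not baseline
    return start <= operation_time <= end  # inclusive bounds are an assumption


first_seen = {"host-A": datetime(2022, 11, 15, 20, 0, 0)}
print(is_baseline_log("host-B", datetime(2022, 12, 1), first_seen))
print(is_baseline_log("host-A", datetime(2022, 11, 20), first_seen))
print(is_baseline_log("host-A", datetime(2022, 11, 30), first_seen))
```

With the example from the text, host A's range ends at 2022-11-29 20:00:00, so a log from host A on 2022-11-30 is not baseline data and would go on to the alarm-condition check.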
Step 202: write the webpage operation log into the baseline data set in the preset writing manner.
The writing manner for webpage operation logs is set in advance. In one manner, the webpage operation log may be written directly into the baseline data set as baseline data. In another manner, intended to ease management and data lookup, writing the webpage operation log into the baseline data set in the preset writing manner may include: writing the webpage operation log into the baseline data set such that the index data serves as an index under which the corresponding operation time and feature data are recorded.
Both the type of the index data and the type of the feature data can be specified in advance. For example, the index data may be one or more of the host identifier, server identifier, file identifier and process path, and the feature data may include one or more of the file path, process path and file operation mode. The file path includes the file directory and the file name.
Note that the index data and the feature data do not overlap. For example, if the index data includes the process path, the corresponding feature data does not include the process path.
Each index data also has a corresponding operation time. When a webpage operation log is written into the baseline data set, the log's operation time must also be written in association with the index data. Optionally, the baseline data set may have an index area and a data area. The index area stores index data and the time information corresponding to it, namely the operation time. Once the baseline time range of an index data has been determined, that range may also be written alongside it, so that it need not be recalculated when later webpage operation logs are obtained.
The data area stores the feature data corresponding to the index data. When a webpage operation log is written, it may therefore be written into the baseline data set by writing the index data and the corresponding operation time into the index area, and writing the feature data corresponding to that index data into the data area.
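The index-area and data-area layout described above might look like the following in-memory sketch. The class `BaselineSet`, the dictionary layout, the choice of the host identifier as index data, and the two-week window are all assumptions for illustration; the text leaves the storage format open.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Feature data types chosen for this sketch; they exclude the index data type.
FEATURE_FIELDS = ("file_path", "process_path", "file_operation")


@dataclass
class BaselineSet:
    """Hypothetical in-memory baseline data set with two areas."""
    window: timedelta = timedelta(weeks=2)          # specified length of time
    index_area: dict = field(default_factory=dict)  # index -> times and range
    data_area: dict = field(default_factory=dict)   # index -> feature records

    def write(self, log: dict) -> None:
        key = log["host_id"]          # index data (assumed: host identifier)
        when = log["operation_time"]
        if key not in self.index_area:
            # The first write fixes the baseline time range, stored once so
            # that it need not be recomputed for later logs.
            self.index_area[key] = {
                "operation_times": [],
                "baseline_range": (when, when + self.window),
            }
            self.data_area[key] = []
        self.index_area[key]["operation_times"].append(when)
        self.data_area[key].append({name: log[name] for name in FEATURE_FIELDS})


baseline = BaselineSet()
baseline.write({
    "host_id": "host-A",
    "operation_time": datetime(2022, 11, 15, 20, 0, 0),
    "file_path": "/var/www/site/index.html",
    "process_path": "/usr/sbin/nginx",
    "file_operation": "modify",
})
print(baseline.index_area["host-A"]["baseline_range"])
```

Keeping the range next to the index entry mirrors the remark in the text that the range can be written once rather than recalculated for each new log.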
Step 203: judge, based on the baseline data set, whether the webpage operation log satisfies the alarm condition; if yes, proceed to step 204; if not, end the monitoring of the current webpage operation log;
After the monitoring of the current webpage operation log ends, the above method is executed again for each newly obtained webpage operation log.
Optionally, judging whether the webpage operation log satisfies the alarm condition based on the baseline data set may include the following process:
(2.1) judge whether the feature data in the webpage operation log is consistent with the target feature data corresponding to the target index data in the baseline data set; if not, determine that the alarm condition is satisfied;
In the baseline data set, index data and feature data correspond to each other. Once it is determined that the webpage operation log contains target index data in the baseline data set, the feature data in the log can be compared with the target feature data corresponding to that target index data. It can be understood that the comparison must be made between feature data of the same type.
For example, if the target feature data includes a process path, a file path, and a file operation mode, the process path in the webpage operation log is compared with the process path in the target feature data, the file path in the log with the file path in the target feature data, and the file operation mode in the log with the file operation mode in the target feature data. If any one of these is inconsistent, the alarm condition is determined to be satisfied.
(2.2) if consistent, judge whether the feature data of the webpage operation log belongs to periodic data in the baseline data set;
Here, periodic data means that the target feature data repeats periodically within the operation time range corresponding to the target index data. This application does not limit the period or the number of repetitions; for example, the data may repeat once every 2 days.
Of course, as an alternative in this embodiment, when the feature data is consistent, the judgment of the current webpage operation log may simply end. Additionally judging whether the log belongs to periodic data, however, can further improve the accuracy of webpage monitoring.
(2.3) if it does not belong to the periodic data, determine that the alarm condition is satisfied.
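Steps (2.1) to (2.3) can be sketched as a single predicate. This is an illustration under assumptions rather than the claimed implementation: the names `is_consistent` and `meets_alarm_condition` are hypothetical, the target feature data is assumed to be a list of recorded feature records, "consistent" is assumed to mean that the log matches at least one record on every feature type, and the periodicity test is passed in as an optional callable because the application leaves its definition open.

```python
FEATURE_TYPES = ("process_path", "file_path", "file_operation")


def is_consistent(log, record):
    """Same-type comparison: every feature type present in the record must match."""
    return all(log.get(name) == record[name] for name in FEATURE_TYPES if name in record)


def meets_alarm_condition(log, target_records, is_periodic=None):
    """Return True when the log should raise an alarm.

    target_records: feature records stored for the target index data (assumed list).
    is_periodic:    optional callable implementing step (2.2); when omitted, a
                    consistent log simply ends the judgment, as the text allows.
    """
    # (2.1) any inconsistent item means the alarm condition is satisfied
    if not any(is_consistent(log, record) for record in target_records):
        return True
    # (2.2) / (2.3) consistent but not periodic also satisfies the alarm condition
    if is_periodic is not None and not is_periodic(log):
        return True
    return False


# Illustrative values only.
target_records = [{
    "process_path": "/usr/sbin/nginx",
    "file_path": "/var/www/index.html",
    "file_operation": "modify",
}]
same_log = dict(target_records[0])
changed_log = dict(same_log, file_path="/var/www/shell.php")
```

With these values, `meets_alarm_condition(changed_log, target_records)` reports an alarm because the file path differs, while `same_log` passes unless a periodicity check rejects it.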
Step 204: generate alarm information for the target website.
Specifically, the alarm information may be directed at the host identifier in the webpage operation log, such as the host ID and/or host address, so that the tampered host can be located quickly and the response to tampering is faster.
In this embodiment, when a webpage operation log for the target website is obtained, it is judged whether the log belongs to the baseline data. If so, the log is written into the baseline data set in the preset writing manner; if not, it is judged, based on the baseline data set, whether the log satisfies the alarm condition, and alarm information for the target website is generated when the alarm condition is satisfied. The application can therefore monitor the webpages of the target website using webpage operation logs and the baseline data set, without tracking changes in webpage content, which reduces the business access traffic to the webpages;
Further, monitoring webpages through webpage operation logs makes it possible to confirm tampering that is in progress or has already occurred, which improves the accuracy of tampering detection;
In addition, the webpage operation log allows the tampered host to be located quickly, which speeds up the response after tampering occurs.
Yet another method embodiment of the present application provides a webpage monitoring method. In this embodiment, before a webpage operation log for the target website is obtained, the method further includes: collecting logs for the target website and, if the process path in a log matches a specified process path and the operated file in the log has a specified attribute, determining that log to be a webpage operation log;
The above log may be a file process operation log or a web application log.
Specifically, file process operation logs about process, file, and network behavior can be collected through API HOOK technology.
To make webpage operation logs easier to identify, determining the log to be a webpage operation log may optionally include: converting the log into a standard log according to a preset format, and determining the standard log to be the webpage operation log.
The specified process path characterizes the process that starts the web service, and may include process paths characterizing one or more of the Java, wpw3, httpd, nginx, php-cgi, and Tomcat processes, among others.
The specified attribute characterizes webpage resource files, for example one or more of image resources, static htm resources, and web script resources. Whether an operated file has the specified attribute can be determined from the suffix of its file name; for example, a file with the suffix jpg is determined to have the image resource attribute.
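The filtering rule described above, a specified process path plus a specified file attribute derived from the file name suffix, might look like the following sketch. The process name set, the suffix table, and the log field names are assumptions chosen for illustration; the application does not fix any of them.

```python
import os

# Hypothetical configuration: process names that start the web service.
SPECIFIED_PROCESS_NAMES = {"java", "httpd", "nginx", "php-cgi", "tomcat"}

# Hypothetical suffix table mapping file name suffixes to resource attributes.
RESOURCE_SUFFIXES = {
    ".jpg": "image",
    ".png": "image",
    ".htm": "static_page",
    ".html": "static_page",
    ".js": "web_script",
    ".php": "web_script",
}


def resource_attribute(file_path):
    """Return the resource attribute implied by the file suffix, or None."""
    suffix = os.path.splitext(file_path)[1].lower()
    return RESOURCE_SUFFIXES.get(suffix)


def is_webpage_operation_log(log):
    """A log qualifies when both the process path and the file attribute match."""
    process_name = os.path.basename(log["process_path"]).lower()
    process_name = os.path.splitext(process_name)[0]  # drop an extension such as .exe
    return (
        process_name in SPECIFIED_PROCESS_NAMES
        and resource_attribute(log["file_path"]) is not None
    )


# Illustrative logs only.
web_log = {"process_path": "/usr/sbin/nginx", "file_path": "/var/www/logo.jpg"}
editor_log = {"process_path": "/usr/bin/vim", "file_path": "/var/www/logo.jpg"}
access_log = {"process_path": "/usr/sbin/nginx", "file_path": "/var/log/nginx/access.log"}
```

Only `web_log` satisfies both conditions; `editor_log` fails the process check and `access_log` fails the attribute check.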
Optionally, in this embodiment, before the logs for the target website are collected, the method may further include:
setting the collection of file process operation logs of the target website and, correspondingly, setting the type of index data in the baseline data set to include the host identifier and the server identifier;
or setting the collection of file process operation logs of the target website and, correspondingly, setting the index data in the baseline data set to include the host identifier and the file path;
or setting the collection of web application logs of the target website and, correspondingly, setting the index data in the baseline data set to include the host identifier and the process path.
When the type of index data is set to include the host identifier and the server identifier, the feature data corresponding to the index data may include the file path, the process path, and the file operation mode.
When the type of index data is set to include the host identifier and the file path, the feature data corresponding to the index data may include the process path;
When the type of index data is set to include the host identifier and the process path, the feature data corresponding to the index data may include the file path and the file operation mode.
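The three pairings of index data types and feature data types listed above can be written down as a small lookup table. This is a hedged sketch: the field names and the table name `INDEX_TO_FEATURE_TYPES` are hypothetical, and the check only enforces the stated rule that index data and feature data do not overlap.

```python
# Hypothetical field names; each key is a set of index data types and each
# value is the feature data types the text pairs with it.
INDEX_TO_FEATURE_TYPES = {
    ("host_id", "server_id"): ("file_path", "process_path", "file_operation"),
    ("host_id", "file_path"): ("process_path",),
    ("host_id", "process_path"): ("file_path", "file_operation"),
}


def validate_pairings(pairings):
    """Raise ValueError if any feature type repeats one of its index types."""
    for index_types, feature_types in pairings.items():
        overlap = set(index_types) & set(feature_types)
        if overlap:
            raise ValueError(f"index and feature data overlap: {sorted(overlap)}")
    return True


pairings_ok = validate_pairings(INDEX_TO_FEATURE_TYPES)
```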
Optionally, the baseline data set may include an index area and a data area. Index data of the set index data types is written into the index area, and feature data of the above feature data types is written into the data area in correspondence with the index data. The index area is also used to write the time information corresponding to the index data, and once the baseline time range of the index data has been determined, that range can be written there as well.
The baseline time range may be determined when the first piece of index data is written, from the operation time corresponding to that index data and the specified time length. Alternatively, it may be determined and written later, at the point where it is first needed to evaluate a webpage operation log. Both approaches are feasible.
Yet another method embodiment of the present application provides a webpage monitoring method using web application logs as an example. It should be noted that this embodiment is only one specific example for web application logs and does not limit other implementations. The specific process is as follows:
(3.1) Set the collection of web application logs of the target website, and set the baseline data set to include an index area and a data area. The index area is used to write the index data and the time information corresponding to the index data, where the index data includes the host identifier and the process path. The data area is used to write the feature data corresponding to the index data, including the file path and the file operation mode.
Since the web application logs all record operations performed on the target website by the same process, the baseline data set in this embodiment records the operations performed on the target website by different hosts using the same process.
(3.2) Collect web application logs for the target website. If the process path in a web application log matches the specified process path and the operated file in the log has the specified attribute, determine that log to be a webpage operation log;
(3.3) Treat the first webpage operation log obtained as baseline data and write it according to the index area and data area set for the baseline data set: write the host identifier and process path, as index data, together with the operation time, as time information, into the index area, and write the file path and file operation mode, as feature data, into the data area corresponding to the index data.
(3.4) When a webpage operation log is obtained again, judge whether it contains a host identifier and process path already in the baseline data set. If not, write the log into the baseline data set in the manner of step (3.3). If so, determine the operation time that corresponds, in the baseline data set, to the host identifier and process path contained in the log, and determine a baseline time range from that operation time and the specified time length; this baseline time range may also be written into the index area.
(3.5) Judge whether the operation time in the webpage operation log falls within the baseline time range. If it does, write the log into the baseline data set in the manner of step (3.3). If it does not, judge whether the file path in the log is consistent with the corresponding file path in the baseline data set and whether the file operation mode is consistent; if either is inconsistent, the alarm condition is satisfied and alarm information is output directly. If both are consistent, judge whether the feature data of the log belongs to periodic data in the baseline data set; if it is not periodic data, the alarm condition is satisfied and alarm information is output directly; if it is periodic data, end the judgment of the current webpage operation log.
In a specific application scenario, for example, host A uses process B to operate on file C, and the data for that operation is written into the baseline data set. Later, host A uses process B to operate on file D. Based on the above method, it can be determined that the operated file has changed, so webpage tampering is determined to have occurred, and alarm information can be output to notify the user that the webpage has been tampered with. This embodiment can thus monitor webpages by monitoring changes in the operated files.
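The host A / process B / file C scenario can be traced with a small simulation of steps (3.3) to (3.5). It is a simplified sketch under assumptions: the class name `WebAppLogMonitor`, the field names, the two-week default length, the inclusive range bounds, and the returned labels are all hypothetical, and the periodic-data check is omitted for brevity.

```python
from datetime import datetime, timedelta


class WebAppLogMonitor:
    """Index data = (host, process path); feature data = (file path, file operation)."""

    def __init__(self, baseline_length=timedelta(weeks=2)):
        self.baseline_length = baseline_length
        self.index_area = {}  # key -> {"first_time": datetime, "range": (start, end) or None}
        self.data_area = {}   # key -> set of (file_path, file_operation)

    def _write(self, key, log):
        # Step (3.3): write index data with its time, and feature data.
        self.index_area.setdefault(key, {"first_time": log["operation_time"], "range": None})
        self.data_area.setdefault(key, set()).add((log["file_path"], log["file_operation"]))

    def process(self, log):
        key = (log["host_id"], log["process_path"])
        if key not in self.index_area:            # step (3.4): unknown index data
            self._write(key, log)
            return "baseline"
        entry = self.index_area[key]
        if entry["range"] is None:                # determine and cache the range
            start = entry["first_time"]
            entry["range"] = (start, start + self.baseline_length)
        start, end = entry["range"]
        if start <= log["operation_time"] <= end:  # step (3.5): inside the range
            self._write(key, log)
            return "baseline"
        if (log["file_path"], log["file_operation"]) not in self.data_area[key]:
            return "alarm"                         # feature data is inconsistent
        return "ok"                                # periodic check omitted here


def make_log(when, file_path):
    """Build an illustrative log for host A using process B."""
    return {"host_id": "A", "process_path": "B", "file_path": file_path,
            "file_operation": "modify", "operation_time": when}


t0 = datetime(2022, 11, 15, 20, 0, 0)
monitor = WebAppLogMonitor()
first = monitor.process(make_log(t0, "C"))
later_same = monitor.process(make_log(t0 + timedelta(days=30), "C"))
later_changed = monitor.process(make_log(t0 + timedelta(days=30), "D"))
```

The first log seeds the baseline, the later operation on file C is accepted, and the later operation on file D produces the alarm described in the scenario.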
Yet another method embodiment of the present application monitors webpages by monitoring abnormal processes. It should be noted that this embodiment is only one specific example of monitoring abnormal processes and does not limit other implementations. The specific process is as follows:
(4.1) Set the collection of file process operation logs of the target website, and set the baseline data set to include an index area and a data area. The index area is used to write the index data and the time information corresponding to the index data, where the index data includes the host identifier and the file path. The data area is used to write the feature data corresponding to the index data, including the process path.
(4.2) Collect file process operation logs for the target website. If the process path in a file process operation log matches the specified process path and the operated file in that log has the specified attribute, determine that log to be a webpage operation log;
(4.3) Treat the first webpage operation log obtained as baseline data and write it according to the index area and data area set for the baseline data set: write the host identifier and file path, as index data, together with the operation time, as time information, into the index area, and write the process path, as feature data, into the data area corresponding to the index data.
(4.4) When a webpage operation log is obtained again, judge whether it contains a host identifier and file path already in the baseline data set. If not, write the log into the baseline data set in the manner of step (4.3). If so, determine the operation time that corresponds, in the baseline data set, to the host identifier and file path contained in the log, and determine a baseline time range from that operation time and the specified time length; this baseline time range may also be written into the index area.
(4.5) Judge whether the operation time in the webpage operation log falls within the baseline time range. If it does, write the log into the baseline data set in the manner of step (4.3). If it does not, judge whether the process path in the log is consistent with the corresponding process path in the baseline data set; if it is inconsistent, the alarm condition is satisfied and alarm information is output directly. If it is consistent, judge whether the feature data of the log belongs to periodic data in the baseline data set; if it is not periodic data, the alarm condition is satisfied and alarm information is output directly; if it is periodic data, end the processing of the current webpage operation log.
In a specific application scenario, for example, host A uses process C when operating on file B, and the data for that operation is written into the baseline data set. Later, host A uses process D when operating on file B. Based on the above method, it can be determined that the process has changed, so webpage tampering is determined to have occurred, and alarm information can be output to notify the user that the webpage has been tampered with. This embodiment can thus monitor webpages by monitoring changes in the operating process.
A method embodiment of the present application provides a baseline data construction method. As shown in FIG. 3, the method may include the following steps:
Step 301: when a webpage operation log for the target website is obtained, judge whether the webpage operation log contains index data in the baseline data set; if not, proceed to step 304; if yes, proceed to step 302;
In this application, one webpage operation log records one operation performed by a host on the target website, and contains data such as the file path, file operation mode, process path, operation time, and host identifier.
The file path includes the file directory and the file name, and characterizes the file that was operated on.
The file operation mode characterizes how the file was operated on, for example delete, create, modify, or rename.
The process path characterizes the process used for the operation.
The operation time characterizes when the operation took place.
The host identifier characterizes the host ID and/or host IP of the host that performed the operation.
Optionally, the webpage operation log may also contain a server identifier. The server identifier may be an identifier that a technician has defined for the server, and characterizes the server from which the webpage operation log originates.
The baseline data set in this application is generated from webpage operation logs. The type of index data and the type of feature data under the index data can be preset, so that baseline data is generated from the webpage operation logs and written into the baseline data set.
If the webpage operation log does not contain index data in the baseline data set, the baseline data set holds no index data for that log, so the log belongs to the baseline data. In the subsequent steps of this embodiment, it is written into the baseline data set in the preset writing manner.
It can be understood that if the baseline data set contains no baseline data at all, the webpage operation log must first be written into the baseline data set; that is, the log is determined to belong to the baseline data. Likewise, if the baseline data set contains baseline data but its index data has no entry matching the webpage operation log, the log must also be written into the baseline data set; that is, the log is again determined to belong to the baseline data. For example, if the index data type is the host identifier, the host identifier in the baseline data set is host A, and the host identifier in the webpage operation log is host B, then the log does not contain index data in the baseline data set and is determined to belong to the baseline data.
Step 302: determine the index data in the baseline data set that is contained in the webpage operation log to be the target index data;
Step 303: judge whether the operation time in the webpage operation log is within the baseline time range corresponding to the target index data; if yes, proceed to step 304; if not, proceed to step 305;
In the baseline data set, each piece of index data corresponds to a baseline time range. This range may take the operation time of the first written entry for that index data as its start and extend for the specified time length. For example, if the index data is host A and the operation time of its first written entry is 2022-11-15 20:00:00, the baseline time range is the two weeks following that time, that is, from 2022-11-15 20:00:00 to 2022-11-29 20:00:00.
It can be understood that two weeks is only one specific example of the specified time length and is not limiting; the specified time length can be set according to actual conditions.
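The two-week example above can be checked with a few lines of date arithmetic. The function names `baseline_time_range` and `in_baseline_range` are hypothetical, and treating both ends of the range as inclusive is an assumption, since the application does not say how boundary times are handled.

```python
from datetime import datetime, timedelta


def baseline_time_range(first_operation_time, length=timedelta(weeks=2)):
    """Range starting at the first written operation time and lasting `length`."""
    return first_operation_time, first_operation_time + length


def in_baseline_range(operation_time, time_range):
    """Inclusive containment test (the inclusive bounds are an assumption)."""
    start, end = time_range
    return start <= operation_time <= end


time_range = baseline_time_range(datetime(2022, 11, 15, 20, 0, 0))
inside = in_baseline_range(datetime(2022, 11, 20, 9, 30, 0), time_range)
outside = in_baseline_range(datetime(2022, 12, 1, 0, 0, 0), time_range)
```

Here `time_range` ends at 2022-11-29 20:00:00, matching the example in the text, and the specified length can be changed by passing a different `timedelta`.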
Step 304: determine that the webpage operation log belongs to the baseline data, and write the webpage operation log belonging to the baseline data into the baseline data set in a preset writing manner;
The writing manner for webpage operation logs is preset. In one manner, the webpage operation log can be written directly into the baseline data set as baseline data. In another manner, intended to ease management and data lookup, writing the webpage operation log into the baseline data set in the preset writing manner may include: writing the webpage operation log into the baseline data set such that the corresponding operation time and feature data are indexed by the index data.
Both the type of index data and the type of feature data can be specified in advance. For example, the index data may be one or more of a host identifier, a server identifier, a file identifier, and a process path, and the feature data may include one or more of a file path, a process path, and a file operation mode. The file path includes a file directory and a file name.
It should be noted that index data and feature data do not overlap. For example, if the index data includes the process path, the corresponding feature data does not include the process path.
Each piece of index data also corresponds to an operation time. When a webpage operation log is written into the baseline data set, the operation time of the log must also be written in correspondence with the index data. Optionally, the baseline data set may have an index area and a data area. The index area is used to write the index data and the time information corresponding to the index data, where the time information is the operation time. Once the baseline time range of the index data has been determined, that range can also be written alongside the index data, so that it does not have to be recalculated each time a later webpage operation log is obtained.
The data area is used to write the feature data corresponding to the index data. Accordingly, a webpage operation log can be written into the baseline data set by writing its index data and the corresponding operation time into the index area, and writing the feature data corresponding to that index data into the data area.
Step 305: determine that the webpage operation log does not belong to the baseline data.
A webpage operation log that does not belong to the baseline data is not written into the baseline data set. Optionally, such a log can be monitored based on the baseline data set, for example by judging, based on the baseline data set, whether it satisfies the alarm condition.
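The decision made in steps 301 to 305 reduces to one function that returns whether a log belongs to the baseline data. The sketch below is illustrative only: `belongs_to_baseline`, the `baseline_ranges` mapping from index key to an already determined time range, and the field names are assumptions, and the range bounds are treated as inclusive.

```python
from datetime import datetime


def belongs_to_baseline(log, baseline_ranges, index_fields=("host_id",)):
    """Return True when the log is baseline data (step 304), else False (step 305)."""
    key = tuple(log[name] for name in index_fields)
    if key not in baseline_ranges:       # step 301: no matching index data
        return True
    start, end = baseline_ranges[key]    # step 302: target index data found
    return start <= log["operation_time"] <= end  # step 303


# Illustrative baseline: host A first seen on 2022-11-15 with a two-week range.
baseline_ranges = {
    ("A",): (datetime(2022, 11, 15, 20, 0, 0), datetime(2022, 11, 29, 20, 0, 0)),
}
new_host = belongs_to_baseline(
    {"host_id": "B", "operation_time": datetime(2022, 12, 5)}, baseline_ranges)
within_range = belongs_to_baseline(
    {"host_id": "A", "operation_time": datetime(2022, 11, 20)}, baseline_ranges)
after_range = belongs_to_baseline(
    {"host_id": "A", "operation_time": datetime(2022, 12, 5)}, baseline_ranges)
```

A log from the unseen host B and a log from host A inside the range are both baseline data; the later log from host A is not, and would instead be checked against the alarm condition.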
In this embodiment, when a webpage operation log for the target website is obtained, it is judged whether the log contains index data in the baseline data set. If not, the log is determined to belong to the baseline data. If so, the index data in the baseline data set that is contained in the log is determined to be the target index data, and it is judged whether the operation time in the log is within the baseline time range corresponding to the target index data: if it is within the range, the log is determined to belong to the baseline data; if not, the log is determined not to belong to the baseline data. Webpage operation logs belonging to the baseline data are written into the baseline data set in the preset writing manner. The application can therefore construct the baseline data set from webpage operation logs for use in subsequent webpage monitoring.
Corresponding to the foregoing method embodiments, the present application also provides a webpage monitoring device, a baseline data construction device, an electronic device, and corresponding embodiments.
FIG. 4 is a schematic structural diagram of a webpage monitoring device according to a device embodiment of the present application.
Referring to FIG. 4, the device may include an alarm judging unit 110 and a first generating unit 120;
The alarm judging unit 110 is configured to judge, when a webpage operation log for the target website is obtained, whether the webpage operation log satisfies the alarm condition based on the constructed baseline data set, where the baseline data set is constructed from webpage operation logs;
The first generating unit 120 is configured to generate alarm information for the target website when the alarm condition is determined to be satisfied.
Another device embodiment of the present application provides a webpage monitoring device. As shown in FIG. 5, the device includes a first judging unit 130, a first writing unit 140, an alarm judging unit 110, and a first generating unit 120, where:
The first judging unit 130 is configured to judge, when a webpage operation log for the target website is obtained, whether the webpage operation log belongs to the baseline data;
The first writing unit 140 is configured to write the webpage operation log into the baseline data set in a preset writing manner when the webpage operation log is determined to belong to the baseline data;
The alarm judging unit 110 is configured to judge, based on the baseline data set, whether the webpage operation log satisfies the alarm condition when the webpage operation log is determined not to belong to the baseline data;
The first generating unit 120 is configured to generate alarm information for the target website when the alarm condition is determined to be satisfied.
Specifically, the alarm information may be directed at the host identifier in the webpage operation log, such as the host ID and/or host address, so that the tampered host can be located quickly and the response to tampering is faster.
The first judging unit 130 includes a first judging module, a first determining module, a second determining module, a second judging module, a third determining module, and a fourth determining module. Specifically:
The first judging module is configured to judge, when a webpage operation log for the target website is obtained, whether the webpage operation log contains index data that is in the baseline data set;
The first determining module is configured to determine that the webpage operation log belongs to the baseline data when it is determined that the webpage operation log does not contain index data in the baseline data set;
The second determining module is configured to determine, when it is determined that the webpage operation log contains index data in the baseline data set, that the index data in the baseline data set contained in the webpage operation log is target index data;
The second judging module is configured to judge whether the operation time in the webpage operation log is within the baseline time range corresponding to the target index data;
The third determining module is configured to determine that the webpage operation log belongs to the baseline data when the operation time is within the baseline time range;
The fourth determining module is configured to determine that the webpage operation log does not belong to the baseline data when the operation time is not within the baseline time range.
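The decision made by these six modules can be sketched as a single function. This is a minimal sketch under stated assumptions: the baseline data set is represented as a dict `baseline_ranges` mapping index data to a `(start, end)` time range, the range is treated as inclusive at both ends, and the name `belongs_to_baseline` is hypothetical.

```python
from datetime import datetime


def belongs_to_baseline(index, op_time, baseline_ranges):
    """Return True if a log with this index and operation time is baseline data.

    baseline_ranges: dict mapping index data (a tuple) to (start, end) datetimes.
    Inclusive range bounds are an assumption of this sketch.
    """
    time_range = baseline_ranges.get(index)
    if time_range is None:
        # The log contains no index data from the baseline set: baseline data.
        return True
    start, end = time_range
    # Index is known (target index data): decide by the baseline time range.
    return start <= op_time <= end
```

A log with an unseen index is treated as baseline data, so the first observation of any index always extends the baseline rather than raising an alarm.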
The alarm judging unit 110 may include a third judging module, a fifth determining module, a fourth judging module, and a sixth determining module. Specifically:
The third judging module is configured to judge, when it is determined that the webpage operation log does not belong to the baseline data, whether the feature data in the webpage operation log is consistent with the target feature data corresponding to the target index data in the baseline data set;
The fifth determining module is configured to determine that the alarm condition is satisfied when the two are inconsistent;
The fourth judging module is configured to judge, when the two are consistent, whether the feature data of the webpage operation log is periodic data in the baseline data set, where periodic data means that the target feature data repeats periodically within the operation time range corresponding to the target index data;
The sixth determining module is configured to determine that the alarm condition is satisfied when the feature data is not periodic data.
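The two-stage alarm check can be sketched as follows. The application does not specify how periodicity is tested, so `is_periodic` below is a hypothetical heuristic (at least three occurrences with nearly equal gaps), and "consistent" is assumed to mean that the log's feature data appears among the baseline feature data of the target index.

```python
from datetime import datetime, timedelta


def is_periodic(times, tolerance=timedelta(minutes=5)):
    """Hypothetical periodicity test: three or more occurrences, near-equal gaps."""
    if len(times) < 3:
        return False
    ordered = sorted(times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return max(gaps) - min(gaps) <= tolerance


def satisfies_alarm_condition(log_features, baseline_features):
    """baseline_features maps a feature tuple to its baseline operation times
    for the target index data (an assumed representation)."""
    times = baseline_features.get(log_features)
    if times is None:
        # Feature data is inconsistent with the baseline: alarm.
        return True
    # Consistent feature data: alarm only if it is not periodic in the baseline.
    return not is_periodic(times)
```

Under this sketch, a known operation that recurs on a regular schedule (for example a nightly deployment job) does not raise an alarm even when it falls outside the baseline time range, while a known but irregular operation does.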
Optionally, the first writing unit 140 is specifically configured to write the webpage operation log into the baseline data set, when it is determined that the webpage operation log belongs to the baseline data, in such a manner that the index data serves as an index under which the corresponding operation time and feature data are stored.
As another implementation, the first writing unit 140 may be specifically configured to write the webpage operation log directly into the baseline data set as baseline data.
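The two writing manners can be contrasted in a short sketch. The log is assumed to be a dict, and the choice of `(host_id, file_path)` as index data with `process_path` as feature data is only one of the configurations the application allows; the function names are hypothetical.

```python
def write_indexed(baseline, log):
    """Store operation time and feature data under the index data."""
    index = (log["host_id"], log["file_path"])   # assumed index data
    features = (log["process_path"],)            # assumed feature data
    baseline.setdefault(index, []).append((log["op_time"], features))


def write_raw(baseline_logs, log):
    """Alternative: store the whole log as one baseline record."""
    baseline_logs.append(dict(log))
```

The indexed form makes the later lookups (index data, then time range, then feature data) direct dictionary accesses, whereas the raw form keeps every field but requires scanning or a separate index at query time.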
This embodiment monitors the webpages of the target website by means of webpage operation logs and the baseline data set. It does not need to track changes in webpage content, which reduces the business access volume of the webpages.
Further, monitoring webpages through webpage operation logs makes it possible to identify tampering that is in progress or has already occurred, which improves the accuracy of determining tampering behavior.
In addition, the tampered host can be located quickly through the webpage operation log, which speeds up the response after tampering occurs.
Yet another device embodiment of the present application provides a webpage monitoring device. In this embodiment, the device may include a collection and determination unit, wherein:
The collection and determination unit is configured to collect logs for the target website and, if the process path in a log matches a specified process path and the operated file in the log has a specified attribute, to determine the log to be a webpage operation log;
The log is a file process operation log or a webpage application log.
To make webpage operation logs easier to identify, the collection and determination unit may optionally determine a log to be a webpage operation log as follows: convert the log into a standard log according to a preset format, and determine the standard log to be the webpage operation log.
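The filtering and normalization step might look like the sketch below. The application does not list the specified process paths or file attributes, so the web server paths and file extensions here are hypothetical placeholders, as are the field names of the raw and standard logs.

```python
from datetime import datetime
from pathlib import PurePosixPath

# Hypothetical configuration: neither set is given in the application.
SPECIFIED_PROCESS_PATHS = {"/usr/sbin/nginx", "/usr/sbin/httpd"}
SPECIFIED_EXTENSIONS = {".html", ".js", ".css", ".php"}


def to_webpage_operation_log(raw):
    """Return a standard-format webpage operation log, or None if filtered out."""
    if raw.get("process_path") not in SPECIFIED_PROCESS_PATHS:
        return None
    suffix = PurePosixPath(raw.get("file_path", "")).suffix.lower()
    if suffix not in SPECIFIED_EXTENSIONS:
        return None
    # Convert to one preset format so later stages see uniform fields.
    return {
        "host_id": raw.get("host_id"),
        "process_path": raw["process_path"],
        "file_path": raw["file_path"],
        "operation": raw.get("operation"),
        "op_time": datetime.fromisoformat(raw["time"]),
    }
```

Logs that fail either check are dropped before any baseline lookup, which is where the reduction in downstream processing comes from.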
Optionally, the device may further include:
A first setting unit, configured to preset collection of the file process operation logs of the target website, and further configured to set the index data in the baseline data set to include a host identifier and a server identifier;
Or, a second setting unit, configured to preset collection of the file process operation logs of the target website, and further configured to set the index data in the baseline data set to include a host identifier and a file path;
Or, a third setting unit, configured to preset collection of the webpage application logs of the target website, and further configured to set the index data in the baseline data set to include a host identifier and a process path.
When the index data is set to include a host identifier and a server identifier, the feature data corresponding to the index data may include a file path, a process path, and a file operation mode.
When the index data is set to include a host identifier and a file path, the feature data corresponding to the index data may include a process path.
When the index data is set to include a host identifier and a process path, the feature data corresponding to the index data may include a file path and a file operation mode.
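The three index/feature configurations above can be captured in a small lookup table. This is an illustrative sketch: the scheme names, the dict-based log, and its field names are assumptions, while the pairing of index fields with feature fields follows the text.

```python
# Scheme names and field names are hypothetical; the pairings follow the text.
INDEX_SCHEMES = {
    "host_server": {
        "index": ("host_id", "server_id"),
        "features": ("file_path", "process_path", "operation"),
    },
    "host_file": {
        "index": ("host_id", "file_path"),
        "features": ("process_path",),
    },
    "host_process": {
        "index": ("host_id", "process_path"),
        "features": ("file_path", "operation"),
    },
}


def split_log(log, scheme_name):
    """Split a log into (index data, feature data) under the chosen scheme."""
    scheme = INDEX_SCHEMES[scheme_name]
    index = tuple(log[field] for field in scheme["index"])
    features = tuple(log[field] for field in scheme["features"])
    return index, features
```

Keeping the scheme as data rather than code means the same baseline and alarm logic can run unchanged whichever setting unit is used.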
It can be seen that, in this embodiment, file process operation logs or webpage application logs can be used, and webpage operation logs are filtered out of them. Because the logs used are low-level data, the webpage monitoring method of the present application is more generally applicable. Filtering also reduces the amount of information in subsequent data processing, which further improves monitoring efficiency.
A device embodiment of the present application further provides a baseline data construction device. As shown in FIG. 6, the device includes a data judging unit 210, a first determining unit 220, a second determining unit 230, a range judging unit 240, a third determining unit 250, a fourth determining unit 260, and a data writing unit 270, wherein:
The data judging unit 210 is configured to judge, when a webpage operation log for the target website is obtained, whether the webpage operation log contains index data that is in the baseline data set;
The first determining unit 220 is configured to determine that the webpage operation log belongs to the baseline data when the webpage operation log does not contain index data in the baseline data set;
The second determining unit 230 is configured to determine, when the webpage operation log contains index data in the baseline data set, that the index data in the baseline data set contained in the webpage operation log is target index data;
The range judging unit 240 is configured to judge whether the operation time in the webpage operation log is within the baseline time range corresponding to the target index data;
The third determining unit 250 is configured to determine that the webpage operation log belongs to the baseline data when the operation time in the webpage operation log is within the baseline time range corresponding to the target index data;
The fourth determining unit 260 is configured to determine that the webpage operation log does not belong to the baseline data when the operation time in the webpage operation log is not within the baseline time range corresponding to the target index data;
The data writing unit 270 is configured to write webpage operation logs that belong to the baseline data into the baseline data set in a preset writing manner.
Optionally, the data writing unit 270 is specifically configured to write webpage operation logs that belong to the baseline data into the baseline data set in such a manner that the index data serves as an index under which the corresponding operation time and feature data are stored.
As another implementation, the data writing unit 270 may be specifically configured to write the webpage operation log directly into the baseline data set as baseline data.
In this embodiment, when a webpage operation log for the target website is obtained, it is judged whether the webpage operation log contains index data in the baseline data set. If not, the webpage operation log is determined to belong to the baseline data. If so, the index data in the baseline data set contained in the webpage operation log is determined to be target index data, and it is judged whether the operation time in the webpage operation log is within the baseline time range corresponding to the target index data: if it is within the baseline time range, the webpage operation log is determined to belong to the baseline data; if it is not, the webpage operation log is determined not to belong to the baseline data. Webpage operation logs that belong to the baseline data are written into the baseline data set in a preset writing manner. It can be seen that the present application can construct the baseline data set from webpage operation logs, for use in subsequent webpage monitoring.
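The construction flow summarized above can be sketched as a loop over incoming logs. This passage does not say how the baseline time range of an index is fixed, so the sketch assumes a hypothetical learning window that opens at the first log seen for that index; `LEARNING_WINDOW` and `build_baseline` are illustrative names only.

```python
from datetime import datetime, timedelta

LEARNING_WINDOW = timedelta(days=7)  # hypothetical length of the baseline time range


def build_baseline(logs, window=LEARNING_WINDOW):
    """logs: iterable of (index, op_time, features) tuples in arrival order.

    Returns (baseline, rejected), where rejected holds logs that are not baseline data.
    """
    baseline = {}
    rejected = []
    for index, op_time, features in logs:
        entry = baseline.get(index)
        if entry is None:
            # Unknown index: the log is baseline data and opens a time range (assumption).
            baseline[index] = {
                "range": (op_time, op_time + window),
                "records": [(op_time, features)],
            }
            continue
        start, end = entry["range"]
        if start <= op_time <= end:
            entry["records"].append((op_time, features))
        else:
            rejected.append((index, op_time, features))
    return baseline, rejected
```

In a monitoring deployment, the `rejected` logs are the ones that would be handed to the alarm judgment rather than discarded.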
As for the devices in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the corresponding methods and is not repeated here.
FIG. 7 is a schematic structural diagram of an electronic device according to a device embodiment of the present application.
Referring to FIG. 7, the electronic device 1000 includes a memory 1010 and a processor 1020.
The processor 1020 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 1010 may include various types of storage units, such as system memory, read-only memory (ROM), and persistent storage. The ROM may store static data or instructions required by the processor 1020 or other modules of the computer. The persistent storage may be a readable and writable storage device, and may be a non-volatile storage device that retains stored instructions and data even after the computer is powered off. In some implementations, a mass storage device (such as a magnetic disk, an optical disk, or flash memory) is used as the persistent storage. In other implementations, the persistent storage may be a removable storage device (such as a floppy disk or an optical drive). The system memory may be a readable and writable storage device or a volatile readable and writable storage device, such as dynamic random access memory, and may store some or all of the instructions and data that the processor needs at runtime. In addition, the memory 1010 may include any combination of computer-readable storage media, including various types of semiconductor memory chips (DRAM, SRAM, SDRAM, flash memory, programmable read-only memory); magnetic disks and/or optical disks may also be used. In some implementations, the memory 1010 may include a readable and/or writable removable storage device, such as a compact disc (CD), a read-only digital versatile disc (for example, DVD-ROM or dual-layer DVD-ROM), a read-only Blu-ray disc, an ultra-density optical disc, a flash memory card (for example, an SD card, a miniSD card, or a Micro-SD card), or a magnetic floppy disk. Computer-readable storage media do not include carrier waves or transient electronic signals transmitted wirelessly or by wire.
Executable code is stored in the memory 1010. When the executable code is processed by the processor 1020, it may cause the processor 1020 to perform some or all of the methods described above.
The solution of the present application has been described in detail above with reference to the accompanying drawings. In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments. Those skilled in the art should also understand that the actions and modules involved in the specification are not necessarily required by the present application. In addition, it can be understood that the steps in the methods of the embodiments of the present application may be reordered, combined, and deleted according to actual needs, and the modules in the devices of the embodiments of the present application may be combined, divided, and deleted according to actual needs.
In addition, the method according to the present application may also be implemented as a computer program or computer program product that includes computer program code instructions for performing some or all of the steps of the above method of the present application.
Alternatively, the present application may be implemented as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) on which executable code (or a computer program, or computer instruction code) is stored. When the executable code (or computer program, or computer instruction code) is executed by a processor of an electronic device (or a server, etc.), it causes the processor to perform some or all of the steps of the above method according to the present application.
Those skilled in the art will also appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the present application may be implemented as electronic hardware, computer software, or a combination of both.
The flowcharts and block diagrams in the accompanying drawings show the architecture, functions, and operations of possible implementations of systems and methods according to multiple embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The embodiments of the present application have been described above. The foregoing description is exemplary rather than exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen to best explain the principles of the embodiments, their practical application, or their improvement over technologies in the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211656344.9A CN115967559A (en) | 2022-12-22 | 2022-12-22 | Webpage monitoring method and device and baseline data construction method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115967559A true CN115967559A (en) | 2023-04-14 |
Family
ID=87352344
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211656344.9A Pending CN115967559A (en) | 2022-12-22 | 2022-12-22 | Webpage monitoring method and device and baseline data construction method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115967559A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105933268A (en) * | 2015-11-27 | 2016-09-07 | 中国银联股份有限公司 | Webshell detection method and apparatus based on total access log analysis |
CN106682529A (en) * | 2017-01-04 | 2017-05-17 | 北京国舜科技股份有限公司 | Anti-tampering method and anti-tampering terminal |
CN107844706A (en) * | 2017-12-07 | 2018-03-27 | 郑州云海信息技术有限公司 | Security baseline log processing method and state methods of exhibiting |
CN109145536A (en) * | 2017-06-19 | 2019-01-04 | 北京金山云网络技术有限公司 | A kind of webpage integrity assurance and device |
CN109460671A (en) * | 2018-10-21 | 2019-03-12 | 北京亚鸿世纪科技发展有限公司 | A method of realizing that web page contents are anti-tamper based on operating system nucleus |
CN112100035A (en) * | 2020-10-27 | 2020-12-18 | 苏州浪潮智能科技有限公司 | Page abnormity detection method, system and related device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||