CN111291300B - Webpage data processing method and device, webpage system and server - Google Patents

Info

Publication number
CN111291300B
Authority
CN
China
Prior art keywords
data
webpage
item
data area
target
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010077968.XA
Other languages
Chinese (zh)
Other versions
CN111291300A (en)
Inventor
郭春燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yanyang Technology Service Co ltd
Original Assignee
Shenzhen Yinyan Data Technology Co Ltd
Application filed by Shenzhen Yinyan Data Technology Co Ltd
Priority to CN202010077968.XA (patent CN111291300B)
Priority to CN202010931637.8A (patent CN112115400A)
Priority to CN202010931657.5A (patent CN112115401A)
Publication of CN111291300A
Application granted
Publication of CN111291300B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/51 Discovery or management thereof, e.g. service location protocol [SLP] or web services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2119 Authenticating web pages, e.g. with suspicious links

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

Embodiments of the application provide a webpage data processing method and device, a webpage system and a server. Each webpage processing terminal associates its corresponding webpage in the server in advance, so that the server can manage webpage tampering in a unified manner without the service provider of each webpage having to monitor it separately. The data download features and data filling features monitored by target monitoring nodes whose monitoring level is higher than a set level are obtained from the webpage data in order to determine the corresponding target webpage data area, after which webpage data protection and data recovery are performed. As a result, no large amount of human resources has to be spent on accurately configuring the preset servers of a webpage, normal webpage services are not affected when the services of the webpage are updated, and the access behavior of the terminal under the webpage services corresponding to the other monitoring nodes of the webpage is not affected, which further safeguards normal webpage services.

Description

Webpage data processing method and device, webpage system and server
Technical Field
The present application relates to the field of web page technologies, and in particular, to a web page data processing method, an apparatus, a web page system, and a server.
Background
With the rapid development of internet information technology, web page data on the internet is accessed constantly in daily life and work. Key data in some web pages may be tampered with by illegal service providers, who illegally add irrelevant content that poses security or privacy risks. This affects the normal business of the original internet service provider and gives users a very poor web page experience.
In a conventional scheme, while a user terminal accesses a web page, the internet service provider usually monitors in real time any access to non-preset servers in the web page, so as to prevent the user terminal from reaching non-preset servers that tamper with the web page data. However, the present inventors found through research that, in practice, this scheme requires a large amount of human resources to accurately configure the preset servers of a web page. For example, once a service in the web page is updated, the corresponding preset servers may also change, and if the configuration is not updated in time, normal web page services are inevitably affected. As another example, some services in a web page (e.g., a multi-party collaboration service) are allowed to be modified at any time, and if all access to non-preset servers in the web page is simply prohibited, normal web page services are likewise affected.
Disclosure of Invention
In order to overcome at least the above deficiencies in the prior art, the present application aims to provide a web page data processing method and apparatus, a web page system and a server. Each web page processing terminal associates its corresponding web page in the server in advance, so that the server can manage web page tampering in a unified manner without separate monitoring by the service provider of each web page. Data download features and data filling features monitored by target monitoring nodes higher than a set monitoring level are obtained from the web page data to determine the corresponding target web page data area, and web page data protection and data recovery are then performed. Consequently, no large amount of human resources has to be spent on accurately configuring the preset servers of a web page, normal web page services are not affected when the web page has service updates, and the access behavior of the terminal under the web page services corresponding to the other monitoring nodes of the web page is not affected, which further ensures normal web page services.
In a first aspect, the present application provides a web page data processing method applied to a server, where the server is in communication connection with at least one web page processing terminal, and each web page processing terminal associates a corresponding web page in the server in advance, where the method includes:
acquiring webpage data of a current webpage processing terminal related to a target webpage;
when it is determined that the current webpage processing terminal performs data monitoring by adopting a webpage data monitoring strategy corresponding to a target webpage, extracting target webpage item features from the webpage data, wherein the target webpage item features are composed of first item features and second item features, the first item features are data download features monitored by target monitoring nodes of which the monitoring levels are greater than a set level in monitoring nodes included in the webpage data monitoring strategy, the second item features are data filling features monitored by the target monitoring nodes, and different monitoring nodes are used for monitoring different preset webpage services;
determining a target webpage data area of the current webpage processing terminal according to the target webpage project characteristics, and determining an access data area of a webpage data file of a webpage data project corresponding to the target webpage data area and a corresponding data protection script;
and according to the data protection script, after data protection processing is carried out on the webpage address corresponding to the current webpage content data in the access data area, original webpage content data are added in the access data area again.
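As a rough illustration of the feature extraction step above, the following Python sketch collects the first item features (data download features) and second item features (data filling features) only from monitoring nodes whose level exceeds the set level. The `MonitoringNode` class, the `extract_target_features` function and the dictionary layout of the web page data are all hypothetical names chosen for this sketch; the application does not prescribe any data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoringNode:
    """Hypothetical monitoring node of a web page data monitoring strategy."""
    service: str  # preset web page service that this node monitors
    level: int    # monitoring level of the node


def extract_target_features(page_data, nodes, set_level):
    """Return (first item features, second item features).

    Only target monitoring nodes whose level is greater than set_level
    contribute. page_data is assumed to map a service name to a dict with
    "download" and "fill" feature lists (an assumption of this sketch).
    """
    first, second = [], []
    for node in nodes:
        if node.level > set_level:
            observed = page_data.get(node.service, {})
            first.extend(observed.get("download", []))
            second.extend(observed.get("fill", []))
    return first, second


nodes = [MonitoringNode("payment", 3), MonitoringNode("comments", 1)]
page_data = {
    "payment": {"download": ["d1"], "fill": ["f1"]},
    "comments": {"download": ["d2"], "fill": ["f2"]},
}
features = extract_target_features(page_data, nodes, set_level=2)
```

In this sketch the low-level "comments" node is skipped entirely, which mirrors the idea that services monitored by other nodes remain unaffected.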
In one possible design of the first aspect, the step of extracting the target webpage item feature from the webpage data includes:
for each target monitoring node whose monitoring level is greater than the set level among the monitoring nodes included in the webpage data monitoring strategy, acquiring from the webpage data, through the target monitoring node, the service calling data among the service items of the corresponding webpage service and the item feature information contained in the service items of the calling services that are called by the service items of the webpage service;
determining the association degree between the service items of the webpage service according to the acquired service calling data among the service items of the webpage service;
dividing the service items of the webpage service into data download items and data filling items according to the acquired item feature information contained in the service items of the calling services called by the service items of the webpage service;
determining the item feature information of each data download item and each data filling item according to the item feature information and the association degree between the service items of the webpage service;
and determining the target webpage item features according to the item feature information of the data download items and the data filling items.
In a possible design of the first aspect, the step of determining, according to the obtained service invocation data between the service items of the web service, the association degrees between the service items of the web service respectively includes:
for any two service items, determining a service calling range between the two service items according to the service calling data between them, wherein the service calling range represents the overlap of the service data of the two service items during service calling;
determining, for each of the two service items, the ratio of the service calling range between the two service items to the service calling range between that service item and the service items of other webpage services;
and determining the association degree between the two service items according to these ratios.
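The ratio-based association degree can be made concrete with a small sketch. The application does not state how the service calling range is measured or how the two ratios are combined, so the following assumes (purely for illustration) that the range is the number of service data keys two items share, that each item's denominator sums its ranges with all other items, and that the association degree is the mean of the two ratios.

```python
def call_range(data_a, data_b):
    """Assumed measure of the service calling range: shared service data keys."""
    return len(set(data_a) & set(data_b))


def association_degree(items, a, b):
    """Mean of the two ratios 'range(a, b) / total range of each item'.

    items maps a service item name to the set of service data keys it
    touches during service calls (hypothetical layout).
    """
    shared = call_range(items[a], items[b])

    def ratio(x):
        total = sum(call_range(items[x], items[k]) for k in items if k != x)
        return shared / total if total else 0.0

    return (ratio(a) + ratio(b)) / 2


items = {
    "login": {"user", "token"},
    "profile": {"user", "token", "photo"},
    "ads": {"photo"},
}
degree = association_degree(items, "login", "profile")
```

Here "login" shares all of its overlapping data with "profile" (ratio 1), while "profile" also overlaps with "ads" (ratio 2/3), so the pair ends up with a degree of 5/6 under these assumptions.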
In one possible design of the first aspect, the project characteristic information includes loading behavior information of at least two project types;
the step of determining the item feature information of each of the data download item and the data fill item according to the item feature information and the association degree between each service item of the web service includes:
establishing an association function between the service items of the webpage service according to the association degree between the service items of the webpage service;
determining, according to the item feature information and the loading behavior information of the at least two item types included in it, the item type confidence of each data download item and each data filling item for the loading behavior information of each item type;
establishing a first item type function of the service items of the webpage service over the loading behavior information of each item type, according to the item type confidence of each data download item and of each data filling item for the loading behavior information of each item type;
cyclically fusing the association function with the first item type function to obtain a second item type function of the service items of the webpage service over the loading behavior information of each item type, until the number of cycles reaches a preset number or the change in each item type confidence in the second item type function is lower than a set change value, wherein before each cycle starts, for each data download item, the item type confidence corresponding to the data download item in the fusion result obtained by the previous cycle is reset to the item type confidence corresponding to the data download item in the first item type function, and the loading behavior information of the item type with the largest item type confidence is selected as the loading behavior information corresponding to the data download item;
for each data filling item, selecting the loading behavior information of the item type with the largest item type confidence coefficient as the loading behavior information corresponding to the data filling item according to the item type confidence coefficient of the loading behavior information of each item type corresponding to the data filling item in the second item type function;
and obtaining corresponding item characteristic information according to the corresponding loading behavior information of the data downloading item and the data filling item.
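The cyclic fusion described above resembles label propagation with clamped nodes. The sketch below is one possible reading, not the application's own implementation: the association function is taken to be a matrix of association degrees, fusion is assumed to be a row-normalized weighted average, and the function and parameter names (`propagate`, `clamped`, `max_iter`, `tol`) are hypothetical.

```python
def propagate(assoc, first, clamped, max_iter=50, tol=1e-6):
    """Return the index of the most confident item type for every service item.

    assoc:   n x n association degrees between service items
    first:   n x k initial item type confidences (first item type function)
    clamped: indices of data download items, reset before every cycle
    """
    n, k = len(first), len(first[0])
    current = [row[:] for row in first]
    for _ in range(max_iter):
        for i in clamped:  # reset download items to their initial confidences
            current[i] = first[i][:]
        fused = []
        for i in range(n):
            weight = sum(assoc[i]) or 1.0  # assumed row normalization
            fused.append([
                sum(assoc[i][j] * current[j][t] for j in range(n)) / weight
                for t in range(k)
            ])
        # Convergence is measured on the data filling items only, because
        # download items are overwritten at the start of the next cycle.
        change = max(
            (abs(fused[i][t] - current[i][t])
             for i in range(n) if i not in clamped for t in range(k)),
            default=0.0,
        )
        current = fused
        if change < tol:
            break
    labels = []
    for i in range(n):
        row = first[i] if i in clamped else current[i]
        labels.append(max(range(k), key=row.__getitem__))
    return labels


# Items 0 and 1 are data download items; item 2 is a data filling item.
assoc = [[0.0, 0.0, 0.8], [0.0, 0.0, 0.2], [0.8, 0.2, 0.0]]
first = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
labels = propagate(assoc, first, clamped={0, 1})
```

The filling item inherits the item type of the download item it is most strongly associated with, while the download items keep the type given by their initial confidences.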
In a possible design of the first aspect, the step of determining the target webpage data area of the current webpage processing terminal according to the target webpage item feature includes:
determining a first initial webpage data area and a second initial webpage data area which respectively correspond to the current webpage processing terminal according to the first project characteristic and the second project characteristic;
determining a superposed data area between the first initial webpage data area and the second initial webpage data area, acquiring webpage item characteristics of the superposed data area, and dividing the webpage item characteristics into characteristic fragments;
adding each characteristic fragment to a data area positioning sequence, wherein each characteristic fragment corresponds to a data area positioning node in the data area positioning sequence one to one;
sequentially selecting data area positioning nodes from the data area positioning sequence, and distributing the selected data area positioning nodes in parallel to the data area positioning processes that are in an idle state, wherein each data area positioning node instructs the data area positioning process that receives it to generate first data area information for the feature fragment corresponding to that node, by converting the feature fragment into a download feature sequence and a filling feature sequence, extracting a first feature from each download feature of the download feature sequence, and extracting a second feature from each filling feature of the filling feature sequence;
analyzing the first features and the second features to obtain the first data area information corresponding to each feature fragment, then obtaining the first data area information fed back by each data area positioning process, fusing the first data area information according to the order of the corresponding feature fragments in the webpage item features, and converting each data area node in the fused data area information into a data area node vector to obtain a data area node vector sequence;
and performing redundancy removal coding on the data area node vector sequence to generate second data area information corresponding to the webpage project characteristics so as to obtain a target webpage data area of the current webpage processing terminal.
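The fragment-and-merge flow above can be sketched with a worker pool. This is a simplified illustration under several assumptions: threads stand in for the data area positioning processes, each feature is a hypothetical `(kind, region)` pair, a fragment "locates" a region when the region appears in both its download and filling sequences, and redundancy removal is reduced to order-preserving de-duplication. The vectorization step is omitted.

```python
from concurrent.futures import ThreadPoolExecutor


def split_fragments(features, size):
    """Divide the web page item features into feature fragments."""
    return [features[i:i + size] for i in range(0, len(features), size)]


def locate(fragment):
    """Hypothetical locator run by one positioning worker.

    Splits the fragment into a download sequence and a filling sequence and
    reports regions that occur in both (first data area information).
    """
    downloads = [region for kind, region in fragment if kind == "download"]
    fills = {region for kind, region in fragment if kind == "fill"}
    return [region for region in downloads if region in fills]


def target_region(features, size=2, workers=4):
    fragments = split_fragments(features, size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands fragments to idle workers and returns results in
        # fragment order, which gives the order-preserving fusion.
        partial = list(pool.map(locate, fragments))
    merged = [region for part in partial for region in part]
    return list(dict.fromkeys(merged))  # order-preserving de-duplication


features = [
    ("download", "header"), ("fill", "header"),
    ("download", "cart"), ("fill", "cart"),
    ("download", "header"), ("fill", "header"),
]
regions = target_region(features)
```

Because the results are merged in fragment order before de-duplication, the outcome does not depend on which worker finishes first.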
In a possible design of the first aspect, the step of determining an access data area of a web page data file of a web page data item corresponding to the target web page data area and a corresponding data protection script includes:
determining an access identifier of a webpage data file of a webpage data item corresponding to the target webpage data area, and determining a corresponding access data area according to the access identifier;
acquiring feature information of project features of the webpage data project, and acquiring file feature information of each suspected falsified data file in a plurality of webpage data files under the received webpage data project;
similarity calculation is carried out on the file characteristic information and the characteristic information of each pre-configured data protection script, and a plurality of first similarity calculation results for each data protection script are obtained, wherein the characteristic information of each data protection script is as follows: determining the characteristic information of the preset data downloading characteristic and the data filling characteristic corresponding to the data protection script in the configuration process;
determining the corresponding data protection script according to the plurality of first similarity calculation results;
wherein, each data protection script is configured and obtained by adopting the following mode:
acquiring target characteristic information of preset data protection characteristics corresponding to each data protection instruction in a configuration set to form a target characteristic information set;
selecting the target feature information in the target feature information set one by one as the current target feature information, creating a data protection script according to the target feature information, calculating the association degree between the current target feature information and the header information of each data protection script, and obtaining a plurality of second association degree values as second similarity calculation results;
judging whether each second association degree value is smaller than a preset threshold value; if so, determining that the corresponding second similarity calculation result meets a preset similarity condition, and if not, determining that it does not meet the preset similarity condition;
when a second similarity calculation result meets the preset similarity condition, acquiring the corresponding data protection script, determining it as the data protection script to which the current target feature information belongs, and adding the current target feature information to that data protection script;
and if no second similarity calculation result meets the preset similarity condition, creating a new data protection script and recording the current target feature information as its header information; recalculating the header information of the data protection script; and, after all target feature information in the target feature information set has been fused, taking the header information of each data protection script as the feature information of the preset data protection feature corresponding to that data protection script.
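The configuration procedure above behaves like incremental (leader-style) clustering, followed by a nearest-match lookup when a script is selected. The sketch below assumes, for illustration only, that feature information is a numeric vector, that the second association degree value is a Euclidean distance (so "smaller than the threshold" means similar), and that header information is recalculated as the mean of a script's members. The function names are hypothetical.

```python
import math


def configure_scripts(feature_set, threshold):
    """Group target feature information into data protection scripts."""
    scripts = []  # each script: {"header": [...], "members": [...]}
    for feat in feature_set:
        best = min(scripts, key=lambda s: math.dist(feat, s["header"]),
                   default=None)
        if best is not None and math.dist(feat, best["header"]) < threshold:
            # Similar enough: the feature joins this script and the header
            # information is recalculated (assumed here to be the mean).
            best["members"].append(feat)
            count = len(best["members"])
            best["header"] = [sum(col) / count for col in zip(*best["members"])]
        else:
            # No script matches: create one whose header is this feature.
            scripts.append({"header": list(feat), "members": [feat]})
    return scripts


def select_script(file_feature, scripts):
    """Pick the index of the script whose header is closest to the file feature."""
    return min(range(len(scripts)),
               key=lambda i: math.dist(file_feature, scripts[i]["header"]))


scripts = configure_scripts([(0.0, 0.0), (0.2, 0.0), (5.0, 5.0)], threshold=1.0)
chosen = select_script((4.5, 5.2), scripts)
```

With these inputs the first two feature vectors share one script and the third starts a new one, and a suspected tampered file near (5, 5) is matched to the second script.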
In a possible design of the first aspect, the step of adding the original web page content data again in the accessed data area after performing data protection processing on the web page address corresponding to the current web page content data in the accessed data area according to the data protection script includes:
acquiring a plurality of protection processing nodes according to the data protection script, and acquiring a protection instruction of each protection processing node in the plurality of protection processing nodes;
acquiring protection sequence control information of each protection processing node according to the protection instruction of each protection processing node and protection sequence parameters pre-configured by each protection processing node, wherein the protection sequence control information comprises protection sequence parameters and corresponding node parameters of each protection processing node;
according to the protection processing label of each protection processing node and the protection sequence parameter of each protection processing node, performing protection marking on an access domain of a webpage address corresponding to the current webpage content data in the access data area, associating the access domain after protection marking to a preset protection set corresponding to the access data area, configuring the protection strength of the access domain according to the total data amount of the current webpage content data, and setting a corresponding protection interception instruction for the access domain according to the protection strength of the access domain;
and deleting the current webpage content data in the access data area, and adding the original webpage content data in the access data area again.
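A minimal sketch of the protection and recovery step follows. Everything here is an assumption made for illustration: protection processing nodes are plain dictionaries with a `label` and an `order` (standing in for the protection sequence parameter), the protection strength thresholds are invented, and the interception instruction is reduced to a boolean flag. Only `urllib.parse.urlsplit` is a real library call.

```python
from urllib.parse import urlsplit


def protection_strength(total_bytes):
    """Map the total amount of current content data to a strength (made-up thresholds)."""
    if total_bytes >= 1_000_000:
        return "high"
    if total_bytes >= 10_000:
        return "medium"
    return "low"


def protect_and_restore(region, original, nodes, protection_set):
    """Mark the access domain of the current content, then restore the original.

    region:         dict with "url" and "content" of the access data area
    nodes:          protection processing nodes, applied in "order"
    protection_set: preset protection set of the access data area (mutated)
    """
    ordered = sorted(nodes, key=lambda node: node["order"])
    domain = urlsplit(region["url"]).hostname
    strength = protection_strength(len(region["content"]))
    protection_set[domain] = {
        "labels": [node["label"] for node in ordered],
        "strength": strength,
        "intercept": strength != "low",  # hypothetical interception rule
    }
    region.pop("content", None)   # delete the current content
    region["content"] = original  # add the original content again
    return region


region = {"url": "https://ads.example.net/banner.js", "content": "x" * 20_000}
nodes = [{"label": "block", "order": 2}, {"label": "mark", "order": 1}]
protection_set = {}
restored = protect_and_restore(region, "<p>original</p>", nodes, protection_set)
```

The protection set keeps a record of the marked domain, so later requests to it could be intercepted according to the configured strength.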
In a second aspect, an embodiment of the present application further provides a web page data processing apparatus, which is applied to a server, where the server is in communication connection with at least one web page processing terminal, and each web page processing terminal associates a corresponding web page in the server in advance, where the apparatus includes:
the acquisition module is used for acquiring webpage data of a current webpage processing terminal related to a target webpage;
the extraction module is used for extracting target webpage item features from the webpage data when determining that the current webpage processing terminal adopts a webpage data monitoring strategy corresponding to a target webpage for data monitoring, wherein the target webpage item features consist of first item features and second item features, the first item features are data downloading features monitored by target monitoring nodes of which the monitoring levels are greater than a set level in monitoring nodes included in the webpage data monitoring strategy, the second item features are data filling features monitored by the target monitoring nodes, and different monitoring nodes are used for monitoring different preset webpage services;
the determining module is used for determining a target webpage data area of the current webpage processing terminal according to the target webpage project characteristics, and determining an access data area of a webpage data file of a webpage data project corresponding to the target webpage data area and a corresponding data protection script;
and the protection adding module is used for performing data protection processing on the webpage address corresponding to the current webpage content data in the access data area according to the data protection script and then adding the original webpage content data in the access data area again.
In a third aspect, an embodiment of the present application further provides a web page system, where the web page system includes a server and at least one web page processing terminal in communication connection with the server, and each web page processing terminal associates a corresponding web page in the server in advance;
when the webpage processing terminal is associated with a target webpage, the webpage processing terminal is used for sending webpage data to the server;
the server is used for acquiring webpage data of the webpage processing terminal related to the target webpage;
when it is determined that the current webpage processing terminal performs data monitoring by adopting a webpage data monitoring strategy corresponding to a target webpage, the server is used for extracting target webpage item features from the webpage data, the target webpage item features are composed of first item features and second item features, the first item features are data downloading features monitored by target monitoring nodes of which the monitoring levels are greater than a set level in monitoring nodes included in the webpage data monitoring strategy, the second item features are data filling features monitored by the target monitoring nodes, and different monitoring nodes are used for monitoring different preset webpage services;
the server is used for determining a target webpage data area of the current webpage processing terminal according to the target webpage project characteristics, and determining an access data area of a webpage data file of a webpage data project corresponding to the target webpage data area and a corresponding data protection script;
and the server is used for performing data protection processing on the webpage address corresponding to the current webpage content data in the access data area according to the data protection script and then adding the original webpage content data in the access data area again.
In a fourth aspect, an embodiment of the present application further provides a server, where the server includes a processor, a machine-readable storage medium, and a network interface, where the machine-readable storage medium, the network interface, and the processor are connected through a bus system, the network interface is configured to be communicatively connected to at least one web page processing terminal, the machine-readable storage medium is configured to store a program, an instruction, or a code, and the processor is configured to execute the program, the instruction, or the code in the machine-readable storage medium to perform the web page data processing method in the first aspect or any possible design of the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, and when the instructions are executed on a computer, the instructions cause the computer to perform the web page data processing method in the first aspect or any one of the possible designs of the first aspect.
Based on any one of the above aspects, each web page processing terminal associates its corresponding web page in the server in advance, so that the server can manage web page tampering in a unified manner without separate monitoring by the service provider of each web page. Data download features and data filling features monitored by target monitoring nodes higher than a set monitoring level are obtained from the web page data to determine the corresponding target web page data area, and web page data protection and data recovery are then performed. Consequently, no large amount of human resources has to be spent on accurately configuring the preset servers of a web page, normal web page services are not affected when the web page has service updates, and the access behavior of the terminal under the web page services corresponding to the other monitoring nodes of the web page is not affected, which further ensures normal web page services.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should therefore not be considered as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a schematic view of an application scenario of a web page system according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a web page data processing method according to an embodiment of the present application;
Fig. 3 is a schematic functional module diagram of a web page data processing apparatus according to an embodiment of the present application;
Fig. 4 is a schematic structural block diagram of a server for implementing the above web page data processing method according to an embodiment of the present application.
Detailed Description
The present application will now be described in detail with reference to the drawings, and the specific operations in the method embodiments may also be applied to the apparatus embodiments or the system embodiments. In the description of the present application, "at least one" includes one or more unless otherwise specified. "plurality" means two or more. For example, at least one of A, B and C, comprising: a alone, B alone, a and B in combination, a and C in combination, B and C in combination, and A, B and C in combination. In this application, "/" means "or, for example, A/B may mean A or B; "and/or" herein is merely an association describing an associated object, and means that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone.
Fig. 1 is an interaction diagram of a web page system 10 according to an embodiment of the present application. The web page system 10 may include a server 100 and a web page processing terminal 200 communicatively connected to the server 100, and the server 100 may include a processor for executing an instruction operation. The netpage system 10 shown in fig. 1 is only one possible example, and in other possible embodiments, the netpage system 10 may include only a portion of the components shown in fig. 1 or may include other components.
In some embodiments, the server 100 may be a single server or a group of servers. The server group may be centralized or distributed (e.g., the server 100 may be a distributed system). In some embodiments, the server 100 may be local or remote to the web page processing terminal 200. For example, the server 100 may access information stored in the web page processing terminal 200, in a database, or in any combination thereof, via a network. As another example, the server 100 may be directly connected to at least one of the web page processing terminal 200 and a database to access information and/or data stored therein. In some embodiments, the server 100 may be implemented on a cloud platform; by way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, and the like, or any combination thereof.
In some embodiments, the server 100 may include a processor. The processor may process information and/or data related to a service request to perform one or more of the functions described herein. A processor may include one or more processing cores (e.g., a single-core processor or a multi-core processor). Merely by way of example, a processor may include a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), an Application Specific Instruction Set Processor (ASIP), a Graphics Processing Unit (GPU), a Physics Processing Unit (PPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a Programmable Logic Device (PLD), a controller, a microcontroller unit, a Reduced Instruction Set Computer (RISC), a microprocessor, or the like, or any combination thereof.
The network may be used for the exchange of information and/or data. In some embodiments, one or more components in the web page system 10 (e.g., the server 100, the web page processing terminal 200, and the database) may send information and/or data to other components. In some embodiments, the network may be any type of wired or wireless network, or a combination thereof. Merely by way of example, the network may include a wired network, a wireless network, a fiber optic network, a telecommunications network, an intranet, the Internet, a Local Area Network (LAN), a Wide Area Network (WAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Public Switched Telephone Network (PSTN), a Bluetooth network, a ZigBee network, a Near Field Communication (NFC) network, or the like, or any combination thereof. In some embodiments, the network may include one or more network access points. For example, the network may include wired or wireless network access points, such as base stations and/or network switching nodes, through which one or more components of the web page system 10 may connect to the network to exchange data and/or information.
The aforementioned database may store data and/or instructions. In some embodiments, the database may store data distributed to the web page processing terminal 200. In some embodiments, the database may store data and/or instructions for the exemplary methods described herein. In some embodiments, the database may include mass storage, removable storage, volatile read-write memory, Read-Only Memory (ROM), or the like, or any combination thereof. By way of example, mass storage may include magnetic disks, optical disks, solid state drives, and the like; removable storage may include flash drives, floppy disks, optical disks, memory cards, zip disks, tapes, and the like; volatile read-write memory may include Random Access Memory (RAM). The RAM may include Dynamic RAM (DRAM), Double Data Rate Synchronous Dynamic RAM (DDR SDRAM), Static RAM (SRAM), Thyristor-Based Random Access Memory (T-RAM), Zero-capacitor RAM (Z-RAM), and the like. By way of example, ROM may include Mask ROM (MROM), Programmable ROM (PROM), Erasable Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), Compact Disk ROM (CD-ROM), Digital Versatile Disk ROM (DVD-ROM), and the like. In some embodiments, the database may be implemented on a cloud platform. By way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.
In some embodiments, the database may be connected to a network to communicate with one or more components in the web page system 10 (e.g., the server 100, the web page processing terminal 200, etc.). One or more components in the web page system 10 may access data or instructions stored in the database via the network. In some embodiments, the database may be directly connected to one or more components of the web page system 10 (e.g., the server 100, the web page processing terminal 200, etc.); in some embodiments, the database may be part of the server 100.
To solve the technical problem described in the background, Fig. 2 shows a schematic flowchart of a web page data processing method provided in an embodiment of the present application. The method may be executed by the server 100 shown in Fig. 1 and is described in detail below.
Step S110, acquiring web page data of the current web page processing terminal 200 associated with the target web page.
Step S120, when it is determined that the current web page processing terminal 200 performs data monitoring by using the web page data monitoring policy corresponding to the target web page, extracting the target web page item feature from the web page data.
Step S130, determining a target webpage data area of the current webpage processing terminal 200 according to the target webpage item characteristics, and determining an access data area of a webpage data file of the webpage data item corresponding to the target webpage data area and a corresponding data protection script.
Step S140, performing, according to the data protection script, data protection processing on the web page address corresponding to the current webpage content data in the access data area, and then adding the original webpage content data to the access data area again.
In this embodiment, the target webpage item feature is composed of a first item feature and a second item feature, the first item feature is a data download feature monitored by a target monitoring node of which the monitoring level is greater than a set level among monitoring nodes included in the webpage data monitoring policy, the second item feature is a data fill feature monitored by the target monitoring node, and different monitoring nodes are used for monitoring different preset webpage services.
The preset web page service may be configured and determined according to the actual requirement of the target web page, which is not specifically limited in this embodiment. For example, different types of multimedia services may be included, such as a picture service, a video service, an audio service, and so on.
In this embodiment, the data downloading feature may be used to indicate a data downloading behavior of the target webpage from other access domains monitored by the target monitoring node, and the data filling feature may be used to indicate a data obtaining and filling behavior of the target webpage from other access domains monitored by the target monitoring node.
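The composition of the target webpage item feature described above can be illustrated with a minimal sketch. The `MonitoringNode` class, its field names, and the example feature strings are hypothetical and are not taken from the application; the sketch only shows how nodes above the set level might contribute their download and fill features.

```python
from dataclasses import dataclass, field


@dataclass
class MonitoringNode:
    """Hypothetical monitoring node; one node watches one preset web page service."""
    service: str                      # e.g. "picture", "video", "audio"
    level: int                        # monitoring level of this node
    download_features: list = field(default_factory=list)
    fill_features: list = field(default_factory=list)


def extract_target_item_feature(nodes, set_level):
    """Return (first_item_feature, second_item_feature).

    Only nodes whose level is strictly greater than set_level are treated
    as target monitoring nodes.
    """
    targets = [n for n in nodes if n.level > set_level]
    first = [f for n in targets for f in n.download_features]   # data download features
    second = [f for n in targets for f in n.fill_features]      # data fill features
    return first, second


nodes = [
    MonitoringNode("picture", 1, ["img-dl"], ["img-fill"]),
    MonitoringNode("video", 3, ["vid-dl"], ["vid-fill"]),
]
first, second = extract_target_item_feature(nodes, set_level=2)
```

With a set level of 2, only the hypothetical "video" node qualifies, so the services watched by lower-level nodes are left untouched.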
In this embodiment, each web page is associated with the server in advance through the web page processing terminal 200, so that the server can manage web page tampering behavior uniformly and the service provider of each web page does not need to monitor it independently. The data download features and data fill features monitored by the target monitoring nodes above the set monitoring level are obtained from the web page data to determine the corresponding target webpage data area, after which the web page data protection and data recovery operations are performed. As a result, there is no need to spend substantial human resources on precisely configuring a preset server in the web page, and normal web page service is not affected when the web page undergoes a service update. In addition, terminal access under the web page services corresponding to the other monitoring nodes is not affected, which further protects normal web page service.
As a possible implementation of step S120, for each target monitoring node whose monitoring level is greater than the set level among the monitoring nodes included in the webpage data monitoring policy, this embodiment acquires two kinds of information from the web page data through the target monitoring node: the service invocation data between the service items of the corresponding web page service, and the item feature information contained in the service items of the invoked services that those service items call.
Then, the association degree between the service items of the web page service is determined according to the acquired service invocation data between those service items.
For example, for any two service items, a service invocation range between them can be determined from the service invocation data between them, where the service invocation range indicates how much the service data of the two items overlaps while they invoke services. On this basis, the ratio of the service invocation range between the two service items to the service invocation range between these two service items and the service items of other web page services is determined, and the association degree between the two service items is determined from that ratio.
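One way to read the ratio described above is sketched below. The text does not fix how the service invocation range is measured, so this sketch assumes it is the number of data keys that two service items both touch; the item names, the data keys, and the handling of a zero denominator are likewise assumptions made for illustration.

```python
def call_range(calls, a, b):
    """Assumed measure of the invocation range: count of data keys shared by a and b."""
    return len(calls[a] & calls[b])


def association_degree(calls, a, b):
    """Ratio of the a-b invocation range to the range between {a, b} and all other items."""
    pair = call_range(calls, a, b)
    others = [x for x in calls if x not in (a, b)]
    external = sum(call_range(calls, a, x) + call_range(calls, b, x) for x in others)
    if external == 0:
        # Assumption: with no overlap toward other items, any pair overlap counts as maximal.
        return float("inf") if pair else 0.0
    return pair / external


# Hypothetical service items mapped to the data keys they touch while invoking services.
calls = {
    "login": {"user", "token", "profile"},
    "avatar": {"user", "profile", "image"},
    "feed": {"token", "post"},
}
degree = association_degree(calls, "login", "avatar")
```

Here "login" and "avatar" share two keys while their combined overlap with "feed" is one key, so the ratio is 2.0; a larger value indicates a tighter association between the pair.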
On this basis, this embodiment may further divide the service items of the web page service into data download items and data fill items according to the acquired item feature information contained in the service items of the invoked services. For example, if the item feature information indicates that a service item mainly exhibits data downloading behavior, the service item is classified as a data download item; if it indicates that the service item mainly exhibits data filling behavior, the service item is classified as a data fill item. The classification may be made by comparing the ratio between data downloading behavior and data filling behavior, or the number of occurrences of each, and is not specifically limited here.
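A minimal sketch of the count-based division follows. The function name, the behavior counts, and the rule that ties go to the download side are assumptions, since the text leaves the exact comparison open.

```python
def classify_service_item(download_count, fill_count):
    """Classify by which behavior occurs more often (ties go to "download" by assumption)."""
    return "download" if download_count >= fill_count else "fill"


# Hypothetical per-item behavior counts: (download occurrences, fill occurrences).
behavior_counts = {
    "image_loader": (12, 3),
    "comment_box": (1, 9),
}
item_kinds = {
    name: classify_service_item(downloads, fills)
    for name, (downloads, fills) in behavior_counts.items()
}
```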
Then, the present embodiment may further determine the item feature information of each of the data download item and the data fill item according to the item feature information and the association degree between each service item of the web service.
For example, in one possible example, the item feature information may include loading behavior information of at least two item types, where the loading behavior information indicates the direction of data flow while data is loaded. On this basis, an association function between the service items of the web page service can be established from the obtained association degrees, and, from the loading behavior information of the at least two item types included in the item feature information, the item type confidence of the loading behavior information of each item type can be determined for each data download item and each data fill item.
Then, a first item type function, which describes the loading behavior information of each item type for the service items of the web page service, can be established from the item type confidences of each data download item and each data fill item. Next, the association function is applied cyclically: in each cycle, the association function is fused with the first item type function, and the fusion result yields a second item type function describing the loading behavior information of each item type for each service item. The cycle ends when the number of cycles reaches a preset number or when the change in each item type confidence in the second item type function falls below a set change value.
Before each cycle starts, the item type confidences of the data download items in the fusion result of the previous cycle are reset to the item type confidences of those data download items in the first item type function, and for each data download item the loading behavior information of the item type with the largest item type confidence is selected as the loading behavior information of that data download item.
Meanwhile, for each data fill item, the loading behavior information of the item type with the largest item type confidence in the second item type function is selected as the loading behavior information of that data fill item.
In this way, the corresponding item feature information is obtained from the loading behavior information of the data download items and the data fill items, and the target webpage item feature is determined from the item feature information of the data download items and the data fill items.
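The cyclic procedure above resembles label propagation with clamped entries, and the following sketch is written under that interpretation. It assumes the association function is a row-normalized matrix, that "fusion" is a matrix product of that matrix with the confidence table, and that the data download items are the clamped rows; none of these choices is stated in the application.

```python
def propagate_item_types(assoc, initial, download_idx, max_cycles=50, tol=1e-6):
    """Return the index of the most confident item type for every service item.

    assoc        : n x n row-normalized association matrix (assumed form)
    initial      : n x k confidences, playing the role of the first item type function
    download_idx : indices of data download items, reset before every cycle
    """
    n, k = len(initial), len(initial[0])
    current = [row[:] for row in initial]
    for _ in range(max_cycles):
        # Reset download items to their confidences in the first item type function.
        for i in download_idx:
            current[i] = initial[i][:]
        # Fuse the association matrix with the current confidences.
        fused = [
            [sum(assoc[i][j] * current[j][t] for j in range(n)) for t in range(k)]
            for i in range(n)
        ]
        change = max(abs(fused[i][t] - current[i][t]) for i in range(n) for t in range(k))
        current = fused
        if change < tol:          # confidences stopped changing
            break
    # Download items keep their clamped confidences for the final selection.
    for i in download_idx:
        current[i] = initial[i][:]
    return [max(range(k), key=lambda t: row[t]) for row in current]


# Items 0 and 1 are download items with known types; item 2 is a fill item
# that is mostly associated with item 0.
assoc = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.2, 0.0],
]
initial = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
types = propagate_item_types(assoc, initial, download_idx=[0, 1])
```

In this toy input the fill item inherits type 0 from the download item it is most strongly associated with, and the loop stops on the second cycle because the confidences no longer change.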
To further illustrate the technical solution provided by this embodiment, for step S130, this embodiment may determine, according to the first item feature and the second item feature obtained as described above, a first initial webpage data area and a second initial webpage data area of the current webpage processing terminal 200, respectively. That is, the first initial webpage data area is obtained from the first item feature, and the second initial webpage data area is obtained from the second item feature. The specific manner of obtaining them may be determined according to how the item services indicated by the first item feature and the second item feature are distributed in the target webpage; this belongs to the prior art and is not described here again.
Therefore, in order to accurately determine a target webpage data area for subsequent processing and to improve processing efficiency, this embodiment further determines the coincident data area between the first initial webpage data area and the second initial webpage data area, acquires the webpage item features of the coincident data area, and divides those webpage item features into feature fragments. Each feature fragment is then added to a data area positioning sequence, in which feature fragments correspond one-to-one to data area positioning nodes. The data area positioning nodes are selected from the sequence in order and distributed in parallel to the data area positioning processes that are in an idle state. Each data area positioning node instructs the data area positioning process that receives it to generate the first data area information for the corresponding feature fragment. To do so, the process converts the feature fragment into a download feature sequence and a fill feature sequence, extracts a first feature from each download feature of the download feature sequence, and extracts a second feature from each fill feature of the fill feature sequence.
On this basis, the first features and second features are analyzed to obtain the first data area information corresponding to the feature fragment, and the first data area information fed back by each data area positioning process is collected. The pieces of first data area information are fused in the order in which their feature fragments appear in the webpage item features, and each data area node in the fused data area information is converted into a data area node vector to obtain a data area node vector sequence. Redundancy-removal coding is then applied to the data area node vector sequence to generate the second data area information corresponding to the webpage item features, which gives the target webpage data area of the current web page processing terminal 200.
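The fragment-and-dispatch flow can be sketched as follows. This is a simplified illustration: worker threads stand in for the data area positioning processes, the `dl:` and `fill:` prefixes and the fragment size are hypothetical, the vector conversion step is omitted, and order-preserving de-duplication stands in for the redundancy-removal coding.

```python
from concurrent.futures import ThreadPoolExecutor


def locate_fragment(fragment):
    """Hypothetical positioning step for one feature fragment.

    Splits the fragment into a download sequence and a fill sequence and
    returns the data area names extracted from each feature.
    """
    downloads = [f for f in fragment if f.startswith("dl:")]
    fills = [f for f in fragment if f.startswith("fill:")]
    return [f.split(":", 1)[1] for f in downloads + fills]


def locate_target_area(features, fragment_size=2, workers=4):
    # Divide the item features into fragments (the positioning sequence).
    fragments = [features[i:i + fragment_size] for i in range(0, len(features), fragment_size)]
    # Hand the fragments to idle workers; map() returns results in fragment order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_areas = list(pool.map(locate_fragment, fragments))
    # Fuse in original order, then drop repeated areas while keeping the first occurrence.
    merged = [area for part in partial_areas for area in part]
    return list(dict.fromkeys(merged))


features = ["dl:header", "fill:banner", "dl:banner", "fill:footer"]
target_area = locate_target_area(features)
```

Because `map()` preserves submission order, the fused result follows the order of the fragments in the original features even though the fragments are processed concurrently.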
Further, regarding step S130, based on the above description, after the target webpage data area is determined, the present embodiment may further determine an access identifier of the webpage data file of the webpage data item corresponding to the target webpage data area, so as to determine the corresponding access data area according to the access identifier.
Therefore, the feature information of the item features of the webpage data item can be obtained, together with the file feature information of each suspected tampered data file among the plurality of webpage data files received under the webpage data item. A similarity calculation is then performed between the file feature information and the feature information of each pre-configured data protection script, yielding a plurality of first similarity calculation results, one for each data protection script. The feature information of each data protection script is the feature information of the preset data download feature and data fill feature determined for that script during configuration.
The corresponding data protection script may then be determined according to the plurality of first similarity calculation results. For example, the data protection script with the highest similarity among the first similarity calculation results may be selected, or two or more data protection scripts with the top-ranked similarities may be selected. The number of data protection scripts selected may be determined according to actual requirements and is not specifically limited in this embodiment.
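The selection step can be sketched as a ranking by similarity. The application does not name a similarity measure, so cosine similarity over numeric feature vectors is assumed here, and the script names and vectors are invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_scripts(file_feature, scripts, top_n=1):
    """Return the names of the top_n scripts most similar to the file feature."""
    ranked = sorted(scripts, key=lambda name: cosine(file_feature, scripts[name]), reverse=True)
    return ranked[:top_n]


# Hypothetical script feature vectors: (download-feature weight, fill-feature weight).
scripts = {
    "block_redirect": [1.0, 0.0],
    "strip_inline": [0.0, 1.0],
    "mixed": [1.0, 1.0],
}
best = select_scripts([0.9, 0.1], scripts)
top_two = select_scripts([0.9, 0.1], scripts, top_n=2)
```

Setting `top_n` above 1 corresponds to selecting several top-ranked scripts rather than only the most similar one.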
As a possible implementation manner, each data protection script may be configured and obtained in the following manner:
First, each piece of target feature information of the preset data protection feature corresponding to each data protection instruction in a configuration set is acquired to form a target feature information set. Then, the pieces of target feature information in the set are selected one by one as the current target feature information. A data protection script is created according to the target feature information, and the association degree between the current target feature information and the header information of each data protection script is calculated, yielding a plurality of second association degree values as the second similarity results.
On this basis, it is judged whether each second association degree value is smaller than a preset threshold. If so, the corresponding second similarity result is determined to meet the preset similarity condition; if not, it is determined not to meet it. When a second similarity result meets the preset similarity condition, the corresponding data protection script is obtained and determined to be the data protection script to which the current target feature information belongs, and the current target feature information is added to that data protection script.
If no second similarity result meets the preset similarity condition, a new data protection script is created and its header information is recorded as the current target feature information. The header information of the data protection script is then recalculated. After every piece of target feature information in the target feature information set has been merged in this way, the header information of each data protection script is taken as the feature information of the preset data protection feature corresponding to that script.
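The configuration procedure behaves like an incremental, threshold-based grouping, and the sketch below follows that reading. Several details are assumptions: the second association degree value is modelled as a Euclidean distance (so that "smaller than the threshold" means similar), the closest qualifying script is chosen when several qualify, and the header information is recalculated as the mean of the members.

```python
import math


def build_scripts(target_features, threshold):
    """Group target feature vectors into scripts, each summarized by a header vector."""
    scripts = []  # each entry: {"header": [...], "members": [[...], ...]}
    for feat in target_features:
        best, best_dist = None, None
        for script in scripts:
            dist = math.dist(feat, script["header"])   # assumed association measure
            if dist < threshold and (best_dist is None or dist < best_dist):
                best, best_dist = script, dist
        if best is None:
            # No script is similar enough: create one whose header is this feature.
            scripts.append({"header": list(feat), "members": [list(feat)]})
        else:
            # Add the feature to the matching script and recalculate its header.
            best["members"].append(list(feat))
            members = best["members"]
            best["header"] = [sum(col) / len(members) for col in zip(*members)]
    return scripts


configured = build_scripts([[0, 0], [0, 1], [10, 10]], threshold=2)
```

The first two vectors fall within the threshold of each other and share one script whose header becomes their mean, while the distant third vector starts a script of its own.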
Finally, as a possible implementation manner for step S140, in this embodiment, a plurality of protection processing nodes may be obtained according to the data protection script, and a protection instruction of each protection processing node in the plurality of protection processing nodes is obtained, so as to obtain protection sequence control information of each protection processing node according to the protection instruction of each protection processing node and a protection sequence parameter pre-configured for each protection processing node, where the protection sequence control information includes a protection sequence parameter and a node parameter of each corresponding protection processing node.
Therefore, according to the protection processing label and the protection sequence parameter of each protection processing node, the access domain of the web page address corresponding to the current webpage content data in the access data area is marked for protection, and the marked access domain is associated with the preset protection set corresponding to the access data area. The protection strength of the access domain is configured according to the total data amount of the current webpage content data, and a corresponding protection interception instruction is set for the access domain according to that protection strength. The access domain can then be protected in a targeted manner, without spending substantial human resources on precisely configuring a preset server in the web page, and normal web page service is not affected when the web page undergoes a service update.
On this basis, to ensure normal operation of the service, the original webpage content data in the access data area can first be deleted and then added to the access data area again.
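The protection and restoration of step S140 can be sketched in a few lines. The dictionary layout, the `order` and `label` keys, the size threshold, and the "block"/"warn" interception actions are all hypothetical; the sketch only illustrates ordering the protection nodes, marking the access domain, scaling the strength with the data amount, and re-adding the original content.

```python
from urllib.parse import urlparse


def protect_and_restore(area, nodes, original_content, strong_threshold=1024):
    """Mark the access domain of the area's address, then re-add the original content."""
    domain = urlparse(area["url"]).netloc                      # access domain of the address
    # Apply protection nodes in the order given by their sequence parameter.
    steps = [n["label"] for n in sorted(nodes, key=lambda n: n["order"])]
    # Assumed rule: larger current content means stronger protection.
    strength = "high" if len(area["content"]) >= strong_threshold else "normal"
    area["protected_domains"][domain] = {
        "steps": steps,
        "strength": strength,
        "action": "block" if strength == "high" else "warn",  # hypothetical interception instruction
    }
    area["content"] = None               # delete the content held in the access data area
    area["content"] = original_content   # add the original content again
    return area


area = {"url": "https://cdn.example.com/ad.js", "content": "x" * 2000, "protected_domains": {}}
nodes = [{"label": "intercept", "order": 2}, {"label": "mark", "order": 1}]
restored = protect_and_restore(area, nodes, original_content="original page body")
```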
Fig. 3 is a schematic diagram of the functional modules of a web page data processing apparatus 300 according to an embodiment of the present application. This embodiment may divide the web page data processing apparatus 300 into functional modules according to the foregoing method embodiment. For example, one functional module may be provided for each function, or two or more functions may be integrated into one processing module. The integrated module can be implemented in hardware or as a software functional module. It should be noted that the division of modules in the present application is schematic and is only a logical division of functions; other divisions are possible in actual implementation. For example, in the case where one functional module is provided for each function, the web page data processing apparatus 300 shown in Fig. 3 is only a schematic diagram of the apparatus. The web page data processing apparatus 300 may include an obtaining module 310, an extracting module 320, a determining module 330, and a protection adding module 340, and the functions of these functional modules are described in detail below.
The obtaining module 310 is configured to obtain web page data of the current web page processing terminal 200 associated with the target web page.
The extracting module 320 is configured to, when it is determined that the current web processing terminal 200 performs data monitoring by using a web data monitoring policy corresponding to a target web, extract a target web item feature from the web data, where the target web item feature is composed of a first item feature and a second item feature, the first item feature is a data download feature monitored by a target monitoring node whose monitoring level is greater than a set level among monitoring nodes included in the web data monitoring policy, the second item feature is a data fill feature monitored by the target monitoring node, and different monitoring nodes are used for monitoring different preset web services.
The determining module 330 is configured to determine a target webpage data area of the current webpage processing terminal 200 according to the characteristics of the target webpage item, and determine an access data area of a webpage data file of the webpage data item corresponding to the target webpage data area and a corresponding data protection script.
The protection adding module 340 is configured to perform data protection processing on the web page address corresponding to the current webpage content data in the access data area according to the data protection script, and then add the original webpage content data to the access data area again.
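The module division above maps directly onto steps S110 to S140. A minimal structural sketch is shown below; the class name, the callable signatures, and the string-based stand-ins for each module are hypothetical and only illustrate how the four modules might be chained.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class WebPageDataProcessor:
    """Hypothetical wiring of the four functional modules of apparatus 300."""
    obtain: Callable[[str], Any]            # obtaining module 310, step S110
    extract: Callable[[Any], Any]           # extracting module 320, step S120
    determine: Callable[[Any], Any]         # determining module 330, step S130
    protect_and_add: Callable[[Any], Any]   # protection adding module 340, step S140

    def run(self, terminal_id):
        data = self.obtain(terminal_id)
        features = self.extract(data)
        plan = self.determine(features)
        return self.protect_and_add(plan)


# Stand-in callables that only record which stage ran.
processor = WebPageDataProcessor(
    obtain=lambda tid: f"data:{tid}",
    extract=lambda data: data + "|features",
    determine=lambda feats: feats + "|area+script",
    protect_and_add=lambda plan: plan + "|protected",
)
result = processor.run("t200")
```

Keeping each module as a separate callable reflects the note that the division is logical only; two or more of them could equally be merged into one processing module.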
Further, fig. 4 is a schematic structural diagram of a server 100 for executing the above-mentioned web page data processing method according to an embodiment of the present application. As shown in FIG. 4, the server 100 may include a network interface 110, a machine-readable storage medium 120, a processor 130, and a bus 140. The processor 130 may be one or more, and one processor 130 is illustrated in fig. 4 as an example. The network interface 110, the machine-readable storage medium 120, and the processor 130 may be connected by a bus 140 or otherwise, as exemplified by the connection by the bus 140 in fig. 4.
The machine-readable storage medium 120 is a computer-readable storage medium and can be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the web page data processing method in the embodiment of the present application (for example, the obtaining module 310, the extracting module 320, the determining module 330, and the protection adding module 340 of the web page data processing apparatus 300 shown in Fig. 3). The processor 130 executes the various functional applications and data processing of the terminal device by running the software programs, instructions, and modules stored in the machine-readable storage medium 120, thereby implementing the above-mentioned web page data processing method; details are not repeated here.
The machine-readable storage medium 120 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the terminal, and the like. Further, the machine-readable storage medium 120 may be either volatile memory or non-volatile memory, or may include both. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be Random Access Memory (RAM), which acts as an external cache. By way of example, but not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DR RAM). It should be noted that the memories of the systems and methods described herein are intended to include, without being limited to, these and any other suitable types of memory. In some examples, the machine-readable storage medium 120 may further include memory located remotely from the processor 130, which may be connected to the server 100 over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor 130 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method embodiments may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 130. The processor 130 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
The server 100 can perform information interaction with other devices (e.g., the web page processing terminal 200) through the network interface 110. Network interface 110 may be a circuit, bus, transceiver, or any other device that may be used to exchange information. Processor 130 may send and receive information using network interface 110.
The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the embodiments of the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the embodiments of the present application fall within the scope of the claims of the present application and their equivalents, the present application is also intended to encompass such modifications and variations.

Claims (8)

1. A webpage data processing method, applied to a server, wherein the server is communicatively connected to at least one webpage processing terminal and each webpage processing terminal is associated in advance with a corresponding webpage in the server, the method comprising the following steps:
acquiring webpage data of a current webpage processing terminal related to a target webpage;
when it is determined that the current webpage processing terminal performs data monitoring by adopting a webpage data monitoring strategy corresponding to a target webpage, extracting target webpage item features from the webpage data, wherein the target webpage item features are composed of first item features and second item features, the first item features are data download features monitored by target monitoring nodes of which the monitoring levels are greater than a set level in monitoring nodes included in the webpage data monitoring strategy, the second item features are data filling features monitored by the target monitoring nodes, and different monitoring nodes are used for monitoring different preset webpage services;
determining a target webpage data area of the current webpage processing terminal according to the target webpage item features, and determining an access data area of a webpage data file of a webpage data item corresponding to the target webpage data area and a corresponding data protection script;
according to the data protection script, after data protection processing is carried out on a webpage address corresponding to the current webpage content data in the access data area, original webpage content data are added in the access data area again;
the step of determining the target webpage data area of the current webpage processing terminal according to the target webpage item features comprises the following steps:
determining, according to the first item features and the second item features respectively, a first initial webpage data area and a second initial webpage data area corresponding to the current webpage processing terminal;
determining an overlapping data area between the first initial webpage data area and the second initial webpage data area, acquiring webpage item features of the overlapping data area, and dividing the webpage item features into feature fragments;
adding each feature fragment to a data area positioning sequence, wherein each feature fragment corresponds one to one to a data area positioning node in the data area positioning sequence;
sequentially selecting data area positioning nodes from the data area positioning sequence, and distributing the selected data area positioning nodes to each data area positioning process in an idle state in parallel, wherein the data area positioning nodes are used for indicating each data area positioning process to generate first data area information corresponding to a feature fragment corresponding to the data area positioning node, the feature fragment is used for indicating the corresponding data area positioning process to generate the first data area information corresponding to the feature fragment, the feature fragment is also used for indicating the corresponding data area positioning process to respectively convert the feature fragment into a download feature sequence and a filling feature sequence, respectively extract a first feature from each download feature of the download feature sequence, and extract a second feature from each filling feature of the filling feature sequence;
analyzing the first features and the second features to obtain first data area information corresponding to the feature fragments, then obtaining the first data area information fed back by each data area positioning process, fusing the first data area information according to the order of the nodes of the corresponding feature fragments in the webpage item features, and converting each data area node in the fused data area information into a data area node vector to obtain a data area node vector sequence;
performing redundancy removal coding on the data area node vector sequence to generate second data area information corresponding to the webpage item features so as to obtain a target webpage data area of the current webpage processing terminal;
the step of determining the access data area of the web page data file of the web page data item corresponding to the target web page data area and the corresponding data protection script includes:
determining an access identifier of a webpage data file of a webpage data item corresponding to the target webpage data area, and determining a corresponding access data area according to the access identifier;
acquiring feature information of item features of the webpage data item, and acquiring file feature information of each data file suspected of tampering among a plurality of webpage data files under the received webpage data item;
performing similarity calculation on the file feature information and the feature information of each pre-configured data protection script to obtain a plurality of first similarity calculation results for each data protection script, wherein the feature information of each data protection script is the feature information, determined in the configuration process, of the preset data download feature and data filling feature corresponding to that data protection script;
determining the corresponding data protection script according to the plurality of first similarity calculation results;
wherein, each data protection script is configured and obtained by adopting the following mode:
acquiring target characteristic information of preset data protection characteristics corresponding to each data protection instruction in a configuration set to form a target characteristic information set;
selecting the target feature information in the target feature information set one by one as current target feature information, creating a data protection script according to the target feature information, calculating the association degree between the current target feature information and the header information of the data protection script, and obtaining a plurality of second association degree values as second similarity results;
judging whether each second association degree value is smaller than a preset threshold value; if so, determining that the second similarity result meets a preset similarity condition, and if not, determining that the second similarity result does not meet the preset similarity condition;
when a second similarity result meets the preset similarity condition, acquiring the corresponding data protection script, determining it as the data protection script to which the current target feature information belongs, and adding the current target feature information to that data protection script;
and if no second similarity result meets the preset similarity condition, creating a data protection script, recording the header information of the data protection script as the current target feature information, recalculating the header information of the data protection script, and, after the fusion of each target feature information in the target feature information set is completed, taking the header information of each data protection script as the feature information of the preset data protection feature corresponding to that data protection script.
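The script configuration steps at the end of claim 1 resemble an incremental grouping procedure, and may be illustrated by the following minimal sketch. It is not the claimed implementation. The names `ProtectionScript` and `configure_scripts` are hypothetical, target feature information is assumed to be a numeric vector, the association degree is assumed to be a Euclidean distance (so that a smaller value means a closer match, consistent with the "smaller than a preset threshold" test), and recalculating the header information is assumed to mean averaging the member vectors.

```python
from dataclasses import dataclass, field
from math import dist
from typing import List, Sequence


@dataclass
class ProtectionScript:
    """Hypothetical container: a header vector plus the features fused into it."""
    header: List[float]
    members: List[List[float]] = field(default_factory=list)

    def add(self, feature: Sequence[float]) -> None:
        # Add the feature, then recalculate the header as the member mean
        # (the averaging rule is an assumption, not stated in the claims).
        self.members.append(list(feature))
        n = len(self.members)
        self.header = [sum(col) / n for col in zip(*self.members)]


def configure_scripts(features: Sequence[Sequence[float]],
                      threshold: float) -> List[ProtectionScript]:
    scripts: List[ProtectionScript] = []
    for feature in features:
        # Second association degree values: one per existing script header.
        scores = [dist(feature, s.header) for s in scripts]
        best = min(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] < threshold:
            # Similarity condition met: the feature belongs to that script.
            scripts[best].add(feature)
        else:
            # No script is close enough: create one whose header starts
            # as the current feature, then recalculate the header.
            script = ProtectionScript(header=list(feature))
            script.add(feature)
            scripts.append(script)
    return scripts
```

Under these assumptions, two nearby feature vectors are fused into one script and a distant vector starts a new script; the resulting headers would then serve as the feature information against which file feature information is compared.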
2. The method for processing webpage data according to claim 1, wherein the step of extracting the target webpage item feature from the webpage data comprises:
aiming at a target monitoring node of which the monitoring grade is greater than a set grade in monitoring nodes included in the webpage data monitoring strategy, respectively acquiring service calling data among all service items of corresponding webpage services and item feature information contained in a service item of calling service called by the service item of the webpage service from the webpage data through the target monitoring node;
respectively determining the association degree of each service item of the webpage service according to the acquired service calling data among the service items of the webpage service;
dividing each service item of the webpage service into a data downloading item and a data filling item according to the acquired item feature information contained in the service item of the calling service called by the service item of the webpage service;
determining the respective item feature information of the data download item and the data filling item according to the item feature information and the association degree between the service items of the webpage service;
and determining the characteristics of the target webpage item according to the item characteristic information of the data download item and the data filling item.
3. The method for processing webpage data according to claim 2, wherein the step of determining the association degree between the service items of the webpage service respectively according to the acquired service call data between the service items of the webpage service comprises:
for any two service items, determining a service calling range between the two service items according to the service calling data between the two service items, wherein the service calling range is used for representing the overlap of the service data of the two service items in the service calling process;
respectively determining, according to the service calling range between the two service items, the ratio of the service calling range between the two service items to the service calling range between each of the two service items and the service items of other webpage services;
and determining the association degree between the two service items according to the ratio of the service calling range between the two service items to the service calling range between each of the two service items and the service items of other webpage services.
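The ratio described in claim 3 may be illustrated by the following minimal sketch, which is one possible reading and not the claimed implementation. The names `call_range` and `association_degree` are hypothetical. The service calling range is assumed to be a non-negative number stored per unordered pair of service items, each ratio is assumed to be normalised into a share of the item's total calling range so that it stays between 0 and 1, and the two per-item shares are assumed to be averaged.

```python
from typing import Dict, FrozenSet, Sequence

CallRanges = Dict[FrozenSet[str], float]


def call_range(ranges: CallRanges, a: str, b: str) -> float:
    """Service calling range between two items (0.0 if none was recorded)."""
    return ranges.get(frozenset((a, b)), 0.0)


def association_degree(ranges: CallRanges, items: Sequence[str],
                       a: str, b: str) -> float:
    pair = call_range(ranges, a, b)

    def share(x: str, partner: str) -> float:
        # Calling range between x and every item other than its partner.
        others = sum(call_range(ranges, x, k)
                     for k in items if k not in (x, partner))
        total = pair + others
        return pair / total if total else 0.0

    # Averaging the two per-item shares is an assumption of this sketch.
    return (share(a, b) + share(b, a)) / 2
```

For example, with hypothetical items `login`, `cart` and `pay`, where `login` and `cart` have a calling range of 3.0 and `login` and `pay` have a calling range of 1.0, the share for `login` is 0.75, the share for `cart` is 1.0, and the sketch returns 0.875.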
4. The web page data processing method according to claim 2, wherein the item feature information includes loading behavior information of at least two item types;
the step of determining the item feature information of each of the data download item and the data fill item according to the item feature information and the association degree between each service item of the web service includes:
establishing an association function between the service items of the webpage service according to the association degree between the service items of the webpage service;
determining item type confidence degrees of each data downloading item and each data filling item corresponding to the loading behavior information of each item type according to the item characteristic information and the loading behavior information of at least two item types included in the item characteristic information;
establishing a first item type function of the loading behavior information of each item type corresponding to the service item of the web service according to the item type confidence of the loading behavior information of each item type corresponding to each data downloading item and the item type confidence of the loading behavior information of each item type corresponding to each data filling item;
iterating, by using the association function, the fusion result of the association function and the first item type function to obtain a second item type function of the loading behavior information of each item type corresponding to each service item of the webpage service, until the number of cycles reaches a preset number or the change value of each item type confidence in the second item type function is lower than a set change value, wherein before each cycle starts, for each data download item, the item type confidence corresponding to the data download item contained in the fusion result obtained by the previous cycle is restored to the item type confidence corresponding to the data download item contained in the first item type function, and the loading behavior information of the item type with the largest item type confidence is selected as the loading behavior information corresponding to the data download item;
for each data filling item, selecting the loading behavior information of the item type with the largest item type confidence coefficient as the loading behavior information corresponding to the data filling item according to the item type confidence coefficient of the loading behavior information of each item type corresponding to the data filling item in the second item type function;
and obtaining corresponding item characteristic information according to the corresponding loading behavior information of the data downloading item and the data filling item.
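The cyclic fusion in claim 4 is similar in shape to label propagation with fixed seed nodes, and may be illustrated by the following minimal sketch. It is not the claimed implementation. The name `assign_item_types` is hypothetical, the association function is assumed to be a symmetric weight matrix, the first item type function is assumed to be a list of per-item confidence vectors, and the fusion is assumed to be a weighted average over associated items.

```python
from typing import List, Sequence, Set


def assign_item_types(assoc: Sequence[Sequence[float]],
                      initial: Sequence[Sequence[float]],
                      download_items: Set[int],
                      max_iters: int = 50,
                      tol: float = 1e-6) -> List[int]:
    n = len(initial)
    n_types = len(initial[0])
    current = [list(row) for row in initial]
    for _ in range(max_iters):
        nxt = []
        for i in range(n):
            weight = sum(assoc[i])
            if i in download_items or weight == 0:
                # Restoring a download item to its initial confidence before
                # every cycle is equivalent to holding it fixed here.
                nxt.append(list(initial[i]) if i in download_items
                           else list(current[i]))
                continue
            # Fusion step (assumed): association-weighted average.
            nxt.append([sum(assoc[i][j] * current[j][t] for j in range(n)) / weight
                        for t in range(n_types)])
        change = max(abs(a - b)
                     for new, old in zip(nxt, current)
                     for a, b in zip(new, old))
        current = nxt
        if change < tol:
            break
    # For every item, pick the item type with the largest confidence.
    return [max(range(n_types), key=row.__getitem__) for row in current]
```

In this sketch, a data filling item with an undecided initial confidence takes the item type of the data download item with which it is most strongly associated, while the data download items keep the types given by their initial confidences.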
5. The method for processing webpage data according to any one of claims 1 to 4, wherein the step of adding original webpage content data in the accessed data area again after performing data protection processing on the webpage address corresponding to the current webpage content data in the accessed data area according to the data protection script comprises:
acquiring a plurality of protection processing nodes according to the data protection script, and acquiring a protection instruction of each protection processing node in the plurality of protection processing nodes;
acquiring protection sequence control information of each protection processing node according to the protection instruction of each protection processing node and protection sequence parameters pre-configured by each protection processing node, wherein the protection sequence control information comprises protection sequence parameters and corresponding node parameters of each protection processing node;
according to the protection processing label of each protection processing node and the protection sequence parameter of each protection processing node, performing protection marking on an access domain of a webpage address corresponding to the current webpage content data in the access data area, associating the access domain after protection marking to a preset protection set corresponding to the access data area, configuring the protection strength of the access domain according to the total data amount of the current webpage content data, and setting a corresponding protection interception instruction for the access domain according to the protection strength of the access domain;
and deleting the original webpage content data in the access data area, and adding the original webpage content data in the access data area again.
6. A webpage data processing apparatus, applied to a server, wherein the server is communicatively connected to at least one webpage processing terminal and each webpage processing terminal is associated in advance with a corresponding webpage in the server, the apparatus comprising:
the acquisition module is used for acquiring webpage data of a current webpage processing terminal related to a target webpage;
the extraction module is used for extracting target webpage item features from the webpage data when determining that the current webpage processing terminal adopts a webpage data monitoring strategy corresponding to a target webpage for data monitoring, wherein the target webpage item features consist of first item features and second item features, the first item features are data downloading features monitored by target monitoring nodes of which the monitoring levels are greater than a set level in monitoring nodes included in the webpage data monitoring strategy, the second item features are data filling features monitored by the target monitoring nodes, and different monitoring nodes are used for monitoring different preset webpage services;
the determining module is used for determining a target webpage data area of the current webpage processing terminal according to the target webpage item features, and determining an access data area of a webpage data file of a webpage data item corresponding to the target webpage data area and a corresponding data protection script;
the protection adding module is used for performing data protection processing on a webpage address corresponding to the current webpage content data in the access data area according to the data protection script and then adding original webpage content data in the access data area again;
the determining module determines a target webpage data area of the current webpage processing terminal according to the target webpage item features in the following manner:
determining, according to the first item features and the second item features respectively, a first initial webpage data area and a second initial webpage data area corresponding to the current webpage processing terminal;
determining an overlapping data area between the first initial webpage data area and the second initial webpage data area, acquiring webpage item features of the overlapping data area, and dividing the webpage item features into feature fragments;
adding each feature fragment to a data area positioning sequence, wherein each feature fragment corresponds one to one to a data area positioning node in the data area positioning sequence;
sequentially selecting data area positioning nodes from the data area positioning sequence, and distributing the selected data area positioning nodes to each data area positioning process in an idle state in parallel, wherein the data area positioning nodes are used for indicating each data area positioning process to generate first data area information corresponding to a feature fragment corresponding to the data area positioning node, the feature fragment is used for indicating the corresponding data area positioning process to generate the first data area information corresponding to the feature fragment, the feature fragment is also used for indicating the corresponding data area positioning process to respectively convert the feature fragment into a download feature sequence and a filling feature sequence, respectively extract a first feature from each download feature of the download feature sequence, and extract a second feature from each filling feature of the filling feature sequence;
analyzing the first features and the second features to obtain first data area information corresponding to the feature fragments, then obtaining the first data area information fed back by each data area positioning process, fusing the first data area information according to the order of the nodes of the corresponding feature fragments in the webpage item features, and converting each data area node in the fused data area information into a data area node vector to obtain a data area node vector sequence;
performing redundancy removal coding on the data area node vector sequence to generate second data area information corresponding to the webpage item features so as to obtain a target webpage data area of the current webpage processing terminal;
the determining module determines the access data area and the corresponding data protection script of the webpage data file of the webpage data item corresponding to the target webpage data area in the following modes:
determining an access identifier of a webpage data file of a webpage data item corresponding to the target webpage data area, and determining a corresponding access data area according to the access identifier;
acquiring feature information of item features of the webpage data item, and acquiring file feature information of each data file suspected of tampering among a plurality of webpage data files under the received webpage data item;
performing similarity calculation on the file feature information and the feature information of each pre-configured data protection script to obtain a plurality of first similarity calculation results for each data protection script, wherein the feature information of each data protection script is the feature information, determined in the configuration process, of the preset data download feature and data filling feature corresponding to that data protection script;
determining the corresponding data protection script according to the plurality of first similarity calculation results;
wherein, each data protection script is configured and obtained by adopting the following mode:
acquiring target characteristic information of preset data protection characteristics corresponding to each data protection instruction in a configuration set to form a target characteristic information set;
selecting the target feature information in the target feature information set one by one as current target feature information, creating a data protection script according to the target feature information, calculating the association degree between the current target feature information and the header information of the data protection script, and obtaining a plurality of second association degree values as second similarity results;
judging whether each second association degree value is smaller than a preset threshold value; if so, determining that the second similarity result meets a preset similarity condition, and if not, determining that the second similarity result does not meet the preset similarity condition;
when a second similarity result meets the preset similarity condition, acquiring the corresponding data protection script, determining it as the data protection script to which the current target feature information belongs, and adding the current target feature information to that data protection script;
and if no second similarity result meets the preset similarity condition, creating a data protection script, recording the header information of the data protection script as the current target feature information, recalculating the header information of the data protection script, and, after the fusion of each target feature information in the target feature information set is completed, taking the header information of each data protection script as the feature information of the preset data protection feature corresponding to that data protection script.
7. A webpage system, characterized by comprising a server and at least one webpage processing terminal communicatively connected to the server, wherein each webpage processing terminal is associated in advance with a corresponding webpage in the server;
when the webpage processing terminal is associated with a target webpage, the webpage processing terminal is used for sending webpage data to the server;
the server is used for acquiring webpage data of the webpage processing terminal related to the target webpage;
when it is determined that a current webpage processing terminal performs data monitoring by adopting a webpage data monitoring strategy corresponding to a target webpage, the server is used for extracting target webpage item features from the webpage data, the target webpage item features are composed of first item features and second item features, the first item features are data downloading features monitored by target monitoring nodes of which the monitoring levels are greater than a set level in monitoring nodes included in the webpage data monitoring strategy, the second item features are data filling features monitored by the target monitoring nodes, and different monitoring nodes are used for monitoring different preset webpage services;
the server is used for determining a target webpage data area of the current webpage processing terminal according to the target webpage item features, and determining an access data area of a webpage data file of a webpage data item corresponding to the target webpage data area and a corresponding data protection script;
the server is used for performing data protection processing on a webpage address corresponding to the current webpage content data in the access data area according to the data protection script and then adding original webpage content data in the access data area again;
the server is used for determining a target webpage data area of the current webpage processing terminal according to the target webpage item features in the following manner:
determining, according to the first item features and the second item features respectively, a first initial webpage data area and a second initial webpage data area corresponding to the current webpage processing terminal;
determining an overlapping data area between the first initial webpage data area and the second initial webpage data area, acquiring webpage item features of the overlapping data area, and dividing the webpage item features into feature fragments;
adding each feature fragment to a data area positioning sequence, wherein each feature fragment corresponds one to one to a data area positioning node in the data area positioning sequence;
sequentially selecting data area positioning nodes from the data area positioning sequence, and distributing the selected data area positioning nodes to each data area positioning process in an idle state in parallel, wherein the data area positioning nodes are used for indicating each data area positioning process to generate first data area information corresponding to a feature fragment corresponding to the data area positioning node, the feature fragment is used for indicating the corresponding data area positioning process to generate the first data area information corresponding to the feature fragment, the feature fragment is also used for indicating the corresponding data area positioning process to respectively convert the feature fragment into a download feature sequence and a filling feature sequence, respectively extract a first feature from each download feature of the download feature sequence, and extract a second feature from each filling feature of the filling feature sequence;
analyzing the first features and the second features to obtain first data area information corresponding to the feature fragments, then obtaining the first data area information fed back by each data area positioning process, fusing the first data area information according to the order of the nodes of the corresponding feature fragments in the webpage item features, and converting each data area node in the fused data area information into a data area node vector to obtain a data area node vector sequence;
performing redundancy removal coding on the data area node vector sequence to generate second data area information corresponding to the webpage item features so as to obtain a target webpage data area of the current webpage processing terminal;
the server is used for determining an access data area and a corresponding data protection script of a webpage data file of a webpage data item corresponding to the target webpage data area in the following modes:
determining an access identifier of a webpage data file of a webpage data item corresponding to the target webpage data area, and determining a corresponding access data area according to the access identifier;
acquiring feature information of item features of the webpage data item, and acquiring file feature information of each data file suspected of tampering among a plurality of webpage data files under the received webpage data item;
performing similarity calculation on the file feature information and the feature information of each pre-configured data protection script to obtain a plurality of first similarity calculation results for each data protection script, wherein the feature information of each data protection script is the feature information, determined in the configuration process, of the preset data download feature and data filling feature corresponding to that data protection script;
determining the corresponding data protection script according to the plurality of first similarity calculation results;
wherein each data protection script is configured in the following manner:
acquiring target feature information of the preset data protection feature corresponding to each data protection instruction in a configuration set, to form a target feature information set;
selecting each piece of target feature information in the target feature information set, one at a time, as the current target feature information, creating a data protection script according to the target feature information, calculating the association degree between the current target feature information and the header information of the data protection script, and obtaining a plurality of second association degree values as second similarity calculation results;
judging whether each second association degree value is smaller than a preset threshold; if so, determining that the second similarity calculation result meets a preset similarity condition, and if not, determining that the second similarity calculation result does not meet the preset similarity condition;
when a second similarity calculation result meets the preset similarity condition, acquiring the corresponding data protection script, determining it to be the data protection script to which the current target feature information belongs, and adding the current target feature information to that data protection script;
and if no second similarity calculation result meets the preset similarity condition, creating a data protection script, recording the header information of the data protection script as the current target feature information, and recalculating the header information of the data protection script; and, after the fusion of every piece of target feature information in the target feature information set is completed, taking the header information of each data protection script as the feature information of the preset data protection feature corresponding to that data protection script.
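The configuration and selection steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not say how feature information is encoded or how the association degree is computed, so the sketch assumes numeric feature vectors, Euclidean distance as the association degree (smaller means more similar, matching the "smaller than a preset threshold" test), and a header recalculated as the mean of a script's members. The names `ProtectionScript`, `association`, `configure_scripts`, and `select_script` are hypothetical.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import List, Optional, Sequence, Tuple

Vector = Tuple[float, ...]


def association(a: Vector, b: Vector) -> float:
    """Assumed association degree: Euclidean distance (smaller = more similar)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


@dataclass
class ProtectionScript:
    """Hypothetical stand-in for a data protection script and its header."""
    header: Vector
    members: List[Vector] = field(default_factory=list)

    def add(self, feature: Vector) -> None:
        # Fuse the feature into this script, then recalculate the header
        # as the per-dimension mean of all members (an assumption).
        self.members.append(feature)
        count = len(self.members)
        self.header = tuple(sum(column) / count for column in zip(*self.members))


def configure_scripts(target_features: Sequence[Vector],
                      threshold: float) -> List[ProtectionScript]:
    """Group target feature vectors into scripts, one feature at a time."""
    scripts: List[ProtectionScript] = []
    for current in target_features:
        scored = [(association(current, s.header), s) for s in scripts]
        matches = [pair for pair in scored if pair[0] < threshold]
        if matches:
            # At least one header is close enough: join the closest script.
            _, owner = min(matches, key=lambda pair: pair[0])
            owner.add(current)
        else:
            # No header is close enough: start a new script from this feature.
            script = ProtectionScript(header=current)
            script.add(current)
            scripts.append(script)
    return scripts


def select_script(file_feature: Vector,
                  scripts: Sequence[ProtectionScript]) -> Optional[ProtectionScript]:
    """Pick the script whose header is closest to the file's feature vector."""
    if not scripts:
        return None
    return min(scripts, key=lambda s: association(file_feature, s.header))
```

For example, `configure_scripts([(0.0, 0.0), (0.2, 0.0), (5.0, 5.0)], threshold=1.0)` groups the first two vectors into one script and places the third in a script of its own; `select_script((4.5, 5.0), scripts)` then returns the second script. Choosing the closest header is one reading of "determining the corresponding data protection script according to the plurality of first similarity calculation results"; the claim itself does not fix the selection rule.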
8. A server, characterized in that the server comprises a processor, a machine-readable storage medium, and a network interface, wherein the machine-readable storage medium, the network interface, and the processor are connected through a bus system; the network interface is configured to communicate with at least one webpage processing terminal; the machine-readable storage medium is configured to store programs, instructions, or code; and the processor is configured to execute the programs, instructions, or code in the machine-readable storage medium to perform the webpage data processing method according to any one of claims 1 to 5.
CN202010077968.XA 2020-02-02 2020-02-02 Webpage data processing method and device, webpage system and server Active CN111291300B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202010077968.XA CN111291300B (en) 2020-02-02 2020-02-02 Webpage data processing method and device, webpage system and server
CN202010931637.8A CN112115400A (en) 2020-02-02 2020-02-02 Webpage data processing method and device and webpage cloud platform
CN202010931657.5A CN112115401A (en) 2020-02-02 2020-02-02 Webpage data processing method, device and system based on cloud platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010077968.XA CN111291300B (en) 2020-02-02 2020-02-02 Webpage data processing method and device, webpage system and server

Related Child Applications (2)

Application Number Priority Date Filing Date Title
CN202010931657.5A Division CN112115401A (en) 2020-02-02 2020-02-02 Webpage data processing method, device and system based on cloud platform
CN202010931637.8A Division CN112115400A (en) 2020-02-02 2020-02-02 Webpage data processing method and device and webpage cloud platform

Publications (2)

Publication Number Publication Date
CN111291300A (en) 2020-06-16
CN111291300B (en) 2020-11-17

Family

ID=71021446

Family Applications (3)

Application Number Priority Date Filing Date Title
CN202010931637.8A Withdrawn CN112115400A (en) 2020-02-02 2020-02-02 Webpage data processing method and device and webpage cloud platform
CN202010077968.XA Active CN111291300B (en) 2020-02-02 2020-02-02 Webpage data processing method and device, webpage system and server
CN202010931657.5A Withdrawn CN112115401A (en) 2020-02-02 2020-02-02 Webpage data processing method, device and system based on cloud platform

Family Applications Before (1)

Application Number Priority Date Filing Date Title
CN202010931637.8A Withdrawn CN112115400A (en) 2020-02-02 2020-02-02 Webpage data processing method and device and webpage cloud platform

Family Applications After (1)

Application Number Priority Date Filing Date Title
CN202010931657.5A Withdrawn CN112115401A (en) 2020-02-02 2020-02-02 Webpage data processing method, device and system based on cloud platform

Country Status (1)

Country Link
CN (3) CN112115400A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112100661B (en) * 2020-09-16 2024-03-12 深圳集智数字科技有限公司 Data processing method and device
CN114168670B (en) * 2021-12-03 2022-12-27 苏州博士创新技术转移有限公司 Industrial ecological big data integration method and system and cloud platform

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663289A (en) * 2012-03-22 2012-09-12 奇智软件(北京)有限公司 Method and device for intercepting rogue program of modifying page elements
CN105824813A (en) * 2015-01-05 2016-08-03 中国移动通信集团江苏有限公司 Core user excavate method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8584232B2 (en) * 2007-04-23 2013-11-12 Sap Ag Enhanced cross-site attack prevention
JP2017501490A (en) * 2013-12-02 2017-01-12 ザグ ホールディングス インコーポレーテッド Method and system for legacy compatible software
CN106530154A (en) * 2016-08-08 2017-03-22 浙江大学城市学院 College classroom teaching efficiency monitoring system and college classroom teaching efficiency monitoring method based on WiFi
CN106790687A (en) * 2017-02-17 2017-05-31 和创(北京)科技股份有限公司 Webpage display method, web data processing method and server
CN109729100B (en) * 2019-03-12 2021-04-13 Oppo广东移动通信有限公司 Webpage data hijacking monitoring method and device and computer readable storage medium
CN110719320B (en) * 2019-09-18 2022-05-27 上海联蔚数字科技集团股份有限公司 Method and equipment for generating public cloud configuration adjustment information

Also Published As

Publication number Publication date
CN112115400A (en) 2020-12-22
CN112115401A (en) 2020-12-22
CN111291300A (en) 2020-06-16

Similar Documents

Publication Publication Date Title
US11574290B2 (en) Data processing method and apparatus, computer device, and storage medium
CN111064711B (en) Block chain-based data stream detection method and device and server
CN108710681B (en) File acquisition method, device, equipment and storage medium
CN111132145B (en) Network communication safety monitoring method, device, server and network communication system
CN111291300B (en) Webpage data processing method and device, webpage system and server
RU2734027C2 (en) Method and device for preventing an attack on a server
CN108595280B (en) Interface adaptation method and device, computer equipment and storage medium
CN111260475A (en) Data processing method, block chain node point equipment and storage medium
CN110659019A (en) Parameter checking method and device and server
CN112988062B (en) Metadata reading limiting method and device, electronic equipment and medium
CN108133026B (en) Multi-data processing method, system and storage medium
CN112699034B (en) Virtual login user construction method, device, equipment and storage medium
CN111209074B (en) Browser view loading method, device and system and server
CN110209717B (en) Packaging method and device of basic database, computer equipment and storage medium
CN111414239A (en) Virtual machine mirror image management method, system and medium based on kylin cloud computing platform
CN115391188A (en) Scene test case generation method, device, equipment and storage medium
CN112732676B (en) Block chain-based data migration method, device, equipment and storage medium
CN111125744B (en) Code branch merging method, system, computer device and readable storage medium
CN112698932A (en) Industrial application program calling method and device, computer equipment and storage medium
CN107656728B (en) Application program instance creating method and cloud server
CN111125567A (en) Equipment marking method and device, electronic equipment and storage medium
CN116339767B (en) Application resource allocation method, device, computer equipment and storage medium
CN113709154B (en) Browser security processing method and device, computer equipment and storage medium
CN111131205B (en) Authority management method and device based on Restful interface
CN117081845A (en) Interception method and related device of data acquisition request

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 506, 5th floor, Hecheng building, high tech Zone, Kunming City, Yunnan Province

Applicant after: Guo Chunyan

Address before: 276800 R & D building 2, Venture Center, high tech Zone, No.177, Gaoxin 6th Road, Donggang District, Rizhao City, Shandong Province

Applicant before: Guo Chunyan

TA01 Transfer of patent application right

Effective date of registration: 20201103

Address after: 518000 12 / F, block B, Chuang Ling Tong Science and technology building, No.1, Shihua Road, Fubao street, Futian District, Shenzhen City, Guangdong Province

Applicant after: Shenzhen Yinyan Data Technology Co.,Ltd.

Address before: Room 506, 5th floor, Hecheng building, high tech Zone, Kunming City, Yunnan Province

Applicant before: Guo Chunyan

GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 12 / F, block B, chuanglingtong technology building, No.1 Shihua Road, Fubao street, Futian District, Shenzhen, Guangdong 518000

Patentee after: Shenzhen Yanyang Technology Service Co.,Ltd.

Address before: 12 / F, block B, chuanglingtong technology building, No.1 Shihua Road, Fubao street, Futian District, Shenzhen, Guangdong 518000

Patentee before: Shenzhen Yinyan Data Technology Co.,Ltd.

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Web page data processing method, device, web page system, and server

Effective date of registration: 20230710

Granted publication date: 20201117

Pledgee: Shenzhen SME financing Company limited by guarantee

Pledgor: Shenzhen Yanyang Technology Service Co.,Ltd.

Registration number: Y2023980047933