CN114117299A - Website intrusion tampering detection method, device, equipment and storage medium - Google Patents

Website intrusion tampering detection method, device, equipment and storage medium

Info

Publication number
CN114117299A
Authority
CN
China
Prior art keywords
webpage
information
determining
detection
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111361696.7A
Other languages
Chinese (zh)
Inventor
韩晓愈
傅强
梁彧
蔡琳
田野
王杰
杨满智
金红
陈晓光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Eversec Beijing Technology Co Ltd
Original Assignee
Eversec Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Eversec Beijing Technology Co Ltd filed Critical Eversec Beijing Technology Co Ltd
Priority to CN202111361696.7A priority Critical patent/CN114117299A/en
Publication of CN114117299A publication Critical patent/CN114117299A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/554 Detecting local intrusion or implementing counter-measures involving event detection and direct action
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection

Abstract

The embodiment of the invention discloses a method, a device, equipment and a storage medium for detecting website intrusion tampering. The method comprises: acquiring a webpage information set of a website to be detected, wherein the webpage information set comprises at least one of the following types of webpage information: webpage source code, a webpage domain name, webpage pictures and webpage text information; and selecting webpage information to be detected from the webpage information set, determining the detection mode corresponding to that webpage information, performing the corresponding tampering detection, and determining a tampering detection result. By acquiring the webpage information set and performing intrusion tampering detection on the webpage information to be detected in it, the security of the website to be detected is safeguarded. Because a suitable detection mode is selected according to the type of the webpage information to be detected, the website is examined from several angles, which improves the accuracy of the detection result.

Description

Website intrusion tampering detection method, device, equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a method, a device, equipment and a storage medium for detecting website intrusion tampering.
Background
With the rapid development of the digital society, the internet has had a profound impact on business, industry, banking, finance, education, government and entertainment, as well as on people's work and life, and much conventional information is being migrated onto the internet. Websites are an important platform for e-government and e-commerce; once a hacker breaches one, important information and data can be acquired, destroyed or tampered with, which can cause great economic loss and adverse social impact. It is therefore important to detect whether a website has been intruded upon and tampered with.
Disclosure of Invention
The invention provides a method, a device, equipment and a storage medium for detecting website intrusion tampering, so as to realize accurate detection of website intrusion tampering.
In a first aspect, an embodiment of the present invention provides a method for detecting website intrusion tampering, where the method includes:
acquiring a webpage information set of a website to be detected, wherein the webpage information set comprises at least one of the following types of webpage information: webpage source code, a webpage domain name, webpage pictures and webpage text information;
and selecting the webpage information to be detected from the webpage information set, determining a detection mode corresponding to the webpage information to be detected, performing corresponding tampering detection, and determining a tampering detection result.
In a second aspect, an embodiment of the present invention further provides a device for detecting website intrusion tampering, where the device includes:
the information set acquisition module is used for acquiring a webpage information set of a website to be detected, wherein the webpage information set comprises at least one of the following types of webpage information: webpage source code, a webpage domain name, webpage pictures and webpage text information;
and the detection module is used for selecting the webpage information to be detected from the webpage information set, determining the detection mode corresponding to the webpage information to be detected, performing corresponding tampering detection and determining a tampering detection result.
In a third aspect, an embodiment of the present invention further provides a computer device, where the computer device includes:
one or more processors;
a memory for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement a method for detecting website intrusion tampering according to any one of the embodiments of the present invention.
In a fourth aspect, the embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements a method for detecting website intrusion tampering according to any one of the embodiments of the present invention.
The embodiment of the invention provides a method, a device, equipment and a storage medium for detecting website intrusion tampering. A webpage information set of a website to be detected is acquired, wherein the webpage information set comprises at least one of the following types of webpage information: webpage source code, a webpage domain name, webpage pictures and webpage text information. Webpage information to be detected is then selected from the webpage information set, the detection mode corresponding to it is determined, the corresponding tampering detection is performed, and a tampering detection result is determined. By acquiring the webpage information set and performing intrusion tampering detection on the webpage information to be detected in it, the security of the website to be detected is safeguarded. Because a suitable detection mode is selected according to the type of the webpage information to be detected, the website is examined from several angles, which improves the accuracy of the detection result.
Drawings
Fig. 1 is a flowchart of a website intrusion tamper detection method according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a website intrusion tamper detection system according to an embodiment of the present invention;
fig. 3 is a flowchart of a website intrusion tamper detection method according to a second embodiment of the present invention;
fig. 4 is a diagram illustrating an implementation example of a website intrusion tamper detection method according to a second embodiment of the present invention;
fig. 5 is a schematic structural diagram of a website intrusion tamper detection device according to a third embodiment of the present invention;
fig. 6 is a schematic structural diagram of a computer device in the fourth embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings. It should be understood that the embodiments described are only a few embodiments of the present application, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims.
In the description of the present application, it is to be understood that the terms "first," "second," "third," and the like are used solely to distinguish one element from another; they do not necessarily describe a particular order or sequence, nor are they to be construed as indicating or implying relative importance. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the context. Further, in the description of the present application, "a plurality" means two or more unless otherwise specified. "And/or" describes the association relationship of the associated objects and covers three cases; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
Example one
Fig. 1 is a flowchart of a website intrusion tampering detection method according to an embodiment of the present application, where the method is suitable for detecting whether a website is intruded and tampered. The method can be performed by a computer device, which can be formed by two or more physical entities or by one physical entity. Generally, the computer device may be a notebook, a desktop computer, a smart tablet, and the like.
Fig. 2 is a schematic structural diagram of a website intrusion tampering detection system provided in this embodiment. The system includes a data source access module 11, a metadata screening module 12, a high-performance detection engine 13, a lightweight message queue 14, a RESTful API interface 15, and an intrusion tampering detection analysis module 16. The data source access module 11 is configured to obtain data of a website 111 to be detected, where the website to be detected may be a key website, a registered (docket) website, or the like. The metadata screening module 12 screens the data; the screening includes data format screening, data validity screening, and data format parsing. The high-performance detection engine 13 performs the data detection, which includes implanted dark chain detection, webpage picture detection and webpage text information detection, and obtains a tampering detection result. Data is processed through the lightweight message queue 14, which includes a message data producer (Producer), message data consumers (Consumers), and a message queue (Message). The tampering detection result is sent through the open RESTful API (Representational State Transfer application programming interface) 15 to the intrusion tampering detection analysis module 16 for analysis of the detection result.
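The producer and consumer roles of the lightweight message queue 14 can be illustrated with a minimal in-process sketch. It assumes Python's standard queue.Queue as a stand-in for the message queue, since the embodiment does not name a queue implementation; run_pipeline and the detect callback are hypothetical names introduced only for illustration.

```python
import queue
import threading

# Unique marker the producer sends to tell the consumer that no more data follows.
_DONE = object()


def run_pipeline(pages, detect):
    """Pass page records from one producer to one consumer through a queue.

    `pages` is any iterable of screened page records and `detect` is a
    callable standing in for the detection engine (both are assumptions).
    Returns the detection results in the order the pages were produced.
    """
    messages = queue.Queue()
    results = []

    def producer():
        for page in pages:
            messages.put(page)
        messages.put(_DONE)

    def consumer():
        while True:
            item = messages.get()
            if item is _DONE:
                break
            results.append(detect(item))

    workers = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

With a single consumer the queue preserves first-in, first-out order, so results line up with the input pages; a deployment with several consumers would need to carry an identifier with each message instead.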
As shown in fig. 1, a method for detecting website intrusion tampering according to this embodiment specifically includes the following steps:
S101, acquiring a webpage information set of a website to be detected, wherein the webpage information set comprises at least one of the following types of webpage information: webpage source code, a webpage domain name, webpage pictures and webpage text information.
In this embodiment, the website to be detected may be understood as a website that needs to be checked for intrusion tampering; in this application it may be any website. The websites to be detected can be preset according to the importance of different websites. When there is more than one website to be detected, the same method is used to perform intrusion tampering detection on each of them. A webpage information set may be understood as a data set consisting of different types of webpage information. A webpage picture may be understood as a picture displayed in a webpage; webpage text information may be understood as the textual content of a webpage, for example Chinese or English words. The webpage information acquired for the website to be detected is webpage source code, a webpage domain name, a webpage picture or webpage text information, and it can be collected by a crawler. The webpage domain name is extracted from the MD5.txt file, and the webpage text information is obtained from the MD5.txt and MD5.html files.
S102, selecting the webpage information to be detected from the webpage information set, determining a detection mode corresponding to the webpage information to be detected, performing corresponding tampering detection, and determining a tampering detection result.
In this embodiment, the webpage information to be detected may be understood as webpage information that needs to be detected. Webpage information comes in different types, and each type needs to be detected in a different manner. The tampering detection result may be understood as the result obtained after intrusion tampering detection, and is either that tampering occurred or that no tampering occurred; when tampering occurred, the result can be represented directly by the tampering type.
Specifically, one item of webpage information is selected from the webpage information set as the webpage information to be detected, and detection is performed on it. Either a single item or multiple items of the set may be detected. When several types of webpage information need to be detected, one type is selected as the webpage information to be detected; after its detection is complete, another type is selected from the set as the new webpage information to be detected, and a suitable detection mode is again selected. When the webpage information to be detected is webpage source code or a webpage domain name, implanted dark chain detection is performed, which comprises regular expression detection, generic second-level domain name detection and website detection. When it is a webpage picture or webpage text information, content detection is performed using machine learning or neural network techniques, for example to detect whether the picture or text contains negative information or negative wording. The tampering detection result is obtained through this tampering detection.
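The selection of a detection mode by information type can be sketched as a simple lookup. This is a minimal illustration, not the embodiment's implementation: the type keys, the mode names and both function names are hypothetical labels chosen here.

```python
# Hypothetical labels: the embodiment names the information types and the two
# families of detection, but not these identifiers.
DETECTION_MODES = {
    "source_code": "implanted_dark_chain_detection",
    "domain_name": "implanted_dark_chain_detection",
    "picture": "content_detection",
    "text": "content_detection",
}


def select_detection_mode(info_type):
    """Return the detection mode for one type of webpage information."""
    try:
        return DETECTION_MODES[info_type]
    except KeyError:
        raise ValueError(f"unsupported webpage information type: {info_type!r}")


def plan_detection(info_types):
    """Pair each selected information type with its detection mode, in order."""
    return [(info_type, select_detection_mode(info_type)) for info_type in info_types]
```

Calling plan_detection with one type models detecting a single item; calling it with several models selecting the items one after another.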
The embodiment of the invention provides a website intrusion tamper detection method. A webpage information set of a website to be detected is acquired, wherein the webpage information set comprises at least one of the following types of webpage information: webpage source code, a webpage domain name, webpage pictures and webpage text information. Webpage information to be detected is then selected from the webpage information set, the detection mode corresponding to it is determined, the corresponding tampering detection is performed, and a tampering detection result is determined. By acquiring the webpage information set and performing intrusion tampering detection on the webpage information to be detected in it, the security of the website to be detected is safeguarded. Because a suitable detection mode is selected according to the type of the webpage information to be detected, the website is examined from several angles, which improves the accuracy of the detection result.
Example two
Fig. 3 is a flowchart of a website intrusion tamper detection method according to a second embodiment of the present invention. The technical solution of this embodiment further refines the foregoing technical solution, and mainly comprises the following steps:
S201, acquiring a webpage information set of the website to be detected.
When the webpage information to be detected is the webpage source code, S202 to S203 are executed to determine a tampering detection result.
S202, acquiring a predetermined regular expression set.
In this embodiment, a regular expression set may be specifically understood as a data set composed of one or more regular expressions.
It should be noted that the principle of detecting the webpage source code is to parse the source code and detect any dark chains in it (that is, hidden links, one of the cheating techniques of black-hat SEO). To detect whether the structure of the webpage source code has been maliciously modified in a way that is not visible in the displayed content, the "dark chains" commonly used in black-hat SEO are detected, and from this it is determined whether the source code structure has been tampered with.
There are three common types of dark chain: the color attribute of the tag is set so that the link is invisible, the tag is positioned so that the link is out of view, and the display attribute of the tag is set so that the link is not displayed. Each dark chain type is detected with a corresponding regular expression. The regular expressions required for detecting the different types of dark chain are determined in advance, assembled into a regular expression set and stored. When intrusion tampering detection is performed on the webpage source code, the regular expression set is obtained directly.
S203, carrying out character string matching detection on the webpage source codes according to the regular expressions in the regular expression set, and determining a tampering detection result.
String matching is performed on the webpage source code using each regular expression in the regular expression set in turn. If any match succeeds, the webpage source code has been tampered with; if no match succeeds, the webpage source code has not been tampered with.
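Steps S202 and S203 can be sketched as follows. The three patterns are illustrative assumptions written for this example, one per dark chain type; the embodiment does not disclose its actual regular expressions, and patterns this simple would produce false positives on real pages (for instance, white text on a dark background is legitimately visible).

```python
import re

# Shared prefix: an opening tag with an inline style attribute (assumed form).
_STYLE_PREFIX = r"""<[a-z][^>]*\bstyle\s*=\s*["'][^"']*"""

# Hypothetical regular expression set, one entry per dark chain type.
DARK_CHAIN_PATTERNS = {
    # Display attribute set so that the element is not rendered.
    "display_hidden": re.compile(_STYLE_PREFIX + r"display\s*:\s*none", re.I),
    # Element pushed far outside the visible area.
    "position_offscreen": re.compile(
        _STYLE_PREFIX + r"(?:top|left|text-indent)\s*:\s*-\d{3,}", re.I
    ),
    # Text color chosen to blend into a white or transparent background.
    "color_invisible": re.compile(
        _STYLE_PREFIX + r"(?<![-\w])color\s*:\s*(?:#fff(?:fff)?|white|transparent)\b",
        re.I,
    ),
}


def detect_dark_chains(source):
    """Return the names of all patterns that match the webpage source code."""
    return [name for name, pattern in DARK_CHAIN_PATTERNS.items() if pattern.search(source)]


def is_source_tampered(source):
    """Treat the source code as tampered with if any pattern matches."""
    return bool(detect_dark_chains(source))
```

Returning the matching pattern names, rather than only a boolean, lets the tampering detection result be reported by tampering type.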
When the webpage information to be detected is the webpage domain name, S204 to S209 or S210 to S212 are executed to determine a tampering detection result.
S204, acquiring and parsing the webpage source code, and determining a webpage hyperlink set.
In this embodiment, a webpage hyperlink set may be understood as a data set formed by all hyperlinks of a webpage, that is, the set of all outbound links. If the webpage information to be detected is a webpage domain name, the intrusion tampering detection can be either generic second-level domain name detection or detection of whether the website domain name has been tampered with, and the two are performed in different ways. Generic second-level domain name detection is performed through steps S204 to S209; detection of whether the website domain name has been tampered with is performed through steps S210 to S212.
Specifically, when generic second-level domain name detection is performed on the webpage domain name, the webpage source code is acquired and parsed to obtain one or more webpage hyperlinks, which form the webpage hyperlink set.
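Step S204 can be sketched with Python's standard html.parser module. This is a minimal illustration under the assumption that hyperlinks are the href values of anchor tags; extract_hyperlinks is a hypothetical name, and the embodiment does not state which parser it uses.

```python
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collect the href value of every anchor tag encountered."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_hyperlinks(source):
    """Parse webpage source code and return its hyperlinks, without duplicates."""
    collector = _LinkCollector()
    collector.feed(source)
    collector.close()
    # dict.fromkeys keeps the first occurrence of each link in document order.
    return list(dict.fromkeys(collector.links))
```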
S205, determining a target secondary domain name according to the web page hyperlink set and the web page domain name.
In this embodiment, the target secondary domain name may be specifically understood as a secondary domain name of the web page hyperlink that does not match the secondary domain name of the web page domain name.
Specifically, a secondary domain name is extracted from each web page hyperlink in the web page hyperlink set and from the web page domain name; the secondary domain names of the hyperlinks are compared one by one with that of the web page domain name, and the target secondary domain name is determined from the comparison results.
As an optional implementation of this embodiment, determining the target secondary domain name according to the web page hyperlink set and the web page domain name is further optimized as:
A1, extracting a secondary domain name from the web page hyperlinks in the web page hyperlink set to obtain at least one hyperlink secondary domain name.
In this embodiment, the hyperlink secondary domain name may be understood as the secondary domain name of a web page hyperlink. A secondary domain name is extracted from each web page hyperlink in the web page hyperlink set to obtain the hyperlink secondary domain names.
A2, extracting a secondary domain name from the webpage domain name to obtain the secondary domain name of the webpage.
In this embodiment, the web page secondary domain name may be understood as the secondary domain name corresponding to the web page domain name. It is obtained by extracting the secondary domain name from the web page domain name.
A3, comparing each hyperlink secondary domain name with the web page secondary domain name respectively.
Each hyperlink secondary domain name is compared with the web page secondary domain name to determine whether the two are the same.
A4, determining the hyperlink secondary domain names whose comparison result is "different" as target secondary domain names.
The hyperlink secondary domain names whose comparison result is "different" are identified, and these are determined to be the target secondary domain names.
S206, counting the number of the target secondary domain names.
S207, judging whether the number is larger than a first preset number threshold value, if so, executing S208; otherwise, S209 is executed.
S208, determining that the tampering detection result is generic second-level domain name tampering.
S209, determining that the tampering detection result is that no tampering occurs.
In this embodiment, the first preset number threshold may be understood as the boundary value for judging whether the number of target secondary domain names is within the normal range, and it may be set as required. The number is compared with the first preset number threshold: when the number is greater than the threshold, the tampering detection result is determined to be generic second-level domain name tampering; when the number is less than or equal to the threshold, the tampering detection result is determined to be that no tampering occurs.
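Steps A1 to A4 and S206 to S209 can be sketched together as follows. Two points are assumptions made for this example rather than statements of the embodiment: the secondary domain name is taken to be the last two labels of the host name (which is wrong for suffixes such as .com.cn, where a public suffix list would be needed), and every mismatching hyperlink is counted, not only distinct domains. All function names are hypothetical.

```python
from urllib.parse import urlsplit


def secondary_domain(host):
    """Return the last two labels of a host name, or "" if there are fewer than two."""
    labels = [label for label in host.lower().rstrip(".").split(".") if label]
    return ".".join(labels[-2:]) if len(labels) >= 2 else ""


def hyperlink_host(link):
    """Return the host of an absolute http(s) link, or "" for anything else."""
    parts = urlsplit(link)
    if parts.scheme not in ("http", "https"):
        return ""
    return parts.hostname or ""


def target_secondary_domains(hyperlinks, page_domain):
    """A1 to A4: collect hyperlink secondary domains that differ from the page's."""
    page_sld = secondary_domain(page_domain)
    targets = []
    for link in hyperlinks:
        link_sld = secondary_domain(hyperlink_host(link))
        if link_sld and link_sld != page_sld:
            targets.append(link_sld)
    return targets


def judge_generic_domain_tampering(hyperlinks, page_domain, threshold):
    """S206 to S209: compare the target count with the first preset number threshold."""
    count = len(target_secondary_domains(hyperlinks, page_domain))
    if count > threshold:
        return "generic second-level domain name tampering"
    return "no tampering"
```

Relative links and non-HTTP schemes such as javascript: are skipped because they carry no host to compare.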
S210, outputting the webpage domain name to a domain name detection platform through a preset webpage safety interface.
In this embodiment, the web page security interface may be understood as an interface used to check a web page for intrusion tampering so as to ensure web page security. The domain name detection platform may be understood as a platform that detects whether a web page domain name has been tampered with; it may also verify whether other functions of the website are working correctly. The web page domain name is output to the domain name detection platform through the web page security interface so that the platform can perform domain name detection.
S211, receiving a domain name detection result returned by the domain name detection platform.
In this embodiment, the domain name detection result may be that the domain name is normal or that the domain name is abnormal. The domain name detection platform detects the web page domain name and verifies whether it is normal.
S212, analyzing the domain name detection result and determining a tampering detection result.
When the domain name detection result is normal, the tampering detection result is that tampering does not occur; and when the domain name detection result is abnormal, the tampering detection result is that the website domain name tampering occurs.
When the webpage information to be detected is the webpage picture, S213 to S215 are executed to determine a tampering detection result.
S213, inputting the webpage picture into a predetermined picture detection network model, wherein the picture detection network model is obtained by training on a detection data set and a classification data set.
In this embodiment, the picture detection network model may be understood as a neural network model for identifying objects present in a picture. Detection data sets (Detection Datasets) have several limitations: they carry few classification labels, they contain fewer pictures than classification data sets, and they are too costly to build to be used as classification data sets. Classification data sets, by contrast, contain a large number of pictures and rich classification information. The present application proposes a new training method, a joint training algorithm: the data of the detection data set and of the classification data set are mixed together, objects are classified using a hierarchical view, and the detection data set is expanded with the large volume of classification data, so that two different kinds of data set are combined. Object detectors (Object Detectors) are trained on both the detection data set and the classification data set; the detection data is used to learn the accurate position of objects, and the classification data is used to increase the number of categories and the robustness of the model. The picture detection network model is obtained by training on the data in the detection data set and the classification data set. The webpage picture is then input into the picture detection network model, which performs prediction on the webpage picture based on what it has learned.
As an optional implementation of this embodiment, the training of the picture detection network model is further optimized, and the training steps of the picture detection network model include:
B1, acquiring a detection data set and a classification data set, wherein each picture to be trained in the detection data set and the classification data set is associated with corresponding standard information, and the standard information comprises standard position information and standard category information.
In this embodiment, a picture to be trained may be understood as a picture used for model training. Standard information may be understood as the information that labels a target in a picture to be trained. For example, if a picture to be trained contains a cat and a stamp, the standard information labeling the cat is: cat, abscissa 30 to 50 pixels, ordinate 40 to 70 pixels. Here "cat" is the standard category information, and the abscissa range of 30 to 50 pixels together with the ordinate range of 40 to 70 pixels is the standard position information. The standard position information may also be expressed in other ways, for example as the coordinates of the left vertex together with the length and width, which determine a rectangular box whose position is the position of the target. The pictures to be trained in the detection data set and the classification data set are labeled in advance, and the data sets are obtained directly when model training is carried out.
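The two ways of expressing standard position information can be illustrated with the cat example. This is a small sketch: the StandardInfo class and its field names are hypothetical, and the conversion assumes the vertex used is the one with the smallest abscissa and ordinate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandardInfo:
    """Label for one target: a category plus pixel ranges on both axes."""

    category: str
    x_min: int
    x_max: int
    y_min: int
    y_max: int

    def as_vertex_and_size(self):
        """Express the same box as (x, y, width, height) from one vertex."""
        return (self.x_min, self.y_min, self.x_max - self.x_min, self.y_max - self.y_min)


# The cat from the example: abscissa 30 to 50 pixels, ordinate 40 to 70 pixels.
cat = StandardInfo("cat", x_min=30, x_max=50, y_min=40, y_max=70)
```

For this label, cat.as_vertex_and_size() gives the vertex (30, 40) with a width of 20 pixels and a height of 30 pixels.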
B2, inputting the picture to be trained corresponding to the current iteration into the current network model to be trained to obtain prediction information, wherein the prediction information comprises prediction position information and prediction category information.
In this embodiment, the network model to be trained may be understood as a deep-learning-based neural network model whose training is not yet complete. The prediction information may be understood as the information obtained by model prediction, and it includes prediction position information and prediction category information.
Specifically, the image to be trained corresponding to the current iteration is input into the current network model to be trained, and the network model to be trained predicts according to the current network parameters, so as to obtain the predicted position information and the predicted category information corresponding to each target in the image to be trained.
And B3, obtaining a corresponding loss function by adopting a given loss function expression and combining the standard information and the prediction information.
In this embodiment, the loss function expression may be understood as an expression for calculating a loss function, and when the network model to be trained is reversely propagated, parameters of the model need to be adjusted through the loss function. The loss function may be a GAN loss function, an L1 loss function, a focal loss function, a VGG percentual loss function, or the like.
Specifically, for each picture to be trained, the loss function expression is applied to the corresponding standard information and prediction information to obtain the corresponding loss function. When one picture contains a plurality of targets, each target has its own standard information and prediction information; the loss function of each target can be calculated in turn, and the resulting plurality of loss functions is then combined into a final loss function, which is used as the loss function of the current iteration.
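The per-target calculation and its combination into one loss per picture can be sketched as follows. This is only an illustration under stated assumptions: the position term uses the L1 loss named above, the category term uses cross-entropy (which the source does not name), and the per-target losses are combined by averaging (the source does not specify the combination rule). All function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def l1_position_loss(pred_box: Sequence[float], std_box: Sequence[float]) -> float:
    """Mean absolute difference between predicted and standard box values."""
    return sum(abs(p - s) for p, s in zip(pred_box, std_box)) / len(std_box)


def category_loss(pred_probs: Dict[str, float], std_category: str) -> float:
    """Cross-entropy for a single labeled category (an assumed choice)."""
    eps = 1e-12  # avoids log(0) when the true category gets zero probability
    return -math.log(max(pred_probs.get(std_category, 0.0), eps))


def target_loss(pred: dict, std: dict) -> float:
    """Loss of one target: position term plus category term."""
    return (l1_position_loss(pred["box"], std["box"])
            + category_loss(pred["probs"], std["category"]))


def picture_loss(preds: List[dict], stds: List[dict]) -> float:
    """Combine the per-target losses of one picture (here: their mean)."""
    if not stds:
        raise ValueError("a picture needs at least one labeled target")
    losses = [target_loss(p, s) for p, s in zip(preds, stds)]
    return sum(losses) / len(losses)


# One picture with two targets: a cat and a stamp.
stds = [{"box": [30, 40, 20, 30], "category": "cat"},
        {"box": [60, 10, 15, 15], "category": "stamp"}]
perfect = [{"box": [30, 40, 20, 30], "probs": {"cat": 1.0}},
           {"box": [60, 10, 15, 15], "probs": {"stamp": 1.0}}]
shifted = [{"box": [32, 40, 20, 30], "probs": {"cat": 0.8, "stamp": 0.2}},
           {"box": [60, 10, 15, 15], "probs": {"stamp": 1.0}}]
```

A prediction that matches the standard information exactly yields a loss of zero, while any offset in position or any probability mass on the wrong category yields a positive loss.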
And B4, performing back propagation on the network model to be trained based on the loss function to obtain the network model to be trained for the next iteration until the iteration convergence condition is met, and obtaining the picture detection network model.
In the training process of the neural network model, the model is continuously updated and adjusted by the back propagation method until the output of the model is consistent with the target. After the loss function is determined, back propagation is performed on the network model to be trained using the loss function, so as to obtain the picture detection network model that meets the convergence condition. The embodiment of the invention does not limit the specific back propagation process, which can be set according to specific conditions. After the model training is finished, the category and the position of an object in a picture can be predicted through the picture detection network model.
S214, determining a target object according to an output result of the picture detection network model.
In this embodiment, the target object may be specifically understood as an object in a web page picture. After the webpage picture is input into the picture detection network model, the picture detection network model carries out prediction processing on the webpage picture according to network parameters to obtain the position of the target object and the category of the target object.
S215, carrying out abnormity detection on the characters to be detected in the target object, and determining a tampering detection result according to the abnormity detection result.
In this embodiment, the characters to be detected may be understood as the characters contained in the target object; for example, when the target object is a stamp, the characters in the stamp are the characters to be detected. The anomaly detection result may be either abnormal characters or normal characters. The characters to be detected in the target object may be abnormal, for example by containing negative information or improper statements. Whether the characters to be detected are abnormal is therefore detected, and the tampering detection result is determined according to the anomaly detection result: when the characters are abnormal, the tampering detection result is that tampering has occurred; when the characters are normal, the tampering detection result is that no tampering occurs.
S216-S219 are executed when the webpage information to be detected is webpage text information.
S216, acquiring the webpage source code and determining the text label in the webpage source code.
In the present embodiment, a text label is a tag used to mark up text when a web page is written in HTML. When detecting whether the webpage text information has been subjected to intrusion tampering, the webpage source code is acquired and the intrusion tampering detection of the webpage text information is performed according to it. The webpage source code is acquired directly and parsed to obtain all text labels in the webpage source code.
And S217, determining a target text according to each text label and the webpage text information.
In this embodiment, the target text may be specifically understood as text screened from the webpage text information. And determining texts in the webpage text information according to the text labels, and further screening the lengths of the texts to obtain target texts meeting the conditions.
As an optional embodiment of this embodiment, this optional embodiment further optimizes determining the target text according to each text label and the webpage text information as follows:
and C1, determining the text length of the text corresponding to each text label in the webpage text information.
In this embodiment, the text length may be specifically understood as the length of data included in the text. And searching a text corresponding to each text label in the webpage text information, and determining the text length of each text.
And C2, determining the target text length meeting the preset length condition in the text lengths.
In this embodiment, the preset length condition is a preset length range, for example, 2 to 20. And sequentially judging whether each text length meets a preset length condition, and if so, determining the text length as the target text length.
And C3, determining the text corresponding to the target text length as the target text.
And determining texts corresponding to the lengths of the target texts, and determining the part of texts as the target texts.
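Steps S216-S217 and C1-C3 can be sketched with the Python standard library. The set of tags treated as text labels and the helper names are assumptions made for illustration; the length range 2 to 20 is the example given above, measured here in characters.

```python
from html.parser import HTMLParser
from typing import List, Tuple

# Assumption: which HTML tags count as "text labels" is not fixed by the source.
TEXT_TAGS = {"title", "h1", "h2", "h3", "p", "a", "span", "li"}
MIN_LEN, MAX_LEN = 2, 20  # preset length condition from the example above


class TextLabelParser(HTMLParser):
    """Collects (tag, text) pairs for text found inside text labels."""

    def __init__(self) -> None:
        super().__init__()
        self._open_tags: List[str] = []
        self.texts: List[Tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag in TEXT_TAGS:
            self._open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching text label, tolerating unclosed tags.
        if tag in self._open_tags:
            while self._open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self._open_tags:
            self.texts.append((self._open_tags[-1], text))


def select_target_texts(source_code: str) -> List[str]:
    """C1-C3: measure each labeled text and keep those within the length range."""
    parser = TextLabelParser()
    parser.feed(source_code)
    parser.close()
    return [text for _, text in parser.texts if MIN_LEN <= len(text) <= MAX_LEN]


page = ("<html><body><h1>Welcome</h1><p>x</p>"
        "<p>Cheap pills here</p><p>" + "a" * 30 + "</p></body></html>")
targets = select_target_texts(page)
```

In this sample, the one-character paragraph and the 30-character paragraph fall outside the preset length range, so only the two remaining texts become target texts.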
S218, carrying out anomaly detection on each target text, and determining an abnormal text.
In this embodiment, the abnormal text may be understood as a text containing abnormal words or abnormal information. Each target text is analyzed to determine whether the information in the target text is abnormal, for example, whether the target text contains excessive sensitive information.
As an optional embodiment of this embodiment, this optional embodiment further optimizes performing anomaly detection on each target text and determining an abnormal text as follows:
and D1, determining the editing distance between the target text and the predetermined abnormal character information base for each target text.
In this embodiment, the abnormal character information base may be understood as an information base composed of abnormal words and abnormal sentences. The edit distance may be understood as the minimum number of editing operations required to convert one string into another string. The permitted editing operations are replacing one character with another, inserting one character, and deleting one character. For each target text, the edit distance between each word or sentence in the target text and the abnormal character information base is calculated, for example by means such as a sequence ratio. The edit distance may also be calculated by means of machine learning modeling.
D2, counting the number of abnormal words with editing distance meeting the preset distance condition.
In this embodiment, the number of abnormal words may be understood as the count of words or sentences in the target text that are judged to be abnormal. The preset distance condition may be understood as a preset distance range, for example, greater than 0.75. The edit distance corresponding to each word or sentence in the target text is compared with the preset distance condition; a word or sentence whose edit distance meets the preset distance condition is an abnormal word, and these abnormal words are counted to obtain the number of abnormal words.
D3, when the number of the abnormal words is larger than a third preset number threshold, determining that the target text is an abnormal text.
In this embodiment, the third preset number threshold may be specifically understood as a number threshold for determining whether the target text is abnormal, and may be preset according to a requirement. Comparing the number of the abnormal words with a third preset number threshold, and when the number of the abnormal words is larger than the third preset number threshold, determining that the target text is an abnormal text and is possibly tampered; and when the number of the abnormal words is less than or equal to a third preset number threshold, determining that the target text is a normal text and is not tampered.
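Steps D1-D3 can be sketched as follows. Because the example condition "greater than 0.75" only makes sense for a normalized score, the sketch assumes that the edit distance is converted into a similarity between 0 and 1 (one minus the distance divided by the longer length); this normalization, the whitespace word splitting, and all function names are assumptions, not the method stated in the source.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum replacements, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # replace ca with cb
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization of the edit distance into the range [0, 1]."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def count_abnormal_words(words: Iterable[str], abnormal_base: List[str],
                         min_similarity: float = 0.75) -> int:
    """D2: count words close enough to any entry of the abnormal base."""
    return sum(1 for w in words
               if any(similarity(w, bad) > min_similarity for bad in abnormal_base))


def is_abnormal_text(target_text: str, abnormal_base: List[str],
                     third_threshold: int = 1,
                     min_similarity: float = 0.75) -> bool:
    """D3: the text is abnormal when the count exceeds the third threshold."""
    words = target_text.lower().split()  # assumption: whitespace-separated words
    return count_abnormal_words(words, abnormal_base, min_similarity) > third_threshold


base = ["casino", "gambling"]  # hypothetical abnormal character information base
flagged = is_abnormal_text("best cas1no and gambl1ng site", base)
clean = is_abnormal_text("welcome to our site", base)
```

Matching on similarity rather than exact equality lets disguised spellings such as "cas1no" still count as abnormal words. Text in languages without whitespace word boundaries, such as Chinese, would need a word segmentation step in place of `split()`.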
S219, judging whether the number of the abnormal texts is larger than a second preset number threshold value, if so, executing S220; otherwise, S221 is executed.
S220, determining that the tampering detection result is webpage tampering.
S221, determining that the tampering detection result is that no tampering occurs.
In this embodiment, the second preset number threshold may be specifically understood as a threshold for determining whether the number of the abnormal texts meets the requirement. The values of the first preset quantity threshold, the second preset quantity threshold and the third preset quantity threshold in the application can be the same or different, and the values can be set according to requirements in practical application. And counting the number of the abnormal texts, and comparing the number of the abnormal texts with the size of a second preset number threshold. And when the number of the abnormal texts is larger than a second preset number threshold, determining that the tampering detection result is webpage tampering.
Common character tampering behaviors are analyzed, and the characteristics of tampered text are extracted to form the abnormal character information base. The webpage text content is modeled using machine learning techniques, so that whether the text has been maliciously tampered with is judged automatically. An early warning mode may also be set, in which an early warning is issued after the webpage text information is tampered with, and the tampered information and a danger degree score are output. Text tampering can occur anywhere on a page and takes many forms; the embodiment of the application can automatically detect the various forms of text tampering.
As an optional embodiment of this embodiment, this optional embodiment further includes: generating an early warning work order according to at least one tampering detection result, and sending the early warning work order to a corresponding user.
In this embodiment, the early warning work order may be understood as a work order used to warn a user that a website has been tampered with, so that the website can be handled in time and its security ensured. The user in the embodiment of the application may be a manager, a maintainer or the like of the website to be detected, and the user corresponding to the website to be detected is selected in advance. When at least one tampering detection result, or more than a preset number of tampering detection results, indicates tampering, an early warning work order is generated and issued to the corresponding user. The early warning work order may include the type of tampering, for example, universal second-level domain name tampering, so that the user can determine the type of tampering in time and carry out the corresponding processing. The work order may be sent to a mailbox, sent to a mobile phone by short message, or sent to a corresponding account through an operating system. Meanwhile, an operation report can be generated according to the tampering detection results.
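The rule for generating an early warning work order can be sketched as below. The data structures, field names and the sample values are hypothetical; the source only states that the work order may carry the type of tampering and is issued once enough tampering detection results indicate tampering.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TamperResult:
    detection_type: str  # e.g. "second_level_domain", "text", "picture"
    tampered: bool


@dataclass
class WorkOrder:
    website: str
    recipient: str
    tamper_types: List[str]


def build_work_order(website: str, recipient: str, results: List[TamperResult],
                     min_tampered: int = 1) -> Optional[WorkOrder]:
    """Return a work order when at least `min_tampered` results report tampering."""
    tamper_types = [r.detection_type for r in results if r.tampered]
    if len(tamper_types) < min_tampered:
        return None  # nothing to warn about
    return WorkOrder(website, recipient, tamper_types)


results = [TamperResult("second_level_domain", True),
           TamperResult("text", False),
           TamperResult("picture", True)]
order = build_work_order("www.example.com", "admin@example.com", results)
no_order = build_work_order("www.example.com", "admin@example.com",
                            [TamperResult("text", False)])
```

Delivery by mailbox, short message or account notification would be a separate step that consumes the returned work order.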
Further, fig. 4 is a diagram illustrating an implementation example of a website intrusion tamper detection method according to an embodiment of the present application.
S301, start.
S302, collecting network data of the website to be detected.
The network data may be collected by a crawler.
S303, acquiring the webpage data file from the network data.
S304, acquiring the webpage source code file from the network data.
S305, acquiring a webpage screenshot file from the network data.
S306, obtaining, according to the webpage source code file and the webpage data file, a webpage URL, a webpage domain name, a webpage label, a webpage keyword, a webpage description, a webpage short text, a short text hyperlink, a webpage text set and a webpage link set; that is, the obtained information includes the webpage source code, the webpage domain name and the webpage text information.
S307, obtaining the webpage URL, the webpage domain name and the webpage link set according to the webpage URL, the webpage domain name, the webpage label, the webpage keyword, the webpage description, the webpage short text, the short text hyperlink, the webpage text set and the webpage link set.
S308, judging whether the URL is valid, and if not, executing S309; otherwise, S310 is performed.
S309, discarding the data.
S310, performing implanted link detection, which includes website detection, universal second-level domain name detection and regular expression detection, and then executing S322.
Website detection means detecting the webpage domain name through a domain name detection platform to obtain a detection result. Universal second-level domain name detection means determining second-level domain names from the webpage hyperlink set and the webpage domain name and detecting them to obtain a detection result. Regular expression detection means detecting the webpage source code with regular expressions to obtain a detection result.
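The universal second-level domain name detection can be sketched as follows. The sketch makes several assumptions that the source does not settle: the second-level domain is taken as the last two labels of the host name (which mishandles suffixes such as `.com.cn` and IP addresses), only `http` and `https` hyperlinks are considered, distinct domains are counted once, and the threshold value is arbitrary. All function names are hypothetical.

```python
from typing import Iterable, Set
from urllib.parse import urlparse


def sld_of_host(host: str) -> str:
    """Assumed definition: the last two labels of a host name."""
    labels = host.lower().strip(".").split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else ""


def sld_of_link(link: str) -> str:
    """Second-level domain of an absolute http(s) hyperlink, else ''."""
    parsed = urlparse(link)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return ""  # relative links, mailto:, javascript: and similar are skipped
    return sld_of_host(parsed.hostname)


def target_second_level_domains(page_domain: str,
                                hyperlinks: Iterable[str]) -> Set[str]:
    """Hyperlink second-level domains that differ from the page's own."""
    page_sld = sld_of_host(page_domain)
    targets = set()
    for link in hyperlinks:
        sld = sld_of_link(link)
        if sld and sld != page_sld:
            targets.add(sld)
    return targets


def is_domain_tampered(page_domain: str, hyperlinks: Iterable[str],
                       first_threshold: int = 2) -> bool:
    """Tampering is reported when the count exceeds the first preset threshold."""
    return len(target_second_level_domains(page_domain, hyperlinks)) > first_threshold


links = ["https://news.example.com/a", "/about", "mailto:admin@example.com",
         "http://abc.bad-site.cn/x", "http://xyz.other.net/", "http://q.third.org"]
found = target_second_level_domains("www.example.com", links)
```

A production implementation would more likely resolve the registrable domain against a public suffix list instead of taking the last two labels.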
S311, obtaining the short text and the short text hyperlink of the webpage according to the URL of the webpage, the domain name of the webpage, the label of the webpage, the keyword of the webpage, the description of the webpage, the short text hyperlink, the text set of the webpage and the link set of the webpage.
S312, judging whether the short text of the webpage and the short text hyperlink contain Chinese, if so, executing S313; otherwise, S314 is executed.
And determining the text length of the text corresponding to the text label in the webpage text information under the condition that the webpage short text and the short text hyperlink contain Chinese.
S313, judging whether the text length meets a preset length condition, if not, executing S314; otherwise, S315 is performed.
And determining the target text length of which the text length meets the preset length condition, and determining the text corresponding to the target text length as the target text.
And S314, discarding the data.
S315, carrying out abnormity detection on the target text and determining an abnormal text.
S316, comparing the number of the abnormal texts with a second preset number threshold value to obtain a tampering detection result, and executing S322.
And S317, acquiring a webpage picture according to the webpage screenshot file.
S318, judging whether the webpage picture is effective or not, and if not, executing S319; otherwise, S320 is performed.
And S319, discarding the data.
And S320, detecting the webpage picture.
The web page picture may be detected through a picture detection network model.
S321, a tamper detection result is obtained, and S322 is executed.
And S322, summarizing the tampering detection results.
S323, outputting the detection result, and executing S324 and S325 respectively.
And S324, generating an operation report.
And S325, generating and issuing an early warning work order.
And S326, ending.
The embodiment of the invention provides a website intrusion tamper detection method in which a webpage information set of a website to be detected is obtained, the webpage information set comprising at least one of the following webpage information: webpage source code, a webpage domain name, a webpage picture and webpage text information; the webpage information to be detected is selected from the webpage information set, a detection mode corresponding to the webpage information to be detected is determined, the corresponding tampering detection is performed, and a tampering detection result is determined. By acquiring the webpage information set and performing intrusion tampering detection on the webpage information to be detected in it, the security of the website to be detected is ensured. A suitable detection mode is selected according to the information type of the webpage information to be detected, and the website to be detected is examined from different angles, which improves the accuracy of the detection result. In the detection process, the picture detection network model is obtained by training on the detection data set and the classification data set, which ensures the accuracy of position prediction, allows the number of categories to be increased, improves the robustness of the model, and improves the accuracy of intrusion tampering detection.
Example three
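Selecting a detection mode by information type can be sketched as a dispatch table. Everything in the sketch is illustrative: the type names, the single regular expression standing in for the predetermined regular expression set, and the keyword lookup standing in for the text anomaly detection are all assumptions, and the picture detector is omitted.

```python
import re
from typing import Callable, Dict, List

# Hypothetical stand-in for the predetermined regular expression set:
# an anchor element hidden with an inline "display:none" style.
HIDDEN_LINK_PATTERNS: List[re.Pattern] = [
    re.compile(r"<a\b[^>]*style\s*=\s*[\"'][^\"']*display\s*:\s*none", re.IGNORECASE),
]


def detect_source_code(source: str) -> bool:
    """String matching detection of the webpage source code."""
    return any(p.search(source) for p in HIDDEN_LINK_PATTERNS)


def detect_text(text: str) -> bool:
    """Placeholder for text anomaly detection (simple keyword lookup)."""
    return any(word in text.lower() for word in ("casino", "gambling"))


# Detection mode for each information type; unknown types are skipped.
DETECTORS: Dict[str, Callable[[str], bool]] = {
    "source_code": detect_source_code,
    "text": detect_text,
}


def detect_website(info_set: Dict[str, str]) -> Dict[str, bool]:
    """Run the detector matching each piece of webpage information."""
    results = {}
    for info_type, info in info_set.items():
        detector = DETECTORS.get(info_type)
        if detector is not None:
            results[info_type] = detector(info)
    return results


info_set = {
    "source_code": '<a href="http://x.example" style="display:none">x</a>',
    "text": "Welcome to our school",
    "picture": "screenshot.png",  # no picture detector registered in this sketch
}
results = detect_website(info_set)
website_tampered = any(results.values())
```

Keeping the detectors in a table means a new information type can be supported by registering one more function, without changing the selection logic.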
Fig. 5 is a schematic structural diagram of a website intrusion tamper detection device according to a third embodiment of the present invention, where the device includes: an information set acquisition module 41 and a detection module 42.
The information set acquisition module 41 is configured to obtain a webpage information set of a website to be detected, where the webpage information set comprises at least one of the following webpage information: webpage source code, a webpage domain name, a webpage picture and webpage text information;
the detection module 42 is configured to select the web page information to be detected from the web page information set, determine a detection mode corresponding to the web page information to be detected, perform corresponding tamper detection, and determine a tamper detection result.
The embodiment of the invention provides a website intrusion tamper detection device in which a webpage information set of a website to be detected is obtained, the webpage information set comprising at least one of the following webpage information: webpage source code, a webpage domain name, a webpage picture and webpage text information; the webpage information to be detected is selected from the webpage information set, a detection mode corresponding to the webpage information to be detected is determined, the corresponding tampering detection is performed, and a tampering detection result is determined. By acquiring the webpage information set and performing intrusion tampering detection on the webpage information to be detected in it, the security of the website to be detected is ensured. A suitable detection mode is selected according to the information type of the webpage information to be detected, and the website to be detected is examined from different angles, which improves the accuracy of the detection result.
Further, when the web page information to be detected is a web page source code, the detecting module 42 includes:
the expression acquisition unit is used for acquiring a predetermined regular expression set;
and the matching detection unit is used for carrying out character string matching detection on the webpage source codes according to the regular expressions in the regular expression set and determining a tampering detection result.
Further, when the information of the web page to be detected is a domain name of the web page, the detecting module 42 includes:
the hyperlink determining unit is used for acquiring and analyzing the webpage source codes and determining a webpage hyperlink set;
the second-level domain name determining unit is used for determining a target second-level domain name according to the webpage hyperlink set and the webpage domain name;
the quantity determining unit is used for counting the quantity of the target secondary domain names;
the second-level domain name detection unit is used for judging whether the number is larger than a first preset number threshold value or not, and if so, determining that the tampering detection result is the second-level domain name tampering; otherwise, determining that the tampering detection result is that no tampering occurs.
Further, the second-level domain name determining unit is specifically configured to: extract a second-level domain name from each webpage hyperlink in the webpage hyperlink set to obtain at least one hyperlink second-level domain name; extract a second-level domain name from the webpage domain name to obtain a webpage second-level domain name; compare each hyperlink second-level domain name with the webpage second-level domain name; and determine each hyperlink second-level domain name that differs from the webpage second-level domain name as a target second-level domain name.
Further, when the information of the web page to be detected is a domain name of the web page, the detecting module 42 includes:
the domain name output unit is used for outputting the webpage domain name to a domain name detection platform through a preset webpage safety interface;
the detection result receiving unit is used for receiving the domain name detection result returned by the domain name detection platform;
and the detection result analysis unit is used for analyzing the domain name detection result and determining a tampering detection result.
Further, when the web page information to be detected is a web page picture, the detecting module 42 includes:
the model input unit is used for inputting the webpage picture into a predetermined picture detection network model, and the picture detection network model is obtained by training according to a detection data set and a classification data set;
the model output unit is used for determining a target object according to an output result of the picture detection network model;
and the abnormality detection unit is used for carrying out abnormality detection on the characters to be detected in the target object and determining a tampering detection result according to an abnormality detection result.
Further, the apparatus further comprises:
the data set acquisition module is used for acquiring a detection data set and a classification data set, wherein the detection data set and the classification data set are associated with standard information corresponding to the pictures to be trained, and the standard information comprises standard position information and standard category information;
the prediction information determining module is used for inputting the corresponding picture to be trained under the current iteration into the current network model to be trained to obtain prediction information, and the prediction information comprises prediction position information and prediction category information;
the loss function determining module is used for acquiring a corresponding loss function by adopting a given loss function expression and combining the standard information and the prediction information;
and the back propagation module is used for carrying out back propagation on the network model to be trained based on the loss function to obtain the network model to be trained for the next iteration until an iteration convergence condition is met, and obtaining the picture detection network model.
Further, when the web page information to be detected is web page text information, the detecting module 42 includes:
the label determining unit is used for acquiring the webpage source code and determining the text labels in the webpage source code;
the target text determining unit is used for determining a target text according to each text label and the webpage text information;
an abnormal text determining unit, configured to perform abnormal detection on each target text, and determine an abnormal text;
the tampering detection unit is used for judging whether the number of the abnormal texts is larger than a second preset number threshold value or not, and if so, determining that a tampering detection result is webpage tampering; otherwise, determining that the tampering detection result is that no tampering occurs.
Further, the target text determining unit is specifically configured to determine a text length of a text corresponding to each text label in the web page text information; determining a target text length which meets a preset length condition in the text lengths; and determining the text corresponding to the target text length as the target text.
Further, the abnormal text determining unit is specifically configured to determine, for each target text, an editing distance between the target text and a predetermined abnormal character information base; counting the number of abnormal words of which the editing distance meets a preset distance condition; and when the number of the abnormal words is larger than a third preset number threshold, determining that the target text is an abnormal text.
Further, the apparatus further comprises:
and the work order sending module is used for generating an early warning work order according to at least one tampering detection result and sending the early warning work order to a corresponding user.
The website intrusion tamper detection device provided by the embodiment of the invention can execute the website intrusion tamper detection method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 6 is a schematic structural diagram of a computer apparatus according to a fourth embodiment of the present invention, as shown in fig. 6, the apparatus includes a processor 50, a memory 51, an input device 52, and an output device 53; the number of processors 50 in the device may be one or more, and one processor 50 is taken as an example in fig. 6; the processor 50, the memory 51, the input device 52 and the output device 53 in the apparatus may be connected by a bus or other means, as exemplified by the bus connection in fig. 6.
The memory 51 is a computer-readable storage medium, and can be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the website intrusion tamper detection method in the embodiment of the present invention (for example, the information set acquisition module 41 and the detection module 42 in the website intrusion tamper detection device). The processor 50 executes various functional applications and data processing of the device by executing software programs, instructions and modules stored in the memory 51, so as to implement the above-mentioned website intrusion tamper detection method.
The memory 51 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 51 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 51 may further include memory located remotely from the processor 50, which may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 52 is operable to receive input numeric or character information and to generate key signal inputs relating to user settings and function controls of the apparatus. The output device 53 may include a display device such as a display screen.
Example five
An embodiment of the present invention further provides a storage medium containing computer-executable instructions, where the computer-executable instructions are executed by a computer processor to perform a method for detecting website intrusion tampering, and the method includes:
acquiring a webpage information set of a website to be detected, wherein the webpage information set comprises at least one of the following webpage information: webpage source code, a webpage domain name, a webpage picture and webpage text information;
and selecting the webpage information to be detected from the webpage information set, determining a detection mode corresponding to the webpage information to be detected, performing corresponding tampering detection, and determining a tampering detection result.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the operations of the method described above, and may also perform related operations in the website intrusion tamper detection method provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the website intrusion tamper detection apparatus, each unit and each module included in the embodiment are only divided according to functional logic, but are not limited to the above division, as long as the corresponding function can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (14)

1. A website intrusion tamper detection method is characterized by comprising the following steps:
acquiring a webpage information set of a website to be detected, wherein the webpage information set comprises at least one of the following webpage information: webpage source code, a webpage domain name, a webpage picture and webpage text information;
and selecting the webpage information to be detected from the webpage information set, determining a detection mode corresponding to the webpage information to be detected, performing corresponding tampering detection, and determining a tampering detection result.
2. The method according to claim 1, wherein when the web page information to be detected is a web page source code, determining a detection mode corresponding to the web page information to be detected and performing corresponding tamper detection, and determining a tamper detection result, includes:
acquiring a predetermined regular expression set;
and performing character string matching detection on the webpage source code according to the regular expressions in the regular expression set, and determining a tampering detection result.
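The string matching step of claim 2 can be illustrated with a minimal sketch. The patterns in `SUSPICIOUS_PATTERNS` and the function name `detect_source_tampering` are hypothetical; the claims do not disclose the contents of the predetermined regular expression set, so the three expressions below are assumptions chosen only for illustration.

```python
import re

# Hypothetical regular expression set (assumption: the patent does not
# list its actual patterns). Each entry targets one common defacement trace.
SUSPICIOUS_PATTERNS = [
    re.compile(r"<iframe[^>]*\bwidth\s*=\s*[\"']?0\b", re.IGNORECASE),   # zero-width iframe
    re.compile(r"document\.write\s*\(\s*unescape\s*\(", re.IGNORECASE),  # obfuscated script
    re.compile(r"hacked\s+by", re.IGNORECASE),                           # defacement banner
]


def detect_source_tampering(source, patterns=SUSPICIOUS_PATTERNS):
    """Return the pattern strings that match the webpage source code.

    An empty list is read here as "no tampering found by this check".
    """
    return [p.pattern for p in patterns if p.search(source)]


clean = "<html><body><p>Welcome</p></body></html>"
defaced = ("<html><body><h1>Hacked by someone</h1>"
           "<iframe src='x' width=0></iframe></body></html>")

print(detect_source_tampering(clean))
print(detect_source_tampering(defaced))
```

Returning the matched patterns rather than a bare boolean keeps enough detail to explain a positive result, for example when an early warning work order is generated later.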
3. The method according to claim 1, wherein when the web page information to be detected is a web page domain name, determining a detection mode corresponding to the web page information to be detected and performing corresponding tamper detection, and determining a tamper detection result, comprises:
acquiring and analyzing a webpage source code, and determining a webpage hyperlink set;
determining a target secondary domain name according to the webpage hyperlink set and the webpage domain name;
counting the number of the target secondary domain names;
judging whether the number is larger than a first preset number threshold, and if so, determining that the tampering detection result is secondary domain name tampering; otherwise, determining that the tampering detection result is that no tampering has occurred.
4. The method of claim 3, wherein determining the target secondary domain name from the set of web page hyperlinks and the web page domain name comprises:
extracting a secondary domain name from the web page hyperlinks in the web page hyperlink set to obtain at least one hyperlink secondary domain name;
extracting a secondary domain name from the webpage domain name to obtain a webpage secondary domain name;
comparing each hyperlink secondary domain name with the webpage secondary domain name respectively;
and determining each hyperlink secondary domain name that differs from the webpage secondary domain name as a target secondary domain name.
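The steps of claims 3 and 4 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the names `LinkCollector`, `secondary_domain` and `detect_domain_tampering` are hypothetical, the secondary domain name is approximated naively as the last two labels of the host name (a production system would likely consult a public suffix list), and the sketch counts distinct target secondary domain names, which the claims leave open.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags from the webpage source code."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def secondary_domain(host):
    """Naive assumption: the secondary domain is the last two host labels."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def detect_domain_tampering(source, page_domain, threshold):
    """Return (tampered, targets) for the page source and its domain name."""
    collector = LinkCollector()
    collector.feed(source)
    page_sld = secondary_domain(page_domain)
    targets = set()
    for link in collector.links:
        host = urlparse(link).hostname
        if host is None:          # relative link, stays on the same site
            continue
        sld = secondary_domain(host)
        if sld != page_sld:       # differs from the webpage secondary domain
            targets.add(sld)
    return len(targets) > threshold, targets


page = """
<a href="http://a.evil.com/">x</a>
<a href="https://b.bad.net/promo">y</a>
<a href="http://news.example.com/">news</a>
<a href="/about">about</a>
"""
tampered, targets = detect_domain_tampering(page, "www.example.com", threshold=1)
print(tampered, sorted(targets))
```

The threshold plays the role of the first preset number threshold; a page that links to a few external sites is not flagged until the count exceeds it.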
5. The method according to claim 1, wherein when the web page information to be detected is a web page domain name, determining a detection mode corresponding to the web page information to be detected and performing corresponding tamper detection, and determining a tamper detection result, comprises:
outputting the webpage domain name to a domain name detection platform through a preset webpage safety interface;
receiving a domain name detection result returned by the domain name detection platform;
and analyzing the domain name detection result to determine a tampering detection result.
6. The method according to claim 1, wherein when the web page information to be detected is a web page picture, determining a detection mode corresponding to the web page information to be detected and performing corresponding tamper detection, and determining a tamper detection result, comprises:
inputting the webpage picture into a predetermined picture detection network model, wherein the picture detection network model is obtained by training according to a detection data set and a classification data set;
determining a target object according to an output result of the picture detection network model;
and carrying out anomaly detection on the characters to be detected in the target object, and determining a tampering detection result according to an anomaly detection result.
7. The method of claim 6, wherein the step of training the picture detection network model comprises:
acquiring a detection data set and a classification data set, wherein the pictures to be trained in the detection data set and the classification data set are correspondingly associated with standard information, and the standard information comprises standard position information and standard category information;
inputting a corresponding picture to be trained under current iteration into a current network model to be trained to obtain prediction information, wherein the prediction information comprises prediction position information and prediction category information;
obtaining a corresponding loss value by applying a given loss function expression to the standard information and the prediction information;
and performing back propagation on the network model to be trained based on the loss function to obtain the network model to be trained for the next iteration until an iteration convergence condition is met, and obtaining a picture detection network model.
8. The method according to claim 1, wherein when the web page information to be detected is web page text information, determining a detection mode corresponding to the web page information to be detected and performing corresponding tamper detection, and determining a tamper detection result, comprises:
acquiring a webpage source code and determining a text label in the webpage source code;
determining a target text according to each text label and webpage text information;
carrying out anomaly detection on each target text to determine an abnormal text;
judging whether the number of the abnormal texts is larger than a second preset number threshold value or not, and if so, determining that the tampering detection result is webpage tampering; otherwise, determining that the tampering detection result is that no tampering occurs.
9. The method of claim 8, wherein determining a target text based on each of the text labels and web page text information comprises:
determining the text length of the text corresponding to each text label in the webpage text information;
determining a target text length which meets a preset length condition in the text lengths;
and determining the text corresponding to the target text length as the target text.
10. The method according to claim 8, wherein the performing anomaly detection on each target text to determine an anomalous text comprises:
for each target text, determining the edit distance between the target text and a predetermined abnormal character information base;
counting the number of abnormal words whose edit distance meets a preset distance condition;
and when the number of the abnormal words is larger than a third preset number threshold, determining that the target text is an abnormal text.
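Claims 8 to 10 together describe a length filter, an edit distance comparison and two count thresholds. The sketch below shows one way these could fit together. All names and default values are hypothetical: the preset length condition is assumed to be a minimum length, the preset distance condition is assumed to be "edit distance at most `max_distance`", and texts are split on whitespace (Chinese text would need a word segmenter instead).

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def is_abnormal_text(text, abnormal_words, max_distance=1, word_threshold=1):
    """Claim 10 sketch: count tokens close to any entry of the abnormal base."""
    hits = 0
    for token in text.lower().split():
        if any(edit_distance(token, w) <= max_distance for w in abnormal_words):
            hits += 1
    return hits > word_threshold  # third preset number threshold


def detect_text_tampering(texts, abnormal_words, min_length=10, text_threshold=0):
    """Claims 8 and 9 sketch: filter by length, then count abnormal texts."""
    targets = [t for t in texts if len(t) >= min_length]
    abnormal = [t for t in targets if is_abnormal_text(t, abnormal_words)]
    return len(abnormal) > text_threshold  # second preset number threshold


abnormal_base = ["casino", "lottery"]  # hypothetical abnormal character information base
texts = ["Welcome to our company site",
         "Best cas1no and lotery bonus here",
         "ok"]
print(detect_text_tampering(texts, abnormal_base))
```

Using an edit distance rather than exact matching lets the check catch lightly disguised spellings such as `cas1no`, at the cost of occasional false hits on short words, which is one reason the count thresholds matter.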
11. The method of any one of claims 1-10, further comprising:
and generating an early warning work order according to at least one tampering detection result, and sending the early warning work order to a corresponding user.
12. A website intrusion tamper detection device, comprising:
the information set acquisition module is used for acquiring a webpage information set of a website to be detected, wherein the webpage information set comprises at least one of the following webpage information: webpage source code, webpage domain name, webpage pictures and webpage text information;
and the detection module is used for selecting the webpage information to be detected from the webpage information set, determining the detection mode corresponding to the webpage information to be detected, performing corresponding tampering detection and determining a tampering detection result.
13. A computer device, the device comprising:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the website intrusion tampering detection method according to any one of claims 1-11.
14. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the website intrusion tampering detection method according to any one of claims 1-11.
CN202111361696.7A 2021-11-17 2021-11-17 Website intrusion tampering detection method, device, equipment and storage medium Pending CN114117299A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111361696.7A CN114117299A (en) 2021-11-17 2021-11-17 Website intrusion tampering detection method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111361696.7A CN114117299A (en) 2021-11-17 2021-11-17 Website intrusion tampering detection method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114117299A true CN114117299A (en) 2022-03-01

Family

ID=80396074

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111361696.7A Pending CN114117299A (en) 2021-11-17 2021-11-17 Website intrusion tampering detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114117299A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115396237A (en) * 2022-10-27 2022-11-25 浙江鹏信信息科技股份有限公司 Webpage malicious tampering identification method and system and readable storage medium
WO2024051017A1 (en) * 2022-09-08 2024-03-14 江苏省未来网络创新研究院 Distributed website tampering detection system and method

Similar Documents

Publication Publication Date Title
US20210034819A1 (en) Method and device for identifying a user interest, and computer-readable storage medium
CN104156490A (en) Method and device for detecting suspicious fishing webpage based on character recognition
CN102446255B (en) Method and device for detecting page tamper
CN110909531B (en) Information security screening method, device, equipment and storage medium
CN114117299A (en) Website intrusion tampering detection method, device, equipment and storage medium
CN111488623A (en) Webpage tampering detection method and related device
CN103605691A (en) Device and method used for processing issued contents in social network
CN111309910A (en) Text information mining method and device
CN103605690A (en) Device and method for recognizing advertising messages in instant messaging
CN115618371A (en) Desensitization method and device for non-text data and storage medium
US10878186B1 (en) Content masking attacks against information-based services and defenses thereto
CN112818200A (en) Data crawling and event analyzing method and system based on static website
CN104036190A (en) Method and device for detecting page tampering
CN110334180B (en) Mobile application security evaluation method based on comment data
KR102282025B1 (en) Method for automatically sorting documents and extracting characters by using computer
Wang et al. Validating multimedia content moderation software via semantic fusion
CN111859862B (en) Text data labeling method and device, storage medium and electronic device
CN108460049B (en) Method and system for determining information category
CN111125704B (en) Webpage Trojan horse recognition method and system
KR20240013640A (en) Method for detecting harmful url
CN111488452A (en) Webpage tampering detection method, detection system and related equipment
CN115186240A (en) Social network user alignment method, device and medium based on relevance information
CN111563276B (en) Webpage tampering detection method, detection system and related equipment
CN113688346A (en) Illegal website identification method, device, equipment and storage medium
CN111581950A (en) Method for determining synonym and method for establishing synonym knowledge base

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination