CN113495836A

CN113495836A - Page detection method and device for page detection

Info

Publication number: CN113495836A
Application number: CN202010264953.4A
Authority: CN
Inventors: 张静军
Original assignee: Beijing Sogou Technology Development Co Ltd
Current assignee: Beijing Sogou Technology Development Co Ltd
Priority date: 2020-04-03
Filing date: 2020-04-03
Publication date: 2021-10-12

Abstract

The embodiments of the invention provide a page detection method, a page detection apparatus, and a device for page detection. The method comprises: modifying the Cascading Style Sheet (CSS) of a first page according to a preset abnormal type to obtain a second page containing an abnormal element; matching a first picture corresponding to the first page with a second picture corresponding to the second page, and determining the position information of the abnormal element in the second picture; generating labeling information corresponding to the second picture according to the abnormal type and the position information corresponding to the abnormal element; training an anomaly detection model according to a batch of second pictures and the labeling information corresponding to the second pictures; and performing anomaly detection, through the anomaly detection model, on page pictures acquired in real time. The embodiments of the invention can improve the efficiency and accuracy of detecting UI abnormalities in pages.

Description

Page detection method and device for page detection

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a page detection method and apparatus, and an apparatus for page detection.

Background

With the rapid development of the internet, people's daily life, such as shopping, reading, entertainment, etc., can be completed easily on the internet. Because people are increasingly unable to leave the network, the usability and stability of the website are important.

In order to ensure the usability and stability of a website, it is necessary to detect a page of the website, for example, to detect whether a UI (User Interface) abnormality exists, so as to ensure that the abnormality of the website page is found in time and reduce the influence on a User.

Currently, there is no mature technical solution for detecting UI abnormalities, and detection is usually performed manually, for example by writing test cases to achieve test coverage. In practice, however, UI abnormalities have many causes, such as page code problems or browser compatibility problems. Because UI abnormality types are numerous and the scenarios are complex, manually written test cases are difficult to apply, and both the efficiency and the accuracy of UI abnormality detection are low.

Disclosure of Invention

The embodiment of the invention provides a page detection method and device and a page detection device, which can improve the efficiency and accuracy of UI abnormity detection.

In order to solve the above problem, an embodiment of the present invention discloses a page detection method, where the method includes:

according to a preset abnormal type, modifying a Cascading Style Sheet (CSS) of a first page to obtain a second page containing abnormal elements;

matching a first picture corresponding to the first page with a second picture corresponding to the second page, and determining the position information of the abnormal element in the second picture;

generating marking information corresponding to the second picture according to the abnormal type and the position information corresponding to the abnormal element;

training an anomaly detection model according to the second pictures in batches and the labeling information corresponding to the second pictures;

and carrying out anomaly detection on the page picture acquired in real time through the anomaly detection model.

On the other hand, the embodiment of the invention discloses a page detection device, which comprises:

the style modification module is used for modifying the Cascading Style Sheet (CSS) of the first page according to the preset abnormal type to obtain a second page containing the abnormal elements;

the position determining module is used for matching a first picture corresponding to the first page with a second picture corresponding to the second page and determining the position information of the abnormal element in the second picture;

the label generation module is used for generating label information corresponding to the second picture according to the abnormal type and the position information corresponding to the abnormal element;

the model training module is used for training the anomaly detection model according to the batch of second pictures and the labeling information corresponding to the second pictures;

and the online detection module is used for carrying out anomaly detection on the page picture acquired in real time through the anomaly detection model.

In yet another aspect, an embodiment of the present invention discloses an apparatus for page detection, including a memory, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for:

In yet another aspect, an embodiment of the invention discloses a machine-readable medium having stored thereon instructions, which, when executed by one or more processors, cause an apparatus to perform a page detection method as described in one or more of the preceding.

The embodiment of the invention has the following advantages:

according to the embodiment of the invention, the CSS of the first page without the UI abnormity is modified according to the preset abnormity type, and a second page with the UI abnormity is obtained; and matching a first picture corresponding to the first page with a second picture corresponding to the second page, so as to determine the position information of the abnormal element in the second picture. Based on the modification process, a batch of training samples can be formed by the aid of an automatic process. Each training sample may include a second picture and labeling information of the second picture, where the labeling information includes an abnormal type and position information corresponding to an abnormal element in the second picture. Through the batch training samples, an abnormity detection model can be obtained through training, and further abnormity detection can be carried out on the page pictures acquired in real time through the abnormity detection model.

In the process of constructing the training sample, the CSS of the real page is modified according to the preset abnormal type obtained by pre-analyzing and summarizing, and the second page containing the abnormal elements of the preset abnormal type is obtained. On the basis, the training sample is constructed, so that the constructed training sample is more consistent with a real scene, and the accuracy of the abnormity detection model for identifying the UI abnormity can be improved.

In addition, the process of constructing the training samples can be realized through an automatic process, the identification accuracy can be continuously improved in the training process according to the abnormity detection model obtained by training a large number of training samples, UI abnormity in the page can be automatically detected by using the abnormity detection model after the training is finished, the problems of low testing efficiency caused by writing a large number of test cases, insufficient coverage of the test cases, insufficient detection accuracy and the like can be avoided, and the efficiency and the accuracy of the UI abnormity detection of the page can be improved.

Drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a flowchart illustrating the steps of an embodiment of a page detection method of the present invention;

FIG. 2 is a block diagram of a page detection apparatus according to an embodiment of the present invention;

FIG. 3 is a block diagram of an apparatus 800 for page detection of the present invention; and

FIG. 4 is a schematic diagram of a server in some embodiments of the invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Method embodiment

Referring to fig. 1, a flowchart illustrating steps of an embodiment of a page detection method according to the present invention is shown, which may specifically include the following steps:

step 101, according to a preset abnormal type, modifying a CSS (Cascading Style Sheets) of a first page to obtain a second page containing abnormal elements;

step 102, matching a first picture corresponding to the first page with a second picture corresponding to the second page, and determining position information of the abnormal element in the second picture;

step 103, generating labeling information corresponding to the second picture according to the abnormal type and the position information corresponding to the abnormal element;

step 104, training an anomaly detection model according to the batch of second pictures and the labeling information corresponding to the second pictures;

and step 105, carrying out anomaly detection on the page picture acquired in real time through the anomaly detection model.

The page detection method provided by the embodiment of the invention can be used for detecting whether the UI abnormity exists in the page of the website, and if the UI abnormity exists, the abnormal type of the UI abnormity and the position information of the UI abnormity in the page can be identified. Therefore, UI abnormal information can be recorded, and corresponding abnormal error information data can be generated, so that technicians can adjust or optimize the website pages according to the data.

The page detection method of the embodiment of the invention can be applied to electronic equipment, and the electronic equipment comprises but is not limited to: a server, a smart phone, a recording pen, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, a car computer, a desktop computer, a set-top box, a smart tv, a wearable device, and the like.

In order to improve the efficiency and accuracy of UI anomaly detection, the embodiment of the invention can acquire log data of various UI anomalies actually generated on line in advance, and analyze and summarize the log data to obtain a common preset UI anomaly type. And aiming at various preset abnormal types, a training sample is constructed, and then an abnormal detection model can be trained according to the training sample. In practical application, whether UI (user interface) abnormity exists in a page can be automatically detected through the abnormity detection model. Furthermore, the problems that the test efficiency is low due to the fact that a large number of test cases are compiled, the detection is not accurate enough due to the fact that the test cases are not completely covered can be solved, and the efficiency and the accuracy of the page UI abnormity detection can be improved.

In an optional embodiment of the present invention, the preset abnormal type may include at least any one of the following: text overlap (also called font overlap), abnormal line folding, and abnormal coverage.

Text overlap means that one line of text in the page overlaps the line of text above or below it. Abnormal line folding means that content which should be displayed on a single line in the page is split across two lines. Abnormal coverage means that part of the content of some elements in the page is occluded by other elements.

It is to be understood that the three abnormal types described above are merely examples. The embodiment of the present invention does not limit the preset abnormal types; for example, the abnormal types may further include an image drawing abnormality in the page, a button in the page that is not fully expanded, an input box in the page that is displayed too small, and the like. For convenience of description, the embodiment of the present invention takes the text overlap abnormal type as an example; the processing of the other abnormal types is similar, and the descriptions may be referred to one another.

The CSS provides a style rule for an HTML (HyperText Markup Language) page, and defines a display manner of each element in the page. According to the embodiment of the invention, the CSS of the first page is modified according to the preset abnormal type, so that the second page containing the abnormal element can be obtained, and the abnormal element conforms to the preset abnormal type. The first page refers to a normal page without UI abnormality, and the second page refers to an abnormal page containing UI abnormality obtained after CSS modification of the first page. For example, for a first page, the CSS of the first page is modified according to the exception type of font overlap, and the obtained second page includes an exception element, and the UI exception type of the exception element is font overlap.

In an optional embodiment of the present invention, the step 101 modifies the cascading style sheet CSS of the first page according to a preset exception type to obtain a second page including an exception element, which may specifically include:

step S11, randomly determining a target element in the CSS of the first page;

and step S12, modifying the style attribute value of the target element according to the preset abnormal type, and modifying the target element into the abnormal element of the abnormal type to obtain a second page.

The style rules of CSS may be used to describe elements in a web page, and a style rule consists of one or more style attributes and their values. Style attributes may include font, font size, background color, line spacing, line height, and the like. According to the embodiment of the invention, the target element can be modified into the abnormal element by modifying the style attribute value of the target element in the first page.

In a specific application, a div tag is usually used to define a block or a region in an HTML page, and such a block or region is called a div element. div elements are often used together with CSS to lay out web pages, with the CSS formatting the div elements. The elements in the page mentioned in the embodiment of the present invention may refer to div elements in an HTML page.

Since a page usually contains a large number of div elements, the embodiment of the present invention may randomly select one div element as a target element in the CSS of the first page. And modifying the style attribute value of the target element according to a preset abnormal type so as to modify the target element into an abnormal element conforming to the abnormal type, and obtaining a second page. For example, the style attribute value of the target element is modified according to the abnormal type of the overlapping fonts, for example, the line spacing of the target element is reduced, and the target element can be modified into the abnormal element of the overlapping fonts, so as to obtain the second page.
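Steps S11 and S12 can be sketched as follows. This is only a minimal illustration, assuming that the candidate div elements are already available as a list of CSS selectors and that the second page is produced by appending an overriding rule to the style sheet of the first page. The function names, the selector list, and the specific property values in ABNORMAL_OVERRIDES are hypothetical and are not specified by the embodiment.

```python
import random

# Hypothetical mapping from a preset abnormal type to a CSS override that
# tends to provoke it; the property values are illustrative assumptions.
ABNORMAL_OVERRIDES = {
    "text_overlap": {"line-height": "0.4"},        # shrink the line spacing
    "abnormal_line_folding": {"width": "40%"},     # force content to wrap
    "abnormal_coverage": {"margin-top": "-20px"},  # pull the element over its neighbour
}


def pick_target(selectors, rng):
    """Step S11: randomly choose one target element (identified by a CSS selector)."""
    return rng.choice(selectors)


def build_override_rule(selector, abnormal_type):
    """Step S12: build a CSS rule that overrides the target's style attribute values."""
    declarations = ABNORMAL_OVERRIDES[abnormal_type]
    body = " ".join(f"{prop}: {value} !important;" for prop, value in declarations.items())
    return f"{selector} {{ {body} }}"


def make_second_page_css(first_page_css, selectors, abnormal_type, rng=None):
    """Return (target selector, CSS of the second page)."""
    rng = rng or random.Random()
    target = pick_target(selectors, rng)
    rule = build_override_rule(target, abnormal_type)
    # Appending the rule keeps the original style sheet intact above it.
    return target, first_page_css + "\n" + rule + "\n"
```

Whether a given override actually produces the intended abnormality depends on the page, which is why the later comparison step is needed to discard failed modifications.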

In an optional embodiment of the invention, the method may further comprise: before modifying the CSS of a first page, acquiring a first picture corresponding to the first page; and after modifying the CSS of the first page to obtain a second page containing abnormal elements, acquiring a second picture corresponding to the second page.

The first picture may be a page screenshot of a first page, and the first picture does not include an abnormal element of the UI abnormality. The second picture may be a page screenshot of a second page, and the second picture includes an abnormal element which is obtained by modifying the target element of the first page and accords with a preset UI abnormal type.

After a first picture corresponding to the first page and a second picture corresponding to the second page are obtained, the first picture and the second picture may be matched to determine the position information of the abnormal element in the second picture.

In an optional embodiment of the present invention, the step 102 of matching a first picture corresponding to the first page with a second picture corresponding to the second page, and determining the position information of the abnormal element in the second picture may specifically include:

step S21, identifying structural elements in the first picture and the second picture respectively by adopting an edge detection technology;

step S22, comparing the structural elements in the first picture and the second picture one by one, and determining that the unmatched structural elements at the corresponding positions in the second picture and the first picture are abnormal elements;

step S23, matching the region image corresponding to the abnormal element with the second picture, and determining the position information of the abnormal element in the second picture.

Edge detection is a fundamental problem in image processing and computer vision, and the purpose of edge detection is to identify points in an image where changes are significant. Significant changes in the image typically reflect significant events and changes in attributes such as discontinuities in depth, surface orientation discontinuities, material attribute changes, and scene lighting changes. The image can be divided into a plurality of areas through edge detection, and pixel points in the same area have the same or similar characteristics.

For an HTML page, it is typically composed of a large number of div elements. From an image perspective, there are typically discontinuities in depth, scene lighting variations, and other structurally significant variations between different div elements. Therefore, the structural elements described in the embodiments of the present invention refer to div elements in the first page and the second page, and the structural elements in the first picture and the second picture can be identified and obtained by performing edge detection on the first picture and the second picture according to the display area of each div element in the first page and the second page in the first picture and the second picture, where the structural elements are the div elements in the first page and the second page.

Therefore, the embodiment of the present invention may divide the first picture and the second picture into a plurality of regions, where each region corresponds to one structural element (e.g., div element). The edge detection algorithm may use an operator such as Canny (Canny) or Sobel (Sobel), which is not limited in this embodiment of the present invention.
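As an illustration of the Sobel operator mentioned above, the following sketch computes a binary edge map for a grayscale picture. It is a simplified pure-Python example, assuming the picture is given as a list of rows of grayscale values; the function name and the threshold value are assumptions, and a practical implementation would more likely rely on an image processing library.

```python
def sobel_edges(gray, threshold=128):
    """Return a binary edge map (same size as `gray`) using the Sobel operator.

    `gray` is a list of rows of grayscale values (0-255). Border pixels are
    left as non-edges for simplicity. The threshold is an assumed value.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal and vertical Sobel responses on the 3x3 neighbourhood.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            magnitude = (gx * gx + gy * gy) ** 0.5
            edges[y][x] = 1 if magnitude >= threshold else 0
    return edges
```

Pixels marked 1 lie where the grayscale value changes sharply, which is where the boundaries between neighbouring div regions are expected to appear.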

Optionally, if the first picture and the second picture are color pictures, graying processing may be performed on them before edge detection to obtain grayscale images. A grayscale image is independent of color, so a point where the grayscale value changes abruptly within its neighborhood can be taken as an edge, which improves the accuracy of edge detection. A down-sampling rate may then be determined adaptively according to the size of the grayscale image, and the grayscale image is down-sampled at that rate; performing edge detection on the down-sampled image improves the efficiency of edge detection.
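The graying and adaptive down-sampling described above might look like the following sketch. The embodiment does not state how the down-sampling rate is derived from the picture size; the rule used here (the smallest integer rate that brings the longer side within an assumed max_side of 1024 pixels) and all function names are hypothetical. The grayscale weights are the commonly used ITU-R BT.601 luma coefficients.

```python
def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to grayscale with the BT.601 luma weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row] for row in rgb]


def choose_rate(width, height, max_side=1024):
    """Pick the smallest integer rate that brings the longer side within max_side.

    max_side is an assumed limit; the embodiment only says the rate is adaptive.
    """
    longer = max(width, height)
    return max(1, -(-longer // max_side))  # ceiling division


def downsample(gray, rate):
    """Keep every `rate`-th pixel in both directions."""
    return [row[::rate] for row in gray[::rate]]
```

For example, under these assumptions a 1920 x 1080 screenshot would be down-sampled at rate 2, while an 800 x 600 screenshot would be left at its original size.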

After the structural elements in the first picture and the second picture are respectively identified through edge detection, the structural elements in the two pictures can be compared one by one, and the structural element which is not matched with the structural element at the corresponding position in the first picture in the second picture is determined to be an abnormal element. Thereby, an abnormal element can be found in the second picture.

In an optional embodiment of the present invention, after the step S22 compares the structural elements in the first picture and the second picture one by one, the method may further include:

and if the structural elements in the second picture are determined to be completely matched with the structural elements in the first picture, or if the number of the structural elements in the second picture which are not matched with the structural elements in the first picture is determined to exceed a preset value, determining that the second picture is a failed picture.

In a specific application, the modification may fail after the CSS of the first page is modified. For example, a target element is randomly determined in the CSS of the first page, and the style attribute value of the target element is modified according to a preset abnormal type (e.g., font overlap) to obtain a second page. If the style attribute value is changed only slightly, the target element in the second page may not exhibit font overlap. In that case, after the structural elements in the first picture and the second picture are compared one by one, all structural elements in the second picture match those in the first picture and no unmatched structural element exists; that is, the page did not visibly change after the CSS was modified, which indicates that the modification failed. The embodiment of the invention determines such a second picture to be a failed picture, which can be discarded, and the CSS of the first page can then be modified again.

In a particular application, there may be another case where the modification fails. For example, a target element is randomly determined in the CSS of the first page, and the style attribute value of the target element is modified according to a preset exception type (e.g., font overlap). Since the target element is randomly determined, the target element may be some div element in the outer structure of the HTML page. The div elements in the HTML page are generally nested layer by layer, the change of the div elements of the inner structure may have a small influence on the overall structure of the page, and the change of the div elements of the outer structure may have a large influence on the overall structure of the page. For example, a change in the style attribute value of a certain div element may cause the positions of other div elements to be misaligned, thereby affecting the overall structure of the page. Alternatively, if the difference of the style attribute value modification of the target element is large, the display of other div elements may be affected.

Thus, after comparing the structural elements in the first picture and the second picture one by one, it may be found that there are a large number of structural elements in the second picture that do not match the first picture. This situation also indicates that the current modification fails, and in the embodiment of the present invention, the second picture is determined as a failed picture, and the failed picture may be discarded, and the CSS of the first page is modified again.

Optionally, in the process of comparing the structural elements in the first picture and the second picture one by one, the unmatched structural elements may be counted, and if the number of structural elements in the second picture that do not match the structural elements in the first picture exceeds a preset value, the second picture is determined to be a failed picture.
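The comparison of step S22 and the two failure cases can be sketched as follows, assuming each structural element has already been reduced to a bounding box (x, y, width, height) by edge detection. The pixel tolerance, the preset value max_unmatched, and the function names are assumptions made for illustration only.

```python
def unmatched_elements(first_boxes, second_boxes, tolerance=2):
    """Return the boxes of the second picture that have no counterpart at the
    corresponding position in the first picture.

    Boxes are (x, y, width, height); `tolerance` is an assumed pixel slack.
    """
    def same(a, b):
        return all(abs(p - q) <= tolerance for p, q in zip(a, b))

    return [box for box in second_boxes if not any(same(box, ref) for ref in first_boxes)]


def is_failed_picture(unmatched, max_unmatched=1):
    """A second picture fails if nothing changed or if too many elements changed.

    max_unmatched stands in for the preset value mentioned in the text.
    """
    return len(unmatched) == 0 or len(unmatched) > max_unmatched
```

With max_unmatched set to 1, a second picture is kept only when exactly one structural element differs from the first picture, which corresponds to the case where a single target element is modified each time.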

In order to avoid the situation of modification failure as much as possible, optionally, in the embodiment of the present invention, when the target element is randomly determined in the CSS of the first page, the target element may be determined in a div element of the inner layer structure of the first page. Therefore, the situation that other div elements are misplaced due to modification of the outer layer div element can be reduced as much as possible.

In addition, the number of target elements modified each time may be limited, for example, only one target element in the CSS of the first page may be modified each time, so as to reduce the probability of failure in modification.

After finding the abnormal element in the second picture, the position information of the abnormal element in the second picture needs to be determined. Specifically, a screenshot may be performed on an area where the abnormal element is located, so as to obtain an area image corresponding to the abnormal element. By matching the area image corresponding to the abnormal element with the second picture, the position information of the abnormal element in the second picture can be determined.

In the embodiment of the present invention, template matching may be performed on the region image corresponding to the abnormal element and the image corresponding to the second picture. Template matching is a technique for finding a given template image T (i.e., the region image corresponding to the abnormal element) in a source image S (i.e., the image corresponding to the second picture). The principle is to measure the similarity Similarity(S, T) between the two images according to a similarity criterion. Through template matching, the region image corresponding to the abnormal element can be located in the second picture, and the resulting position information can be the coordinates of the abnormal element in the second picture, such as the coordinates of the upper left corner of the abnormal element in the second picture.
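A minimal sketch of template matching is given below. The embodiment does not name a similarity criterion; the sum of squared differences is used here as one assumed choice, and the function name is hypothetical. Both images are assumed to be lists of rows of grayscale values.

```python
def match_template(source, template):
    """Return the (x, y) upper-left coordinate where `template` best fits `source`.

    Similarity is measured by the sum of squared differences (an assumed
    criterion): the lower the score, the more similar the two regions.
    """
    src_h, src_w = len(source), len(source[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best_score, best_xy = None, None
    for y in range(src_h - tpl_h + 1):
        for x in range(src_w - tpl_w + 1):
            score = sum(
                (source[y + dy][x + dx] - template[dy][dx]) ** 2
                for dy in range(tpl_h)
                for dx in range(tpl_w)
            )
            if best_score is None or score < best_score:
                best_score, best_xy = score, (x, y)
    return best_xy
```

The exhaustive search is written out for clarity only; image processing libraries such as OpenCV provide optimized template matching routines that would normally be preferred for full-size screenshots.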

After the position information of the abnormal element in the second picture is determined, the labeling information corresponding to the second picture can be generated according to the abnormal type and the position information corresponding to the abnormal element, so as to obtain a training sample for training the abnormal detection model.

In an optional embodiment of the present invention, step 103 generates, according to the abnormal type and the position information corresponding to the abnormal element, the tagging information corresponding to the second picture, which may specifically include: and writing the abnormal type and the position information corresponding to the abnormal element into a preset file to generate a label file corresponding to the second picture.

Specifically, after determining the position information of the abnormal element in the second picture, the abnormal type and the position information corresponding to the abnormal element may be written into a preset file, and a markup file corresponding to the second picture is generated, where the markup file is used to record markup information corresponding to the second picture, and the markup information includes the abnormal type and the position information corresponding to the abnormal element in the second picture. The preset file may include a json file or an xml file.

Therefore, a second picture and a label file corresponding to the second picture can be used as a training sample, and the label file comprises the abnormal type and the position information corresponding to the abnormal element in the second picture.
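For illustration, a json label file for one second picture could be produced as in the sketch below. The embodiment only requires that the abnormal type and the position information be recorded; the field names, the inclusion of width and height, and the function names are assumptions.

```python
import json


def build_annotation(picture_name, abnormal_elements):
    """Build the labeling information for one second picture.

    `abnormal_elements` is a list of (abnormal_type, x, y, width, height),
    where (x, y) is the upper-left corner. The field names are hypothetical.
    """
    return {
        "picture": picture_name,
        "abnormal_elements": [
            {"type": t, "x": x, "y": y, "width": w, "height": h}
            for (t, x, y, w, h) in abnormal_elements
        ],
    }


def write_annotation(path, annotation):
    """Write the labeling information to a json label file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(annotation, f, ensure_ascii=False, indent=2)
```

A training sample is then the pair formed by the second picture and the label file written for it.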

In an optional embodiment of the present invention, the step 104 of training the anomaly detection model according to the batch of second pictures and the label information corresponding to the second pictures may specifically include:

step S31, taking the second picture and a label file corresponding to the second picture as training samples;

step S32, constructing a training sample set according to the batch of second pictures and the label files corresponding to the second pictures;

and step S33, training an abnormal detection model according to the training sample set.

According to the embodiment of the invention, a large number of first pages can be collected, and the steps 101 to 103 are automatically executed on the collected first pages, so that batch second pictures and labeling information corresponding to the second pictures can be automatically acquired.

It should be noted that the embodiment of the present invention does not limit the specific manner of modifying the CSS of the first page during the automatic execution of steps 101 to 103. In one example of the present invention, it is assumed that n abnormal types are preset and m first pages are collected, where n and m are positive integers. Steps 101 to 103 can be executed automatically for the m first pages. For example, for the i-th first page (1 ≤ i ≤ m), one abnormal type can be selected in turn from the n abnormal types and used to modify the CSS of the i-th first page, so as to obtain the i-th second page. That is, the 1st abnormal type is selected to modify the CSS of the 1st first page to obtain the 1st second page; the 2nd abnormal type is selected to modify the CSS of the 2nd first page to obtain the 2nd second page; and so on, until all m first pages have been modified and m second pages are obtained.
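The sequential selection in this example can be sketched as a simple pairing of pages and abnormal types. The text does not say what happens when m is greater than n; the sketch assumes that the selection wraps around to the first abnormal type again, and the function name is hypothetical.

```python
def assign_abnormal_types(pages, abnormal_types):
    """Pair the i-th first page with an abnormal type chosen in turn.

    Wrapping around when there are more pages than abnormal types is an
    assumption; the example in the text only describes selecting in sequence.
    """
    n = len(abnormal_types)
    return [(page, abnormal_types[i % n]) for i, page in enumerate(pages)]
```

Each resulting pair can then be passed through steps 101 to 103 to produce one second picture and its labeling information.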

As another example, n threads may also be created, each thread corresponding to an exception type. And modifying the CSS of the n first pages according to the n abnormal types by the n threads at the same time, and further generating second pages corresponding to the n abnormal types at the same time until the modification of the m first pages is completed to obtain m second pages.
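The multi-threaded variant might be organized as in the following sketch, which creates one worker thread per abnormal type. How the m first pages are divided among the threads is not specified; the sketch assumes that page i is handled by the thread for abnormal type i mod n, and the modify callback stands in for the CSS modification of step 101. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def modify_pages_for_type(abnormal_type, pages, modify):
    """Worker for one abnormal type: modify each first page assigned to it."""
    return [modify(page, abnormal_type) for page in pages]


def run_in_parallel(pages, abnormal_types, modify):
    """Create one worker thread per abnormal type and collect the second pages."""
    n = len(abnormal_types)
    # Page i goes to the thread for abnormal type i % n (an assumed split).
    shares = [pages[k::n] for k in range(n)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [
            pool.submit(modify_pages_for_type, t, share, modify)
            for t, share in zip(abnormal_types, shares)
        ]
        # Results are gathered per thread, so the order follows the abnormal types.
        return [second for f in futures for second in f.result()]
```

Because every first page is assigned to exactly one thread, m first pages still yield m second pages.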

It should be noted that, in the embodiments of the present invention, an example is taken in which one target element is determined in the first page, and the one target element is modified into an exception element of a certain preset exception type. In practical application, the embodiment of the present invention does not limit the number of target elements that need to be modified in the first page, and does not limit the type and number of preset exception types modified by a certain target element.

In one example of the present invention, for a first page A1, assume that the target element is determined to be element div1 and that the abnormal type to be applied is text overlap. The style attribute value of div1 in the CSS of the first page A1 is modified so that div1 becomes an abnormal element with text overlap, and a second page A2 is obtained, where A2 includes one abnormal element whose abnormal type is text overlap.

In another example of the present invention, for a first page A, assume that the target elements are determined to include elements div1 and div2, and that the abnormal type to be applied is text overlap. The style attribute values of div1 and div2 in the CSS of the first page A are modified so that div1 and div2 both become abnormal elements with text overlap, and a second page A3 is obtained, where A3 includes two abnormal elements, both of which have the abnormal type text overlap.

In yet another example of the present invention, for first page A, assume that the target elements therein are determined to include elements div1 and div2, and that the anomaly types to be modified are literal overlap and anomaly break. And modifying style attribute values of div1 and div2 in the CSS of the first page A to enable div1 to be modified into abnormal elements with overlapped characters and div2 to be modified into abnormal elements with abnormal folding to obtain a second page A4, wherein the A4 comprises two abnormal elements, and the abnormal types of the two abnormal elements are character overlapping and abnormal folding respectively.
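The examples above can be sketched as appending an overriding rule to the page's CSS. The selectors `#div1` and `#div2`, the mapping `EXCEPTION_STYLES`, and the particular declarations chosen to provoke each exception are all assumptions for illustration; the specification does not state which style attribute values are changed.

```python
from typing import Dict

# Hypothetical mapping from exception type to CSS declarations that tend to
# provoke it; the concrete values are illustrative assumptions.
EXCEPTION_STYLES: Dict[str, str] = {
    "text_overlap": "line-height: 0.4em;",
    "abnormal_line_break": "width: 3em; word-break: break-all;",
}


def inject_exception(css: str, selector: str, exception_type: str) -> str:
    """Return a copy of the CSS with an overriding rule for the target element."""
    declarations = EXCEPTION_STYLES[exception_type]
    # A later rule with the same selector overrides the earlier declarations.
    return f"{css}\n{selector} {{ {declarations} }}\n"


css_a1 = "#div1 { line-height: 1.5; }\n#div2 { width: 200px; }"

# Two target elements with two different exception types.
css_a4 = inject_exception(css_a1, "#div1", "text_overlap")
css_a4 = inject_exception(css_a4, "#div2", "abnormal_line_break")
```

Appending a rule rather than editing the existing one keeps the original stylesheet intact, which makes it easy to render the first and second pages from the same source.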

Each time a second page is obtained by automatically modifying the CSS of a first page according to a preset exception type, the first picture corresponding to the first page and the second picture corresponding to the second page can be matched automatically to determine the position information of the abnormal element in the second picture. The exception type and position information corresponding to the abnormal element can then be used as the labeling information of the second picture and written automatically into a preset file. In this way, the embodiment of the invention can automatically obtain second pictures in batches together with their labeling information, construct a training sample set from them, and train the anomaly detection model. Each training sample in the training sample set comprises a second picture and the labeling information corresponding to that second picture; the labeling information may be stored as a JSON file or an XML file that records the exception type and position information of each abnormal element in the second picture.
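As an illustration of the labeling information, the sketch below serializes the exception type and position of each abnormal element to JSON. The field names (`picture`, `anomalies`, `box`) and the (x, y, width, height) pixel convention are assumptions; the specification only says that the type and position are recorded.

```python
import json
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # assumed (x, y, width, height) in pixels


def build_annotation(picture_name: str, elements: List[Tuple[str, Box]]) -> str:
    """Serialize exception types and positions for one second picture."""
    record = {
        "picture": picture_name,
        "anomalies": [
            {"type": exception_type,
             "box": {"x": x, "y": y, "width": w, "height": h}}
            for exception_type, (x, y, w, h) in elements
        ],
    }
    return json.dumps(record, ensure_ascii=False, indent=2)


annotation = build_annotation(
    "A4.png",
    [("text_overlap", (10, 20, 200, 40)),
     ("abnormal_line_break", (10, 80, 60, 120))],
)
```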

The anomaly detection model can be obtained by supervised training of an existing neural network on a large number of training samples using a machine learning method. It should be noted that the embodiment of the present invention does not limit the model structure or the training method of the anomaly detection model. The anomaly detection model may be a classification model that incorporates a variety of neural networks. The neural network includes, but is not limited to, at least one of the following, or a combination, superposition, or nesting of at least two of them: CNN (Convolutional Neural Network), LSTM (Long Short-Term Memory) network, RNN (Recurrent Neural Network), attention neural network, and the like. In an alternative embodiment of the invention, the anomaly detection model may comprise a YOLO-based neural network model.

First, the embodiment of the invention can construct and initialize an anomaly detection model based on the YOLO model structure and set the model parameters of the initial model. The training samples are then input into the initial model one by one, and the initial model is iteratively optimized with a gradient descent algorithm according to the difference between its output and the labeling information in the training samples, adjusting the model parameters at each iteration. Iterative optimization stops when the optimized model meets a preset condition, and the model obtained from the last optimization is taken as the trained anomaly detection model.
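The shape of this training loop can be illustrated with a deliberately tiny stand-in. The sketch below is not a YOLO model: it fits a single parameter w so that w * x approximates the label y. The learning rate, the tolerance used as the preset stop condition, and the epoch limit are all assumed values.

```python
from typing import List, Tuple


def train(
    samples: List[Tuple[float, float]],
    learning_rate: float = 0.05,
    tolerance: float = 1e-6,
    max_epochs: int = 1000,
) -> float:
    """Fit w in y = w * x by per-sample gradient descent on squared error."""
    w = 0.0  # initial model parameter
    for _ in range(max_epochs):
        total_loss = 0.0
        for x, y in samples:  # samples are fed in one by one
            error = w * x - y  # difference between output and label
            total_loss += error * error
            # Gradient of the squared error with respect to w is 2 * error * x.
            w -= learning_rate * 2 * error * x
        # Preset stop condition (assumed): mean squared error below tolerance.
        if total_loss / len(samples) < tolerance:
            break
    return w


w = train([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
```

In a real detector the single parameter is replaced by the network weights and the squared error by a detection loss, but the structure is the same: compare output with label, update the parameters, and stop on a preset condition.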

It should be noted that the embodiment of the present invention does not limit the version of the YOLO model. For example, any one of YOLOv1, YOLOv2, and YOLOv3 may be used. Preferably, the embodiment of the present invention uses the model structure of YOLOv3. YOLOv3 is the third version of the YOLO series of object detection algorithms; compared with the earlier versions, its accuracy is notably improved when detecting small targets.

It should be noted that, in the embodiment of the present invention, a training sample set for a single anomaly type may be constructed, and an anomaly detection model for detecting that UI anomaly type may be trained. For example, a training sample set consisting of training samples of the font overlap anomaly type may be constructed, and the anomaly detection model trained on it may be used to detect font overlap UI anomalies in a page. Alternatively, a training sample set consisting of training samples of the abnormal line break anomaly type may be constructed, and the anomaly detection model trained on it may be used to detect abnormal line break UI anomalies in a page, and so on.

Alternatively, the embodiment of the present invention may construct a training sample set covering multiple anomaly types and train an anomaly detection model that detects all of them. For example, if the constructed training sample set covers 5 anomaly types, the trained anomaly detection model may output, for each abnormal element, the probability values of the 5 anomaly types and the position information of the abnormal element.

In an alternative embodiment of the present invention, the second pictures in the training sample set may be preprocessed before the anomaly detection model is trained on the training sample set. The preprocessing may include binarization, which sets the gray value of each pixel in the image to either 0 or 255, so that the whole image takes on a distinct black-and-white appearance. Binarization greatly reduces the amount of data in the image and highlights the outline of each element in the second picture. Training the anomaly detection model on the preprocessed training sample set can further improve the recognition accuracy of the model.
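Binarization as described above can be sketched in a few lines. The grayscale image is assumed to be a list of rows of integers in the range 0 to 255, and the fixed threshold of 128 is an assumption; the specification does not state how the threshold is chosen.

```python
from typing import List


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Map every gray value to 0 (below threshold) or 255 (at or above it)."""
    return [[255 if pixel >= threshold else 0 for pixel in row] for row in gray]


binary = binarize([[0, 127, 128], [200, 255, 50]])
```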

In an optional embodiment of the present invention, performing anomaly detection on the page picture acquired in real time through the anomaly detection model in step 105 may specifically include:

step S41, acquiring a page to be detected in real time;

step S42, taking a screenshot of the page to be detected to obtain a page picture corresponding to the page to be detected;

and step S43, inputting the page picture into the trained anomaly detection model, and outputting, through the anomaly detection model, the anomaly type and position information corresponding to the abnormal element in the page picture.

After the anomaly detection model is trained, it can be used to detect whether a UI anomaly exists in a page. If a UI anomaly exists, the anomaly detection model can output the anomaly type of each abnormal element and the position of that abnormal element in the page.

Specifically, a page to be detected can be acquired in real time and a screenshot taken of it to obtain the corresponding page picture; the page picture is then input into the trained anomaly detection model, which outputs the anomaly type and position information of the UI anomaly corresponding to each abnormal element in the page picture.

Optionally, after pages are detected online in real time with the anomaly detection model, the embodiment of the invention can collect the anomaly detection results. An anomaly detection result comprises a detected abnormal page containing a UI anomaly, together with the anomaly type and position information corresponding to the abnormal elements in that page. The anomaly detection results can then be used as training samples to continue training the anomaly detection model, so that the model is continuously optimized.
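Steps S41 to S43 can be sketched as a small pipeline. In this sketch `take_screenshot` and `model` are hypothetical callables supplied by the caller (for example a browser automation tool and a trained detector), and the `min_score` filter is an assumption that does not appear in the specification.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Detection:
    anomaly_type: str
    box: Tuple[int, int, int, int]  # assumed (x, y, width, height)
    score: float


def detect_page(
    url: str,
    take_screenshot: Callable[[str], bytes],
    model: Callable[[bytes], List[Detection]],
    min_score: float = 0.5,
) -> List[Detection]:
    """Screenshot the page (S42) and run the trained model on it (S43)."""
    picture = take_screenshot(url)
    detections = model(picture)
    # Assumption: low-confidence outputs are discarded before reporting.
    return [d for d in detections if d.score >= min_score]


def fake_screenshot(url: str) -> bytes:
    return b"fake-picture-bytes"  # stub standing in for a real screenshot


def fake_model(picture: bytes) -> List[Detection]:
    # Stub standing in for the trained anomaly detection model.
    return [
        Detection("font_overlap", (10, 20, 200, 40), 0.91),
        Detection("abnormal_line_break", (10, 80, 60, 120), 0.30),
    ]


found = detect_page("https://example.com", fake_screenshot, fake_model)
```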

To sum up, the embodiment of the invention modifies, according to a preset exception type, the CSS of a first page that has no UI anomaly to obtain a second page that has a UI anomaly, and matches the first picture corresponding to the first page with the second picture corresponding to the second page to determine the position information of the abnormal element in the second picture. Based on this modification process, training samples can be generated in batches by an automated procedure. Each training sample may include a second picture and the labeling information of the second picture, where the labeling information includes the anomaly type and position information corresponding to the abnormal element in the second picture. An anomaly detection model can be trained on the batch of training samples and then used to perform anomaly detection on page pictures acquired in real time.

It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or a combination of acts, but those skilled in the art will recognize that the present invention is not limited by the described order of acts, as some steps may be performed in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the acts involved are not necessarily required by the invention.

Device embodiment

Referring to fig. 2, a block diagram of a structure of an embodiment of a page detection apparatus of the present invention is shown, where the apparatus may specifically include:

the style modification module 201 is configured to modify a cascading style sheet CSS of the first page according to a preset exception type to obtain a second page including an exception element;

a position determining module 202, configured to match a first picture corresponding to the first page with a second picture corresponding to the second page, and determine position information of the abnormal element in the second picture;

the label generating module 203 is configured to generate label information corresponding to the second picture according to the abnormal type and the position information corresponding to the abnormal element;

the model training module 204 is used for training an abnormal detection model according to the batch of second pictures and the labeling information corresponding to the second pictures;

and the online detection module 205 is configured to perform anomaly detection on the page picture acquired in real time through the anomaly detection model.

Optionally, the style modification module 201 may specifically include:

the target determination submodule is used for randomly determining a target element in the CSS of the first page;

and the target modification submodule is used for modifying the style attribute value of the target element according to the preset abnormal type, modifying the target element into the abnormal element of the abnormal type and obtaining a second page.

Optionally, the position determining module 202 may specifically include:

the edge detection sub-module is used for respectively identifying the structural elements in the first picture and the second picture by adopting an edge detection technology;

the element matching sub-module is used for comparing the structural elements in the first picture and the second picture one by one and determining that the unmatched structural elements at the corresponding positions in the second picture and the first picture are abnormal elements;

and the template matching submodule is used for matching the area image corresponding to the abnormal element with the second picture and determining the position information of the abnormal element in the second picture.
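The element matching step can be sketched as a comparison of bounding boxes. The sketch assumes that edge detection has already reduced each picture to a list of (x, y, width, height) boxes, and that two elements match when every coordinate differs by at most a small pixel tolerance; both the representation and the tolerance are assumptions.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # assumed (x, y, width, height) in pixels


def boxes_match(a: Box, b: Box, tolerance: int = 2) -> bool:
    """Two boxes match when every coordinate differs by at most `tolerance`."""
    return all(abs(p - q) <= tolerance for p, q in zip(a, b))


def find_abnormal_elements(
    first_boxes: List[Box], second_boxes: List[Box], tolerance: int = 2
) -> List[Box]:
    """Return the boxes in the second picture with no match in the first."""
    return [
        box
        for box in second_boxes
        if not any(boxes_match(box, ref, tolerance) for ref in first_boxes)
    ]


first = [(0, 0, 100, 20), (0, 30, 100, 20)]
second = [(0, 0, 100, 20), (0, 22, 100, 20)]  # second element shifted upward
abnormal = find_abnormal_elements(first, second)
```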

Optionally, the apparatus may further include:

and the failure picture determining module is used for determining that the second picture is a failure picture if the structural elements in the second picture are all matched with the structural elements in the first picture or if the number of the structural elements in the second picture which are not matched with the structural elements in the first picture is determined to exceed a preset value.
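The failure-picture check reduces to a simple predicate on the number of unmatched structural elements. The function name and the example preset value of 3 are assumptions for illustration.

```python
def is_failure_picture(unmatched_count: int, max_unmatched: int) -> bool:
    """A second picture fails if nothing changed or too much changed."""
    if unmatched_count < 0:
        raise ValueError("unmatched_count must be non-negative")
    # No unmatched element: the CSS change produced no visible anomaly.
    # More than the preset value: the change disturbed too much of the layout.
    return unmatched_count == 0 or unmatched_count > max_unmatched


checks = [is_failure_picture(count, max_unmatched=3) for count in (0, 1, 3, 4)]
```

Pictures flagged this way would be left out of the training sample set, since their labeling information cannot be trusted.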

Optionally, the label generating module 203 is specifically configured to write the exception type and the position information corresponding to the abnormal element into a preset file, and generate a label file corresponding to the second picture.

Optionally, the model training module 204 may specifically include:

the sample determination submodule is used for taking the second picture and the label file corresponding to the second picture as training samples;

the set construction sub-module is used for constructing a training sample set according to the batch second pictures and the label files corresponding to the second pictures;

and the model training submodule is used for training an anomaly detection model according to the training sample set.

Optionally, the online detection module 205 may specifically include:

the page acquisition submodule is used for acquiring a page to be detected in real time;

the image acquisition sub-module is used for carrying out screenshot on the page to be detected to obtain a page image corresponding to the page to be detected;

and the model detection submodule is used for inputting the page picture into the trained anomaly detection model and outputting, through the anomaly detection model, the anomaly type and position information corresponding to the abnormal element in the page picture.

Optionally, the anomaly detection model comprises a YOLO-based neural network model.

Optionally, the preset exception type at least includes any one of the following: font overlap, exception crawling, and exception masking.

Since the device embodiment is basically similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.

The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

An embodiment of the present invention provides an apparatus for page detection, comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs configured to be executed by one or more processors include instructions for: according to a preset abnormal type, modifying a Cascading Style Sheet (CSS) of a first page to obtain a second page containing abnormal elements; matching a first picture corresponding to the first page with a second picture corresponding to the second page, and determining the position information of the abnormal element in the second picture; generating marking information corresponding to the second picture according to the abnormal type and the position information corresponding to the abnormal element; training an anomaly detection model according to the second pictures in batches and the labeling information corresponding to the second pictures; and carrying out anomaly detection on the page picture acquired in real time through the anomaly detection model.

Fig. 3 is a block diagram illustrating an apparatus 800 for page detection according to an example embodiment. For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.

Referring to fig. 3, the apparatus 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.

The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.

The memory 804 is configured to store various types of data to support operation at the device 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.

Power components 806 provide power to the various components of device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 800.

The multimedia component 808 includes a screen that provides an output interface between the device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.

The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice information processing mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.

The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.

The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the device 800. For example, the sensor assembly 814 may detect the open/closed status of the device 800 and the relative positioning of components, such as the display and keypad of the apparatus 800. The sensor assembly 814 may also detect a change in position of the apparatus 800 or of a component of the apparatus 800, the presence or absence of user contact with the apparatus 800, the orientation or acceleration/deceleration of the apparatus 800, and a change in temperature of the apparatus 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

The communication component 816 is configured to facilitate communications between the apparatus 800 and other devices in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.

In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, executable by the processor 820 of the device 800 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

Fig. 4 is a schematic diagram of a server in some embodiments of the invention. The server 1900 may vary widely by configuration or performance and may include one or more Central Processing Units (CPUs) 1922 (e.g., one or more processors) and memory 1932, one or more storage media 1930 (e.g., one or more mass storage devices) storing applications 1942 or data 1944. Memory 1932 and storage medium 1930 can be, among other things, transient or persistent storage. The program stored in the storage medium 1930 may include one or more modules (not shown), each of which may include a series of instructions operating on a server. Still further, a central processor 1922 may be provided in communication with the storage medium 1930 to execute a series of instruction operations in the storage medium 1930 on the server 1900.

The server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input-output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.

A non-transitory computer readable storage medium storing instructions that, when executed by a processor of an apparatus (server or terminal), enable the apparatus to perform the page detection method shown in fig. 1.

A non-transitory computer readable storage medium storing instructions that, when executed by a processor of an apparatus (server or terminal), enable the apparatus to perform a page detection method, the method comprising: according to a preset abnormal type, modifying a Cascading Style Sheet (CSS) of a first page to obtain a second page containing abnormal elements; matching a first picture corresponding to the first page with a second picture corresponding to the second page, and determining the position information of the abnormal element in the second picture; generating labeling information corresponding to the second picture according to the abnormal type and the position information corresponding to the abnormal element; training an anomaly detection model according to the second pictures in batches and the labeling information corresponding to the second pictures; and carrying out anomaly detection on the page picture acquired in real time through the anomaly detection model.

The embodiment of the invention discloses A1, a page detection method, which comprises the following steps:

A2, according to the method described in A1, the modifying a Cascading Style Sheet (CSS) of the first page according to a preset exception type to obtain a second page including an exception element includes:

randomly determining a target element in the CSS of the first page;

and modifying the style attribute value of the target element according to the preset abnormal type, and modifying the target element into the abnormal element of the abnormal type to obtain a second page.

A3, according to the method in A1, the matching a first picture corresponding to the first page with a second picture corresponding to the second page and determining the position information of the abnormal element in the second picture includes:

respectively identifying structural elements in the first picture and the second picture by adopting an edge detection technology;

comparing the structural elements in the first picture with the structural elements in the second picture one by one, and determining that the unmatched structural elements at the corresponding positions in the second picture and the first picture are abnormal elements;

and matching the area image corresponding to the abnormal element with the second picture, and determining the position information of the abnormal element in the second picture.

A4, according to the method of A3, after the structural elements in the first picture and the second picture are compared one by one, the method further comprises:

A5, according to the method in A1, the generating the annotation information corresponding to the second picture according to the abnormal type and the position information corresponding to the abnormal element includes:

and writing the abnormal type and the position information corresponding to the abnormal element into a preset file to generate a label file corresponding to the second picture.

A6, according to the method of A5, training an anomaly detection model according to the batch of second pictures and the labeling information corresponding to the second pictures, including:

taking the second picture and a label file corresponding to the second picture as training samples;

constructing a training sample set according to the batch of second pictures and the label files corresponding to the second pictures;

and training an anomaly detection model according to the training sample set.

A7, according to the method of A1, the detecting the abnormality of the page picture acquired in real time by the abnormality detection model includes:

acquiring a page to be detected in real time;

screenshot is carried out on the page to be detected, and a page picture corresponding to the page to be detected is obtained;

inputting the page picture into a trained anomaly detection model, and outputting the anomaly type and position information corresponding to the anomaly element in the page picture through the anomaly detection model.

A8, the method according to any one of A1 to A7, wherein the preset abnormality types include at least any one of: font overlap, exception crawling, and exception masking.

The embodiment of the invention discloses B9, a page detection device, which comprises:

B10, the apparatus of B9, the style modification module comprising:

B11, the apparatus of B9, the position determination module comprising:

B12, the apparatus of B11, the apparatus further comprising:

B13, according to the apparatus of B9, the label generating module is specifically configured to write the exception type and the position information corresponding to the abnormal element into a preset file, and generate a label file corresponding to the second picture.

B14, the apparatus of B13, the model training module comprising:

B15, the apparatus of B9, the online detection module comprising:

B16, the device according to any one of B9 to B15, wherein the preset exception type at least includes any one of the following: font overlap, exception crawling, and exception masking.

The embodiment of the invention discloses C17, an apparatus for page detection, comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs configured to be executed by the one or more processors comprise instructions for:

C18, according to the apparatus of C17, modifying the cascading style sheet CSS of the first page according to the preset exception type to obtain a second page including an exception element, including:

randomly determining a target element in the CSS of the first page;

C19, the matching the first picture corresponding to the first page and the second picture corresponding to the second page according to the apparatus of C17, and determining the position information of the abnormal element in the second picture, include:

C20, the device of C19, wherein the one or more programs configured to be executed by the one or more processors further include instructions for:

C21, according to the apparatus of C17, the generating the annotation information corresponding to the second picture according to the abnormal type and the position information corresponding to the abnormal element includes:

C22, according to the apparatus of C21, the training an anomaly detection model according to the batch of second pictures and the labeling information corresponding to the second pictures includes:

and training an anomaly detection model according to the training sample set.

C23, according to the apparatus of C17, the detecting the abnormality of the page picture acquired in real time by the abnormality detecting model includes:

acquiring a page to be detected in real time;

C24, the device according to any one of C17 to C23, wherein the preset exception type at least includes any one of the following: font overlap, exception crawling, and exception masking.

Embodiments of the present invention disclose D25, a machine-readable medium having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform a page detection method as described in one or more of A1 to A8.

Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

The above description only illustrates the preferred embodiments of the present invention and is not to be construed as limiting the invention. Any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included within its scope of protection.

The page detection method, the page detection apparatus, and the apparatus for page detection provided by the invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is intended only to aid understanding of the method and its core idea. Meanwhile, a person skilled in the art may, following the idea of the present invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims

1. A page detection method, characterized in that the method comprises:

2. The method according to claim 1, wherein the modifying the Cascading Style Sheet (CSS) of the first page according to the preset exception type to obtain the second page containing the exception element comprises:

randomly determining a target element in the CSS of the first page;

3. The method according to claim 1, wherein the matching a first picture corresponding to the first page and a second picture corresponding to the second page to determine the position information of the abnormal element in the second picture comprises:

4. The method of claim 3, wherein after comparing the structured elements in the first picture and the second picture one by one, the method further comprises:

5. The method according to claim 1, wherein the generating of the annotation information corresponding to the second picture according to the abnormal type and the position information corresponding to the abnormal element comprises:

6. The method according to claim 5, wherein training the anomaly detection model according to the batch of second pictures and the labeling information corresponding to the second pictures comprises:

and training an anomaly detection model according to the training sample set.

7. The method according to claim 1, wherein the performing anomaly detection on the page picture acquired in real time through the anomaly detection model comprises:

acquiring a page to be detected in real time;

8. A page detection apparatus, characterized in that the apparatus comprises:

9. An apparatus for page detection, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:

10. A machine-readable medium having stored thereon instructions, which when executed by one or more processors, cause an apparatus to perform a page detection method as claimed in one or more of claims 1 to 7.