CN110275820B - Page compatibility testing method, system and equipment - Google Patents


Publication number
CN110275820B
Authority
CN
China
Prior art keywords
screenshot
tested
page
sample
result
Prior art date
Legal status
Active
Application number
CN201810215223.8A
Other languages
Chinese (zh)
Other versions
CN110275820A (en)
Inventor
李泽
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201810215223.8A priority Critical patent/CN110275820B/en
Publication of CN110275820A publication Critical patent/CN110275820A/en
Application granted granted Critical
Publication of CN110275820B publication Critical patent/CN110275820B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3688Test management for test execution, e.g. scheduling of test suites
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/958Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Abstract

An embodiment of the present application provides a page compatibility testing method, system, and device. The method comprises the following steps: acquiring at least one screenshot to be tested of a page to be tested as displayed in a browser; extracting abstract features of each screenshot to be tested from the at least one screenshot to be tested respectively; and determining the compatibility of the page to be tested in the browser according to the abstract features of each screenshot to be tested. In the technical scheme provided by this embodiment, abstract features are extracted from at least one screenshot to be tested of the page to be tested and used to recognize whether each captured region of the page displays normally in the browser, thereby determining the compatibility of the page in that browser. This automates the page compatibility test, is unaffected by page adjustments, has low maintenance cost, and achieves high accuracy.

Description

Page compatibility testing method, system and equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, a system, and an apparatus for testing page compatibility.
Background
As the number of network users grows, so does the number of web browsers designed to meet different user demands. Common browsers include IE (Internet Explorer), Firefox, Chrome, Sogou Browser, 360 Browser, and so on. Because these browsers are built on different kernels (rendering engines), their support for web pages differs, which gives rise to the problem of page compatibility.
A page compatibility test, also called a browser compatibility test, checks whether the same front-end page displays consistently across different browsers. Because there are so many browsers, performing the compatibility test manually for each one, that is, manually checking whether the page displays normally in every browser, entails an enormous amount of repetitive work, and multiple operating systems and browsers on both the PC side and the mobile side need to be covered.
Disclosure of Invention
In view of the foregoing, the present application provides a page compatibility testing method, system, and device that solve, or at least partially solve, the above problems.
Thus, in one embodiment of the present application, a page compatibility test method is provided. The method comprises the following steps:
acquiring at least one screenshot to be tested of a page to be tested displayed in a browser;
extracting abstract features of each screenshot to be tested from the at least one screenshot to be tested respectively;
and determining the compatibility of the page to be tested in the browser according to the abstract characteristics of each screenshot to be tested.
In another embodiment of the present application, a page compatibility test method is provided. The method is suitable for the server and comprises the following steps:
receiving at least one screenshot to be tested, uploaded by a client, of a page to be tested displayed in a browser;
extracting abstract features of each screenshot to be tested from the at least one screenshot to be tested respectively;
according to the abstract features of each screenshot to be tested, determining the compatibility of the page to be tested in the browser;
and feeding back the determined result to the client.
In yet another embodiment of the present application, a page compatibility test method is provided. The method is suitable for the client and comprises the following steps:
loading a page to be tested in a browser;
capturing the page to be tested in partitions to obtain at least one screenshot to be tested;
uploading the at least one screenshot to be tested to a server to determine the compatibility of the page to be tested in the browser by the server;
the determination is based on the abstract features of each screenshot to be tested, which are respectively extracted from the at least one screenshot to be tested.
In yet another embodiment of the present application, a page compatibility test system is provided. The system comprises:
the client is used for loading the page to be tested in a browser; capturing the page to be tested in partitions to obtain at least one screenshot to be tested; and uploading the at least one screenshot to be tested to a server;
the server is used for receiving the at least one screenshot to be tested of the page to be tested, displayed in the browser and uploaded by the client; extracting abstract features of each screenshot to be tested from the at least one screenshot to be tested respectively; determining the compatibility of the page to be tested in the browser according to the abstract features of the screenshots to be tested; and feeding back the determination result to the client.
In yet another embodiment of the present application, an electronic device is provided. The electronic device includes: a first memory and a first processor, wherein,
the first memory is used for storing programs;
the first processor is coupled to the first memory for executing the program stored in the first memory for:
acquiring at least one screenshot to be tested of a page to be tested displayed in a browser;
extracting abstract features of each screenshot to be tested from the at least one screenshot to be tested respectively;
and determining the compatibility of the page to be tested in the browser according to the abstract characteristics of each screenshot to be tested.
In yet another embodiment of the present application, a server device is provided. The server device comprises: a second memory and a second processor, wherein,
The second memory is used for storing programs;
the second processor is coupled with the second memory, and is configured to execute the program stored in the second memory, for:
receiving at least one screenshot to be tested, uploaded by a client, of a page to be tested displayed in a browser;
extracting abstract features of each screenshot to be tested from the at least one screenshot to be tested respectively;
according to the abstract features of each screenshot to be tested, determining the compatibility of the page to be tested in the browser;
and feeding back the determined result to the client.
In yet another embodiment of the present application, a client device is provided. The client device includes: a third memory and a third processor, wherein,
the third memory is used for storing programs;
the third processor is coupled with the third memory, and is configured to execute the program stored in the third memory, for:
loading a page to be tested in a browser;
capturing the page to be tested in partitions to obtain at least one screenshot to be tested;
uploading the at least one screenshot to be tested to a server to determine the compatibility of the page to be tested in the browser by the server;
The determination is based on the abstract features of each screenshot to be tested, which are respectively extracted from the at least one screenshot to be tested.
In the technical scheme provided by the embodiments of the present application, abstract features are extracted from at least one screenshot to be tested of the page to be tested and used to recognize whether each captured region of the page displays normally in the browser, thereby determining whether the page to be tested is compatible with that browser. This automates the page compatibility test, is unaffected by page adjustments, has low maintenance cost, and achieves high accuracy.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a flow chart illustrating a method for testing page compatibility according to an embodiment of the present application;
FIG. 2 is an example picture provided to illustrate semantic tags according to the present application;
FIG. 3 is a schematic diagram of the results computed at each layer of the convolutional neural network for the image data in the page compatibility testing method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a page compatibility testing system according to an embodiment of the present application;
FIG. 5 is a flowchart illustrating a method for testing page compatibility according to another embodiment of the present application;
FIG. 6 is a flowchart illustrating a page compatibility testing method according to another embodiment of the present application;
FIG. 7 is a flowchart illustrating a page compatibility testing method according to another embodiment of the present application;
FIG. 8 is a schematic diagram of a page compatibility testing apparatus according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a page compatibility testing apparatus according to another embodiment of the present application;
FIG. 10 is a schematic diagram of a page compatibility testing apparatus according to another embodiment of the present application;
fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 12 is a schematic structural diagram of a server device according to an embodiment of the present application;
fig. 13 is a schematic structural diagram of a client device according to an embodiment of the present application.
Detailed Description
In order to enable those skilled in the art to better understand the present application, the following description will make clear and complete descriptions of the technical solutions according to the embodiments of the present application with reference to the accompanying drawings.
Some of the flows described in the specification, claims, and figures above include a number of operations that appear in a particular order; however, those operations may be performed out of the order in which they appear, or in parallel. Sequence numbers such as 101 and 102 merely distinguish the operations and do not by themselves imply any order of execution. In addition, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should also be noted that the terms "first" and "second" herein distinguish different messages, devices, modules, and the like; they do not imply a sequence, nor do they require that the "first" and "second" items be of different types.
Currently, automation tools and schemes for page compatibility testing broadly fall into two categories.
The first type of scheme or tool requires manual input of parameters and configuration; examples include Browsershots, IETester, and Spoon Browser Sandbox. Such schemes are typically based on a virtual machine, for example a Windows virtual machine, in which multiple browsers have been pre-installed. The virtual machine provides a configurable software interface through which the user inputs or configures the content to be tested, such as the format of page elements, the page font, and the presentation of the page at different resolutions. Once the content to be tested is configured and the user provides the link address of the page to be tested, the virtual machine automatically tests compatibility across the installed browsers under the corresponding operating system (such as Windows). For mobile operating systems, however, such as iOS or Android, there is usually no suitable virtual machine available. Moreover, the existing approach of grabbing page elements for testing is often ineffective against page style problems: under certain browsers the page elements may be normal while the page style is still disordered.
The second type of scheme or tool is based on comparison of image pixels; Selenium-based solutions are an example. Such tools typically require manually writing a script that records the contents of each test page; on every subsequent test the page is revisited by replaying the script, and the current page screenshots are compared pixel by pixel with the recorded ones to test compatibility. This approach is effective for some types of problematic pages, but it is easily affected by the operating environment, such as network transmission, machines with different resolutions, and page adjustments. Not only is its stability inadequate, but after every page adjustment the page must be verified manually and the script re-recorded. Since front-end pages usually change frequently, the manual cost of maintaining the recorded scripts is also high.
To overcome some or all of the problems in the prior art, the embodiments of the present application provide a page compatibility testing method, system, and device with wide applicability and low manual maintenance cost. The main idea of the technical scheme is to express the visual page image as high-level abstract features and to determine the compatibility of the page based on those abstract features. Because the scheme is unaffected by page adjustments, it avoids the need of the second category of existing schemes to re-record scripts after every page adjustment, greatly reducing maintenance cost and improving test efficiency; it can also, to a large extent, solve the problem that the first category of existing schemes cannot test pages whose styles are disordered.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Fig. 1 is a flow chart illustrating a page compatibility testing method according to an embodiment of the present application. The executing subject of the technical scheme provided in this embodiment may be a client or a server. The client may be hardware integrated on a terminal and provided with an embedded program, application software installed on the terminal, or tool software embedded in the terminal's operating system; the embodiment of the present application is not limited in this respect. The terminal may be any terminal device, such as a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), a POS (Point of Sale) terminal, or a vehicle-mounted computer. The server may be a conventional server, a cloud server, a virtual server, and so on, which is likewise not limited in this embodiment. As shown in fig. 1, the method includes:
101. Obtaining at least one screenshot to be tested of the page to be tested displayed in the browser.
102. Extracting abstract features of each screenshot to be tested from the at least one screenshot to be tested respectively.
103. Determining the compatibility of the page to be tested in the browser according to the abstract features of each screenshot to be tested.
In 101 above, if the full page can be displayed on one screen, a single screenshot to be tested is taken. For longer pages, that is, pages whose not-yet-displayed content must be viewed by sliding manually (on a mobile phone) or scrolling the mouse (on a PC), the page must be captured region by region, so there is more than one screenshot to be tested. It should be noted that, in the embodiment of the present application, the at least one screenshot to be tested needs to cover the entire area of the page to be tested: if any region of the page is missed, the compatibility determination will be inaccurate because some content goes untested. In a specific implementation, an automatic screenshot tool may be deployed on a device (a client or a server) on which different browsers are installed; the tool calls the browser interface on the device to access the page to be tested and traverses all areas of the page, capturing it in partitions to obtain the at least one screenshot to be tested.
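The traversal just described can be sketched as follows. The function name and the use of a viewport-height stride are illustrative assumptions, not part of this application; a real deployment would pair these offsets with a browser automation tool's scroll-and-capture calls.

```python
def capture_offsets(page_height, viewport_height):
    """Compute the vertical scroll offsets at which screenshots must be
    taken so that the captures jointly cover the entire page.

    The last offset is clamped so the final screenshot ends exactly at
    the page bottom: captures may overlap, but no region may be missed,
    since a missed region would make the compatibility result unreliable.
    """
    if page_height <= viewport_height:
        return [0]  # the whole page fits in a single screenshot
    offsets = list(range(0, page_height - viewport_height, viewport_height))
    offsets.append(page_height - viewport_height)  # clamp the final capture
    return offsets

# A 2000 px page viewed through an 800 px viewport needs three captures:
print(capture_offsets(2000, 800))  # [0, 800, 1200]
print(capture_offsets(500, 800))   # [0]  (short page, one screenshot)
```

The clamped last offset is what guarantees the "cover the whole area" requirement stated above.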
People understand the content of an image through the meaning it conveys: the objects, events, and emotions it depicts. These meanings can be understood as high-level abstract features of the image. Such features cannot be obtained directly from the visual features of the image (color, texture, shape, and so on); the underlying visual features describe only one aspect of the image and have limited expressive power. The key to accurately recognizing an image is therefore to extract abstract features with strong expressive power, which describe the essential, invariant information of the objects in the image. The technical scheme provided by the embodiments of the present application extracts abstract features from images using an artificial intelligence learning algorithm. The human perception system has a clear hierarchical structure: human processing of visual information is a process of layer-by-layer transmission and continual abstraction, and a layer-by-layer representation of the data can better describe the essential information hidden in it. An artificial intelligence learning algorithm that simulates the human brain in this way can therefore extract more comprehensive and more expressive deep features from an image.
Therefore, in the embodiment of the present application, the abstract feature of each screenshot to be tested in step 102 may be extracted by the following method. The extraction process is described below taking one of the screenshots to be tested (the first screenshot to be tested) as an example. Extracting the abstract feature of the first screenshot to be tested from the first screenshot to be tested comprises:
1021. Preprocessing the first screenshot to be tested to obtain image data.
1022. Taking the image data as the input of an extraction model, and executing the extraction model to obtain the abstract feature of the first screenshot to be tested.
The first screenshot to be tested is any one of the at least one screenshot to be tested. The extraction model is constructed based on an artificial intelligence learning algorithm; in one implementation, that algorithm may be a convolutional neural network algorithm. A convolutional neural network (CNN) is a feed-forward neural network consisting of one or more convolutional layers, pooling layers, and fully connected layers (corresponding to a classical neural network), together with shared weights and shared biases. This structure allows the network to exploit the two-dimensional structure of the input data: its artificial neurons respond to surrounding units within a local receptive field, which gives better results in image processing than other deep learning structures.
In 1021, the purpose of preprocessing the first screenshot to be tested is to convert it into a format that the extraction model can process. For example, the preprocessing in 1021 includes adjusting the first screenshot to be tested into three-channel RGB image data of a set size, for example 3-channel RGB image data of size 224×224.
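As an illustration of this preprocessing step, the following pure-Python sketch normalises a pixel grid to a fixed size with nearest-neighbour sampling. A real pipeline would use a library resampler (Pillow, OpenCV, and the like); the function name and the toy input are our own assumptions.

```python
def resize_nearest(pixels, out_w, out_h):
    """Nearest-neighbour resize of an image given as a list of rows of
    (R, G, B) tuples; returns a new out_h x out_w pixel grid.

    This only illustrates the shape normalisation of step 1021: every
    screenshot, whatever its original size, becomes a fixed-size
    three-channel grid the extraction model can accept.
    """
    in_h, in_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]

# Normalise a tiny 2x2 "screenshot" to the model's fixed 224x224 input size.
tiny = [[(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (255, 255, 255)]]
fixed = resize_nearest(tiny, 224, 224)
print(len(fixed), len(fixed[0]))  # 224 224
```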
The artificial intelligence learning algorithm requires training to obtain the extraction model usable in 1022 above. The training process is to adjust parameters in the algorithm (e.g., sharing weights and sharing biases) to make the extracted model output more accurate. The algorithm training process will be described in detail later.
The abstract feature obtained by executing the extraction model in 1022 above is typically a numerical value. To facilitate the compatibility determination, the implementation may convert these numerical abstract features into semantic tags, which can also be understood simply as semantic features. For the picture shown in fig. 2, the corresponding semantic tags include "airplane", "lawn", "sky", and so on. The semantic tags are in one-to-one correspondence with their numerical values; since each abstract feature is a numerical value, a correspondence between abstract features and semantic tags is thereby obtained.
Therefore, in step 103 the semantic tag corresponding to each screenshot to be tested can be obtained from the correspondence between abstract features and semantic tags, and the compatibility of the page to be tested in the browser can then be determined from those semantic tags. Specifically, step 103 in the above embodiment may be implemented in the following two modes.
Mode one:
1031. Acquiring the semantic tag corresponding to the abstract feature of each screenshot to be tested according to the correspondence between abstract features and semantic tags.
1032. Determining the compatibility of the page to be tested in the browser by judging whether the semantic tag corresponding to any screenshot to be tested belongs to an abnormal class of tags.
Specifically, step 1032 includes: if, among the semantic tags corresponding to the screenshots to be tested, there is one semantic tag that belongs to an abnormal class, determining that the page to be tested is not compatible with the browser. That is, as long as the semantic tag corresponding to any screenshot to be tested of the page belongs to an abnormal class, the page displays abnormally and does not have compatibility. Only when none of the semantic tags corresponding to the screenshots to be tested belongs to an abnormal class is the page to be tested determined to be compatible with the browser.
In implementation, an abnormal-class tag list collecting the tags of the various abnormal classes can be preset. Whether the semantic tag corresponding to a screenshot to be tested belongs to an abnormal class is determined by querying whether that tag appears in the abnormal-class tag list. In this way, the user can learn which screenshot's semantic tag is abnormal, that is, which region of the page to be tested displays abnormally, and can determine from the abnormal class of the tag what type of display problem that region has; such information can help page designers improve the page.
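The abnormal-class tag check of mode one can be sketched as follows. The tag names in ABNORMAL_TAGS are hypothetical placeholders, since the application does not enumerate concrete tags, and the function name is our own.

```python
# Hypothetical abnormal-class tag list; the application does not name
# concrete tags, so these are illustrative placeholders only.
ABNORMAL_TAGS = {"blank_region", "overlapping_text", "broken_layout"}

def is_compatible(screenshot_tags):
    """Mode one: the page is compatible only if NO screenshot's semantic
    tags intersect the abnormal class; a single abnormal tag anywhere
    means the page displays abnormally in this browser.

    Returns (compatible, failing_region, abnormal_tags_found) so the
    caller can report which region failed and why, as the text suggests
    is useful to page designers.
    """
    for region, tags in screenshot_tags.items():
        bad = set(tags) & ABNORMAL_TAGS
        if bad:
            return False, region, sorted(bad)
    return True, None, []

result = is_compatible({
    "top":    ["banner", "navigation"],
    "middle": ["product_grid", "overlapping_text"],
})
print(result)  # (False, 'middle', ['overlapping_text'])
```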
Mode two:
1031'. Obtaining a standard semantic tag set corresponding to the page to be tested.
1032'. Judging whether the page to be tested displays normally in the browser by comparing the semantic features of each screenshot to be tested with the standard semantic feature set.
In essence, once its design is complete, each page has corresponding semantic tags. After the page is designed, a worker can manually configure at least one semantic tag for it, yielding a standard semantic tag set. In implementation, therefore, whether the page to be tested displays normally in the browser, and hence whether it is compatible, can be judged by comparing the semantic tags of all the screenshots to be tested with the standard semantic tag set. Specifically, if the comparison shows that either of the following cases holds between the semantic features of the screenshots to be tested and the standard semantic feature set, the page to be tested is determined to display abnormally in the browser.
Case 1: the semantic features of some screenshot to be tested among the at least one screenshot to be tested are not a subset of the standard semantic feature set.
For example, the standard semantic feature set for a page contains { A, B, C, D, E, F }. The page to be tested contains two screenshots to be tested. The semantic feature of the first screenshot to be tested is { C, D }; the semantic features of the second screenshot to be tested are { A, B, E, G } or { A, B, E, F, G }; the semantic features of the first screenshot to be tested are a subset of the standard semantic feature set, while the semantic features of the second screenshot to be tested are not a subset of the standard semantic feature set.
Case 2: the union of the semantic features of all the screenshots to be tested is a proper subset of the standard semantic feature set, that is, it fails to cover the full set.
For example, the standard semantic feature set of a page contains {A, B, C, D, E, F}, and the page to be tested yields two screenshots to be tested. The semantic features of the first screenshot to be tested are {C, D}; those of the second are {A, B, E}. Although the semantic features of both screenshots are subsets of the standard semantic feature set, their union {A, B, C, D, E} is only a proper subset of it; the missing semantic tag "F" indicates that the page lacks the image content corresponding to that tag.
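The two cases of mode two reduce to simple set operations, as in this sketch (the function name is illustrative; the sets reuse the example above):

```python
def check_against_standard(screenshot_features, standard):
    """Mode two: compare per-screenshot semantic features with the
    page's standard semantic feature set.

    Abnormal display is reported if either
      case 1: some screenshot's features are not a subset of the
              standard set (unexpected content appeared), or
      case 2: the union of all screenshots' features is a proper
              subset of the standard set (expected content is missing).
    """
    union = set()
    for feats in screenshot_features:
        if not set(feats) <= standard:          # case 1
            return "abnormal: unexpected tags %s" % sorted(set(feats) - standard)
        union |= set(feats)
    if union < standard:                        # case 2: proper subset
        return "abnormal: missing tags %s" % sorted(standard - union)
    return "normal"

standard = {"A", "B", "C", "D", "E", "F"}
print(check_against_standard([{"C", "D"}, {"A", "B", "E", "G"}], standard))
# abnormal: unexpected tags ['G']
print(check_against_standard([{"C", "D"}, {"A", "B", "E"}], standard))
# abnormal: missing tags ['F']
print(check_against_standard([{"C", "D"}, {"A", "B", "E", "F"}], standard))
# normal
```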
In the technical scheme provided by the embodiments of the present application, abstract features are extracted from at least one screenshot to be tested of the page to be tested and used to recognize whether each captured region of the page displays normally in the browser, thereby determining whether the page to be tested is compatible with that browser. This automates the page compatibility test, is unaffected by page adjustments, has low maintenance cost, and achieves high accuracy.
Furthermore, in the above embodiment, the extraction model may specifically adopt a convolutional neural network structure similar to AlexNet. Such a structure contains 5 convolutional layers, 3 pooling layers, and 3 fully connected layers; since a pooling layer is usually computed within its corresponding convolutional layer rather than counted separately, the convolutional neural network of the present application comprises 8 layers in total, namely 5 convolutional layers and 3 fully connected layers, as shown in fig. 3.
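The layer inventory of this AlexNet-like structure can be written out as data. The kernel shapes and counts below follow the step-by-step description in this embodiment; treating pooling as folded into C1, C2, and C5 matches the statement that pooling is computed within the corresponding convolutional layer.

```python
# AlexNet-like extraction model: 8 layers in total, namely 5 convolutional
# layers (pooling folded into C1, C2 and C5) and 3 fully connected layers.
# Each entry: (kind, name, kernel_count_or_units, kernel_shape, pooling)
LAYERS = [
    ("conv", "C1",   64,   (11, 11, 3),  "pool"),
    ("conv", "C2",   192,  (5, 5, 64),   "pool"),
    ("conv", "C3",   384,  (3, 3, 192),  None),
    ("conv", "C4",   256,  (3, 3, 384),  None),
    ("conv", "C5",   256,  (3, 3, 256),  "pool"),
    ("fc",   "FC6",  4096, None,         None),
    ("fc",   "FC7",  4096, None,         None),
    ("fc",   "FC7'", None, None,         None),  # followed by a softmax of size 2
]
n_conv = sum(1 for l in LAYERS if l[0] == "conv")
n_fc = sum(1 for l in LAYERS if l[0] == "fc")
print(n_conv, n_fc, len(LAYERS))  # 5 3 8
```

This data-only view is just a summary; an actual implementation would realise each entry as a layer in a deep-learning framework.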
In the above embodiment, step 1022, in which the image data is taken as the input of the extraction model and the model is executed to obtain the output result corresponding to the first screenshot to be tested, may be implemented by the following steps:
s1, convolving the image data with a first convolution kernel to obtain a first convolution result.
Specifically, referring to fig. 3, the image data is convolved with 64 first convolution kernels of 11×11×3 to output a first convolution result C1 of 55×55×64.
S2, carrying out pooling operation on the first convolution result, and then convolving with a second convolution kernel to obtain a second convolution result.
The pooling operation downsamples the input feature map, smoothing regions of the first convolution result where values change steeply. With continued reference to fig. 3, the pooled first convolution result C1 is convolved with 192 second convolution kernels of 5×5×64 to yield an output second convolution result C2 of 27×27×192.
S3, carrying out pooling operation on the second convolution result, and then convolving with a third convolution kernel to obtain a third convolution result.
With continued reference to fig. 3, the second convolution result C2 after the pooling operation is convolved with 384 third convolution kernels of 3×3×192, resulting in an output third convolution result C3 of 13×13×384.
S4, convolving the third convolution result with a fourth convolution kernel to obtain a fourth convolution result.
With continued reference to fig. 3, the third convolution result C3 is directly convolved (without pooling) with 256 fourth convolution kernels of size 3×3×384 to obtain a fourth convolution result C4 of size 13×13×256.
S5, convolving the fourth convolution result with a fifth convolution kernel to obtain a fifth convolution result.
With continued reference to fig. 3, the fourth convolution result C4 is convolved with 256 fifth convolution kernels of size 3×3×256 to obtain a fifth convolution result C5 of size 13×13×256.
S6, carrying out pooling operation on the fifth convolution result, and carrying out at least one full connection operation on the fifth convolution result after pooling operation to obtain a sixth full connection result.
With continued reference to fig. 3, after the pooling operation on C5, the output C5' has a size of 6×6×256; C5' passes through the first fully connected layer to obtain a first fully connected output FC6 of size 4096; FC6 then passes through the second fully connected layer to obtain a second fully connected output FC7 of size 4096; and FC7 passes through the third fully connected layer to obtain a third fully connected output FC7', which serves as the sixth full-connection result.
And S7, classifying the sixth full-connection result to obtain the abstract feature of the first screenshot to be tested.
With continued reference to fig. 3, the sixth full-connection result FC7' passes through a softmax classification layer to obtain the eighth-layer output FC8 of size 2, that is, the output result.
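The feature-map sizes stated in steps S1 to S7 can be verified with a short sketch. The strides and paddings used below are assumptions chosen to reproduce the sizes given in the embodiment (55, 27, 13, 13, 13, 6); the patent does not state them explicitly.

```python
# Spatial size trace through the AlexNet-like extraction model of fig. 3.
# Strides/paddings are assumed values, not taken from the patent text.

def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size - kernel + 2 * pad) // stride + 1

def alexnet_like_sizes(size=224):
    trace = []
    size = conv_out(size, 11, stride=4, pad=2)   # C1: 55x55x64
    trace.append(size)
    size = conv_out(size, 3, stride=2)           # pooling -> 27
    size = conv_out(size, 5, pad=2)              # C2: 27x27x192
    trace.append(size)
    size = conv_out(size, 3, stride=2)           # pooling -> 13
    size = conv_out(size, 3, pad=1)              # C3: 13x13x384
    trace.append(size)
    size = conv_out(size, 3, pad=1)              # C4: 13x13x256
    trace.append(size)
    size = conv_out(size, 3, pad=1)              # C5: 13x13x256
    trace.append(size)
    size = conv_out(size, 3, stride=2)           # pooling -> C5': 6x6x256
    trace.append(size)
    return trace

print(alexnet_like_sizes())  # [55, 27, 13, 13, 13, 6]
```

The 6×6×256 output of the last pooling layer is then flattened into the 9216-dimensional vector fed to the first fully connected layer.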
Each time the image data passes through a convolution layer, i.e., is convolved with a convolution kernel, each neuron in the output (C1 to C5 as shown in fig. 3) connects with a corresponding number of neurons in the input, and its value can be expressed as:

$$x_j^l = \mathrm{ReLU}\Big(\sum_{i} w_{ij}^l\, x_i^{l-1} + b_j^l\Big) \qquad (1)$$

where $x_j^l$ represents the j-th neuron in the output feature map of the l-th layer, $x_i^{l-1}$ represents the neurons in the input connected to $x_j^l$, $w_{ij}^l$ represents the corresponding weights, $b_j^l$ represents the corresponding bias, and ReLU denotes the Rectified Linear Unit activation function. The relationship between output neurons and input neurons after the image data passes through a fully connected layer is similar to (1), except that the activation function of the last fully connected layer is softmax.
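Formula (1) can be illustrated numerically. The weights, inputs and bias below are arbitrary example values, not taken from the patent.

```python
# Minimal numeric illustration of formula (1): each output neuron applies
# ReLU to the weighted sum of its connected input neurons plus a bias.

def relu(z):
    return max(0.0, z)

def neuron_output(x, w, b):
    """x_j = ReLU(sum_i w_ij * x_i + b_j), with example values for w, x, b."""
    return relu(sum(wi * xi for wi, xi in zip(w, x)) + b)

x = [1.0, -2.0, 0.5]
print(neuron_output(x, [0.2, 0.1, 0.4], 0.1))  # 0.2 - 0.2 + 0.2 + 0.1 = 0.3
print(neuron_output(x, [0.1, 0.5, 0.0], 0.0))  # -0.9 is clipped to 0.0 by ReLU
```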
What needs to be explained here is: the embodiment of the application is not limited to the specific implementation of the convolutional neural network structure, and the above list is only one specific implementation of the convolutional neural network structure. In essence, the input image size, number of layers, each layer function, each layer parameter, etc. may be changed as desired.
From the above, in order to improve the accuracy of high-level semantic feature extraction, the embodiment of the application adopts an artificial intelligence learning algorithm that simulates the human brain, such as a convolutional neural network algorithm. The learning algorithm needs training and testing processes to make the network output more accurate. Taking the convolutional neural network algorithm as an example, the algorithm transmits training samples layer by layer through forward propagation to obtain a predicted value corresponding to each training sample; then the error between the predicted value and the true value is propagated back to each layer using the back propagation (Backpropagation Algorithm, BP) algorithm, the partial derivatives of the error cost function with respect to the parameters of each layer are calculated, and the weights of each layer are adjusted and updated using stochastic gradient descent. Modifying the weights of each layer in this way makes the network output more accurate.
Therefore, the extraction model in the technical scheme provided by the embodiment of the application can be obtained by adopting the following modes:
104. a training sample set and a test sample set are obtained.
105. And training the training model by using the training sample set to obtain a trained model.
106. And testing the trained model by using the test sample set to obtain the accuracy of the trained model.
107. And taking the trained model with the accuracy meeting the requirement as the extraction model.
In 104, the training sample set and the test sample set may be obtained by taking screenshots of sample pages. For example, a browser automation tool is deployed on a standardized test machine on which different browsers are installed; the tool invokes the interface of each browser on the test machine to access each sample page, traverses the page elements, and takes region-by-region screenshots of each sample page. All screenshots from each browser are uploaded to the database in batches, and information such as the number and format of the images is automatically counted. The system (such as a server or a client) can support labeling images online, which facilitates subsequent training. In the subsequent training process, parameters in the training model are adjusted by comparing the labels corresponding to the images with the output results of the training model, so that the output of the trained model becomes more accurate. In specific implementation, the system can label the images automatically, or the system provides a label configuration interface through which the user labels each image; the embodiment of the present application is not particularly limited in this respect.
Some of the screenshots stored in the database may be used as training samples and the rest as test samples. For example, the system automatically sets aside 10% of the screenshots as test samples and uses the remaining 90% as training samples, thereby creating the training sample set and the test sample set.
The screenshot format may be png or jpg, which is not limited in the embodiment of the present application.
It can be seen that this step 104 may include the following steps:
1041. a plurality of screenshot samples are obtained.
In specific implementation, sample pages can be loaded in a plurality of different browsers; and calling the corresponding interfaces of the browsers to access the sample page displayed in each browser, and intercepting the sample page displayed in each browser in a regional way to obtain the screenshot samples.
1042. And labeling semantic tags for each screenshot sample.
The label marking can be completed manually by a user or automatically by a system. For example, the step 1042 may specifically include:
in response to a label marking operation triggered by a user for the first screenshot sample, associating at least one semantic label designated by the label marking operation with the first screenshot sample; or
And taking the first screenshot sample as input, and executing an image multi-label labeling model to obtain at least one semantic label corresponding to the first screenshot sample.
The image multi-label labeling model can refer to related contents in the prior art, and is not described herein.
1043. Separating one part of the plurality of screenshot samples to serve as training samples and another part to serve as test samples.
For example, 10% of the multiple screenshot samples are randomly selected as test samples, and the remaining 90% are training samples.
1044. And establishing the training sample set based on the training sample and the semantic tags of the training sample.
1045. And establishing the test sample set based on the test sample and the semantic tags of the test sample.
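The split in step 1043 can be sketched as follows. Representing each screenshot sample as a (path, label) pair is an illustrative assumption; the file names are hypothetical.

```python
# Randomly set aside a fraction of the screenshot samples as test samples
# (10% in the example above) and use the rest as training samples.
import random

def split_samples(samples, test_ratio=0.1, seed=42):
    samples = list(samples)
    random.Random(seed).shuffle(samples)          # random selection
    n_test = int(len(samples) * test_ratio)
    return samples[n_test:], samples[:n_test]     # (training set, test set)

samples = [(f"shot_{i}.png", i % 2) for i in range(100)]  # hypothetical data
train_set, test_set = split_samples(samples)
```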
Further, in order to improve the generalization capability of the model, that is, to improve its accuracy in practical application, the training sample set may further include training samples obtained by image adjustment. That is, the step 104 may further include:
1046. and performing image adjustment on the screenshot sample serving as a training sample.
1047. And adding the screenshot sample with the adjusted image as a new training sample to the training sample set.
Wherein, the image adjustment of 1046 includes:
changing the size of a screenshot sample; and/or
Changing the shape of the screenshot sample; and/or
Intercepting 85% -90% of areas in the screenshot sample; and/or
Changing the direction of a screenshot sample; and/or
The color of the screenshot sample is changed.
Wherein changing the color of the screenshot sample includes: changing the brightness of the screenshot sample; and/or changing the saturation of the screenshot samples; and/or changing the contrast of the screenshot samples.
What needs to be explained here is: the size of the screenshot samples can be changed by randomly reducing the pixel values of the image by 10%; the shape of the screenshot sample is adjusted by stretching or compressing the shape of the graph; the screenshot sample is subjected to secondary screenshot in a mode of randomly intercepting 85% -90% (such as 87.5%) of areas from the image so as to distinguish and adjust pixels of the image or compress the image; the direction of the screenshot sample is changed by reversing the screenshot sample left and right or up and down.
After the training sample set is ready, the size of the images may not match the training model, so the training samples in the training sample set need to be preprocessed before training. The preprocessed data is then used as the input of the training model to complete training, obtaining the extraction model required in the embodiment of the application. That is, the step 105 may specifically include the following steps:
1051. And preprocessing a first training sample in the training sample set to obtain first sample data.
For example, the training samples are resized to three-channel RGB image data of a set size. The set size depends on the training model: different network structures require different input sizes. For the convolutional neural network with the structure shown in fig. 3, the size may be 224×224. Correspondingly, the preprocessing may specifically be: adjusting the training samples to 224×224 three-channel RGB image data.
What needs to be explained here is: in order to increase training speed, multithreaded concurrent processing may be used. For example, processing 2100 images concurrently with 100 threads took only about 2 seconds in trials. In addition, to further accelerate training, multiple images may be combined into a batch, and each batch of data converted into a format that the training model can process quickly; this can more than double the speed of the training process, and also improves the prediction speed of the trained model.
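The resize-and-batch preprocessing of step 1051 can be sketched as follows, here with nearest-neighbour sampling on a single channel. Real preprocessing would handle three RGB channels and any value normalisation the model expects; this is a simplified illustration.

```python
# Resize a screenshot to the fixed input size expected by the training model
# (224x224 for the fig. 3 network), then stack several images into one batch.

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a single-channel image (list of rows)."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def make_batch(images, size=224):
    """Preprocess each image and group them into a batch for fast training."""
    return [resize_nearest(img, size, size) for img in images]

batch = make_batch([[[0, 1], [2, 3]], [[4, 5], [6, 7]]], size=4)
```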
1052. And taking the first sample data as input of a training model, and executing the training model to obtain a first result.
1053. And optimizing parameters in the training model according to the difference between the numerical value of the semantic label corresponding to the training sample and the first result.
1054. Preprocessing a second training sample in the training sample set, taking the preprocessed second training sample as input of the training model, and executing the training model to obtain a second result until the difference between the numerical value of the semantic label corresponding to the training sample in the training sample set and the result obtained by executing the training model meets a preset condition.
When the difference between the abstract features obtained by executing the training model on a training sample and the numerical value of the semantic tag corresponding to that training sample meets the preset condition, training is complete and the trained model is obtained. That is, the model training process can be simply understood as follows: after the training model is designed, iterative optimization begins, and parameters such as the weights and biases of the model are optimized according to the result of each iteration. The calculation is divided into two processes: forward propagation (the step 1052) and backward propagation (the steps 1053 and 1054). Forward propagation inputs training samples into the training model and calculates the actual output using the current parameters; backward propagation calculates the difference between the actual output and the ideal output, then optimizes the network parameters in reverse from the output layer to the input layer according to this difference, iterating until the stopping condition is met; once it is met, training ends and the trained model is obtained. The ideal output is a preset value, namely the semantic tag corresponding to the training sample used as input. Each semantic tag has a corresponding numerical value, and the difference is obtained by comparing the result of running the training sample through the training model with the value of its semantic tag.
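The forward/backward loop described above can be illustrated with a toy model: a single sigmoid neuron stands in for the full convolutional network, forward propagation computes a prediction, and stochastic gradient descent updates the weight and bias from the error against the ideal output (the label value). The data and learning rate are arbitrary example values.

```python
# Toy forward-propagation / back-propagation training loop with SGD.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, lr=0.5, epochs=200):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, label in samples:
            pred = sigmoid(w * x + b)   # forward propagation: actual output
            grad = pred - label         # error vs. ideal output (label value)
            w -= lr * grad * x          # backward propagation: SGD update
            b -= lr * grad
    return w, b

# label 0 = "display abnormal", label 1 = "display normal" (illustrative)
samples = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
w, b = train(samples)
```

After training, the model's output for each sample moves close to the value of its semantic tag, which is exactly the stopping condition the embodiment describes.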
Thereafter, the test samples are preprocessed in the same way and input into the trained extraction model to test its accuracy. This process involves only forward propagation: the input images are transmitted layer by layer through the network to obtain the actual output, the actual output is compared with the ideal output, and the accuracy of the model is obtained by aggregating the results over all test samples. Likewise, the ideal output is the label corresponding to the test sample; comparing the actual output with the value of that label shows whether the output of the extraction model is accurate.
Namely, in the above 106, the following method may be specifically adopted:
1061. and respectively taking each test sample in the test sample set as input, and executing the trained model to obtain an execution result corresponding to each test sample.
1062. Calculating the proportion of test samples in the test sample set whose execution result is consistent with the value of the corresponding semantic tag to the total number of test samples, obtaining the accuracy.
For example, the number of test samples in the test sample set is 100, where the number of execution results corresponding to the test samples is 98 consistent with the value of the corresponding semantic tags; the trained model has an accuracy of 98%.
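The accuracy computation of step 1062 is a simple ratio; the example data below reproduces the 98-out-of-100 case from the text.

```python
# Accuracy: fraction of test samples whose execution result matches the
# numerical value of their semantic tag.

def accuracy(results, labels):
    correct = sum(1 for r, t in zip(results, labels) if r == t)
    return correct / len(labels)

labels = [1] * 50 + [0] * 50                       # 100 test samples
results = labels[:98] + [1 - t for t in labels[98:]]  # 98 match, 2 do not
print(accuracy(results, labels))  # 0.98
```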
The system may train a plurality of models at the same time. The purpose of testing the trained models with the test sample set is to select, from the multiple models, a trained model whose accuracy meets the requirement as the extraction model. In specific implementation, the model with the highest accuracy may be selected as the extraction model, or any model with accuracy above 97% may be selected. If two or more trained models meet the accuracy requirement, one of them can be selected at random as the extraction model.
After the extraction model is obtained, it can be used to extract abstract features from screenshots of the page to be tested, in a process similar to that for a test sample: the page to be tested is opened on the test machine in various browsers, and a tool automatically captures screenshots of the page to obtain the screenshots to be tested. Each screenshot to be tested is first stored in the database, which makes it convenient to later use it for fine-tuning the extraction model; it is then preprocessed and input into the extraction model to obtain the abstract features of each screenshot to be tested of the page. This use of the extraction model involves only forward propagation.
In order to illustrate the technical effects of the technical scheme provided by the application, the inventor collected page images of multiple applications for experimental verification. The total number of images in the experiment was about 2100, of which 1400 displayed normally and 700 displayed abnormally. 140 normally displayed images and 70 abnormally displayed images were randomly extracted as the test sample set, accounting for about 10% of the whole sample set, and the other 90% was used as the training sample set. The experimental results show that the detection accuracy of the system is above 97%.
It should be noted that, the execution subjects of each step of the method provided in the above embodiment may be the same device, or the method may also be executed by different devices. For example, the execution subject of steps 501 to 503 may be device a; for another example, the execution subject of steps 501 and 502 may be device a, and the execution subject of step 503 may be device B; etc.
The technical scheme provided by the embodiment of the application does not require a dedicated virtual machine for testing; it only needs to cooperate with the automatic page browsing/traversing and screenshot functions of the operating systems or clients on various terminals (such as the PC end and the mobile end). The client uploads the captured screenshots of the page to be tested to the server, the server performs a compatibility test on the page based on semantic feature extraction, and then feeds the test result back to the client. Accordingly, fig. 4 shows a schematic structural diagram of a page compatibility testing system according to an embodiment of the present application. As shown in fig. 4, the system provided in this embodiment includes a client and a server. Wherein,
a client 201, configured to load a page to be tested in a browser; the partition captures the page to be detected to obtain at least one screenshot to be detected; uploading the at least one screenshot to be tested to the server 202;
the server 202 is configured to receive at least one screenshot to be tested of a page to be tested displayed in a browser, which is uploaded by the client 201; extracting abstract features of each screenshot to be detected from the at least one screenshot to be detected respectively; and determining the compatibility of the page to be tested in the browser according to the abstract features of each screenshot to be tested, and feeding back the determination result to the client 201.
According to the technical scheme provided by the embodiment of the application, abstract features are extracted from at least one screenshot to be tested of the page to be tested in order to identify whether each screenshot of the page is displayed normally in the browser, thereby determining whether the page to be tested is compatible with the browser. This realizes automatic testing of page compatibility, is not affected by page adjustment, has low maintenance cost, and has high accuracy.
The specific workflow of each component unit, such as the client and the server, and the signaling interaction between the component units in the page compatibility testing system provided by the embodiment of the application will be further described in the following embodiments.
Fig. 5 is a flow chart of a page compatibility testing method according to an embodiment of the application. The method provided by the embodiment is suitable for the server. The server may be a conventional server, a cloud end, a virtual server, etc., which is not particularly limited in the embodiment of the present application. As shown in fig. 5, the page compatibility testing method includes:
301. and receiving at least one screenshot to be tested of the page to be tested, which is uploaded by the client and displayed in the browser.
302. And extracting abstract features of each screenshot to be detected from the at least one screenshot to be detected respectively.
303. And determining the compatibility of the page to be tested in the browser according to the abstract characteristics of each screenshot to be tested.
304. And feeding back the determined result to the client.
In 301, a client performs screenshot on a page to be tested displayed in a browser by calling a screenshot function of the client; and uploading the intercepted screenshot to be tested to a server to judge the compatibility of the page to be tested in the browser based on the screenshot to be tested by the server. By adopting the architecture, a virtual machine special for testing is not required to be provided, and page compatibility testing can be conveniently performed by matching with each operating system of different clients.
The specific implementation of the foregoing 302 and 303 may be referred to in the foregoing embodiment of fig. 1, and will not be described herein.
According to the technical scheme provided by the embodiment of the application, abstract features are extracted from at least one screenshot to be tested of the page to be tested in order to identify whether each screenshot of the page is displayed normally in the browser, thereby determining whether the page to be tested is compatible with the browser. This realizes automatic testing of page compatibility, is not affected by page adjustment, has low maintenance cost, and has high accuracy.
Further, the page compatibility testing method provided by the embodiment of the application further comprises the following steps:
304. a training sample set and a test sample set are obtained.
305. And training the training model by using the training sample set to obtain a trained model.
306. And testing the trained model by using the test sample set to obtain the accuracy of the trained model.
307. And taking the trained model with the accuracy meeting the requirement as the extraction model.
The foregoing 305 to 306 may refer to the relevant content in the embodiment shown in fig. 1, and will not be described herein.
In 304, the obtaining manner of the training sample set and the test sample set may specifically be:
3041. a plurality of screenshot samples uploaded by a client are received.
When the method is implemented, the client can call the self screenshot function to screenshot the sample page after loading the sample page in the browser; and then uploading the intercepted image to a server as a screenshot sample.
3042. And labeling semantic tags for each screenshot sample.
The semantic tags can be marked manually by a user through a client and uploaded to a server; or the labeling of the semantic tags is automatically realized by the server. That is, the step 3042 may specifically include:
after receiving at least one semantic tag designated by the user for the first screenshot sample and uploaded by the client, associating the at least one semantic tag with the first screenshot sample; or
And taking the first screenshot sample as input, and executing an image multi-label labeling model to obtain at least one semantic label corresponding to the first screenshot sample.
3043. Separating one part of the plurality of screenshot samples to serve as training samples and another part to serve as test samples.
3045. And establishing the training sample set based on the training sample and the semantic tags of the training sample.
3046. And establishing the test sample set based on the test sample and the semantic tags of the test sample.
As in the embodiment shown in fig. 1, the training sample set in this embodiment may also have screenshot samples obtained after image adjustment of the training samples added as new training samples. That is, the above 304 may further include:
3047. and performing image adjustment on the screenshot sample serving as a training sample.
Specifically, the image adjustment includes: changing the size of a screenshot sample; and/or changing the shape of the screenshot sample; and/or intercepting 85% -90% of the area in the screenshot sample; and/or changing the direction of the screenshot samples; and/or changing the color of the screenshot samples.
Wherein changing the color of the screenshot sample includes: changing the brightness of the screenshot sample; and/or changing the saturation of the screenshot samples; and/or changing the contrast of the screenshot samples.
For more details of image adjustment, reference may be made to the embodiment shown in fig. 1 and will not be repeated here.
3048. And adding the screenshot sample with the adjusted image as a new training sample to the training sample set.
Fig. 6 is a schematic flow chart of a page compatibility testing method according to another embodiment of the present application. The method provided in this embodiment is applicable to a client, where the client may be a piece of hardware integrated on a terminal and having an embedded program, or may be an application software installed in the terminal, or may be a tool software embedded in an operating system of the terminal, and the embodiment of the present application is not limited to this. The terminal may be any terminal device including a mobile phone, a tablet computer, a PDA (Personal Digital Assistant ), a POS (Point of Sales), a car computer, and the like. Specifically, as shown in fig. 6, the method provided in this embodiment includes:
401. and loading the page to be tested in the browser.
402. And the partition captures the page to be tested to obtain at least one screenshot to be tested.
403. Uploading the at least one screenshot to be tested to a server, so that the server can determine the compatibility of the page to be tested in the browser.
The determination basis is the abstract feature of each screenshot to be detected, and the abstract feature of each screenshot to be detected is respectively extracted from at least one screenshot to be detected.
In 401 above, the page to be tested may be loaded in the browser after the user inputs a website or clicks a link in the browser.
In 402, the screenshot operation may be performed after receiving the screenshot instruction sent by the server, or may be performed automatically after loading the page to be tested in the browser, which is not limited in particular in the embodiment of the present application.
The server in 403 determines the compatibility of the page to be tested in the browser based on the abstract features of each screenshot to be tested, and how the abstract features of each screenshot to be tested are extracted can refer to the related content in each embodiment, which is not described herein.
According to the technical scheme provided by the embodiment of the application, abstract features are extracted from at least one screenshot to be tested of the page to be tested in order to identify whether each screenshot of the page is displayed normally in the browser, thereby determining whether the page to be tested is compatible with the browser. This realizes automatic testing of page compatibility, is not affected by page adjustment, has low maintenance cost, and has high accuracy.
Further, after receiving the determination result fed back by the server, the client may display the determination result; and/or output the determination result semantically; and/or output an alarm prompt when the determination result indicates that the page to be tested is not compatible with the browser.
Furthermore, besides uploading screenshots of the page to be tested to the server, the client can also provide the server with the training samples and test samples used to train the extraction model. That is, the method provided by the embodiment of the application may further include the following steps:
404. sample pages are loaded in a plurality of different browsers.
405. And calling the corresponding interfaces of the browsers to access the sample page displayed in each browser, and intercepting the sample page displayed in each browser in a regional way to obtain the screenshot samples.
406. And uploading the screenshot samples to a server to train and test a training model by using the screenshot samples by the server to obtain an extraction model.
The abstract features of each screenshot to be tested are obtained by executing the extraction model by taking each screenshot to be tested as an input parameter.
The training and testing process of the extraction model may refer to the corresponding content in the above embodiments, which is not described herein.
From the above, it can be seen that training the model also requires labeling the screenshot samples with semantic tags. The semantic tags can be labeled manually by a user through the client and then uploaded to the server, or labeled automatically by the server. If the semantic tags are labeled manually by the user, the technical scheme provided by this embodiment may further include the following steps:
and responding to label labeling operation triggered by a user for each screenshot sample, uploading semantic labels labeled by the user for each screenshot sample to the server, establishing a training sample set and a test sample set by the server based on each screenshot sample and the semantic labels corresponding to each screenshot sample, and training and testing the training model by using the training sample set and the test sample set to obtain the extraction model.
The technical scheme provided by the embodiment of the application can be simply understood as follows: collect page images under various browsers through an automation tool, compile statistics, and store them in a database; design a reasonable extraction model structure to learn the features of the page images, and store the model with the highest accuracy; deploy the trained and tested extraction model into the system, and use it to test page images from new browsers. As shown in fig. 7,
501. The client uploads a plurality of screenshot samples, together with the semantic tags specified by a user for each screenshot sample, to the server.
502. The server establishes a training sample set and a test sample set based on the plurality of screenshot samples.
503. The server trains the training model using the training samples in the training sample set to obtain a plurality of trained models.
504. The server tests the trained models using the test samples in the test sample set, and selects the model with the highest test accuracy from the plurality of trained models as the extraction model.
505. The client uploads at least one screenshot to be tested of the page to be tested to the server.
506. The server preprocesses each screenshot to be tested, takes each preprocessed screenshot to be tested as input of the extraction model, and executes the extraction model to obtain the abstract features of each screenshot to be tested.
507. The server obtains the semantic tags corresponding to the abstract features of each screenshot to be tested according to the correspondence between abstract features and semantic tags.
508. The server determines the compatibility of the page to be tested in the browser by judging whether the semantic tags corresponding to the screenshots to be tested belong to abnormal-class tags.
509. The server feeds the determination result back to the client.
510. The client outputs the determination result.
Specific implementations of 501-510 are described in the foregoing embodiments and are not repeated herein.
First, with the technical scheme provided by the embodiment of the application, no dedicated test virtual machine is required; page compatibility testing can be carried out conveniently simply by using the automatic page browsing/traversal and screenshot functions of the operating systems or clients on the PC end and the mobile end. Second, the scheme does not require manually written program scripts: for a page of an application, once model training is completed, the test can be completed by the automatic capture and detection program alone. Because the convolutional neural network extracts high-level features of the image rather than comparing images at the pixel level, even if the page changes frequently, as long as the style of the overall features is unchanged, screenshot labeling and model training do not need to be performed again. Third, the scheme is not affected by the environment: even if the pixels or the size of the page to be tested change, the convolutional neural network still extracts high-level features of the image instead of comparing at the pixel level, so the test of the page is unaffected. Even when the model does need to be retrained, its cost is far lower than re-recording scripts in other schemes: since the model training system is already in place, a new model can be trained, generated, and deployed within 3 hours simply by feeding in new image samples; the training process needs no manual intervention, the model needs no manual maintenance, and the labor cost is far lower than that of recording scripts.
In addition, recognizing a single image takes about 0.2 s, whereas manually performing a page compatibility test on a single image takes more than 2 s. If the time needed to switch images and pages is also considered, recognition efficiency is improved by about two orders of magnitude, and the accuracy of the model can be raised above 99% by adding training data.
Fig. 8 is a schematic structural diagram of a page compatibility testing apparatus according to an embodiment of the present application. As shown in fig. 8, the apparatus provided in this embodiment includes:
the first obtaining module 601 is configured to obtain at least one screenshot to be tested of a page to be tested displayed in a browser;
a first extraction module 602, configured to extract abstract features of each screenshot to be tested from the at least one screenshot to be tested respectively;
the first determining module 603 is configured to determine compatibility of the page to be tested in the browser according to the abstract features of each screenshot to be tested.
Further, the first extraction module 602 is further configured to: preprocess a first screenshot to be tested to obtain image data; and take the image data as input of an extraction model and execute the extraction model to obtain the abstract features of the first screenshot to be tested. The first screenshot to be tested is any one of the at least one screenshot to be tested.
Further, the first extraction module 602 is further configured to: adjust the first screenshot to be tested into three-channel RGB image data of a set size.
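The preprocessing above (resizing to a set size and ensuring three-channel RGB data) can be sketched in NumPy. The 224x224 target size is an assumption, and nearest-neighbour index selection stands in for whatever resizing routine a real system would use:

```python
import numpy as np

def preprocess(screenshot, size=(224, 224)):
    """Resize a screenshot to a set size and ensure three-channel RGB data.
    size=(224, 224) is an assumed value; the patent only requires a set size."""
    img = np.asarray(screenshot)
    if img.ndim == 2:                       # grayscale input: replicate into 3 channels
        img = np.stack([img] * 3, axis=-1)
    h, w = img.shape[:2]
    rows = np.arange(size[0]) * h // size[0]   # nearest-neighbour row indices
    cols = np.arange(size[1]) * w // size[1]   # nearest-neighbour column indices
    return img[rows][:, cols, :3]
```

Whatever the source dimensions, the output is always a `size[0] x size[1] x 3` array ready to be fed to the extraction model.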
The extraction model may be a convolutional neural network model.
Further, the first extraction module 602 is further configured to:
convolving the image data with a first convolution kernel to obtain a first convolution result;
pooling the first convolution result and then convolving it with a second convolution kernel to obtain a second convolution result;
pooling the second convolution result and then convolving it with a third convolution kernel to obtain a third convolution result;
convolving the third convolution result with a fourth convolution kernel to obtain a fourth convolution result;
convolving the fourth convolution result with a fifth convolution kernel to obtain a fifth convolution result;
pooling the fifth convolution result, and performing at least one fully connected operation on the pooled fifth convolution result to obtain a sixth fully connected result;
and classifying the sixth fully connected result to obtain the abstract features of the first screenshot to be tested.
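The five-convolution, interleaved-pooling, fully connected sequence above can be illustrated with a minimal NumPy sketch. The 64x64 grayscale input, the 3x3 kernels, the single channel per layer, and the 4 output classes are all assumptions for illustration; the patent does not fix these dimensions, and a real extraction model would use multi-channel learned kernels:

```python
import numpy as np

def conv2d(x, k):
    """Valid 2-D convolution of a single-channel feature map x with kernel k."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def maxpool2(x):
    """2x2 max pooling with stride 2 (a trailing odd row/column is dropped)."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 64))        # grayscale stand-in for a preprocessed screenshot
k1, k2, k3, k4, k5 = (rng.standard_normal((3, 3)) for _ in range(5))

h = conv2d(x, k1)                        # first convolution result
h = conv2d(maxpool2(h), k2)              # pool, then second convolution
h = conv2d(maxpool2(h), k3)              # pool, then third convolution
h = conv2d(h, k4)                        # fourth convolution (no pooling in between)
h = conv2d(h, k5)                        # fifth convolution
feat = maxpool2(h).ravel()               # final pooling, flattened for the fully connected layer
W = rng.standard_normal((feat.size, 4))
logits = feat @ W                        # one fully connected operation
probs = np.exp(logits - logits.max())
probs /= probs.sum()                     # classification step (softmax over 4 assumed classes)
```

With these sizes the feature map shrinks 64 -> 62 -> 31 -> 29 -> 14 -> 12 -> 10 -> 8 -> 4, leaving 16 values for the fully connected layer.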
Further, the first determining module 603 is further configured to: obtain the semantic tags corresponding to the abstract features of each screenshot to be tested according to the correspondence between abstract features and semantic tags; and determine the compatibility of the page to be tested in the browser by judging whether the semantic tags corresponding to each screenshot to be tested belong to abnormal-class tags.
Further, the first determining module 603 is further configured to: if, among the semantic tags corresponding to the screenshots to be tested, a semantic tag corresponding to one screenshot belongs to an abnormal-class tag, determine that the page to be tested is not compatible with the browser.
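The abnormal-class-tag judgment can be sketched as a short predicate: the page fails as soon as any screenshot carries an abnormal-class tag. The tag names below are hypothetical; the patent only assumes a known set of abnormal-class tags:

```python
# Hypothetical tag names for illustration; any agreed-upon abnormal-class set works.
ABNORMAL_TAGS = {"layout_broken", "blank_region", "overlapping_text"}

def page_is_compatible(tags_per_screenshot):
    """Return True only if no screenshot's semantic tags include an abnormal-class tag."""
    return all(
        tag not in ABNORMAL_TAGS
        for tags in tags_per_screenshot
        for tag in tags
    )
```

For example, `page_is_compatible([["normal"], ["normal", "blank_region"]])` is False, because one screenshot carries an abnormal-class tag.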
Further, the page compatibility testing device provided in this embodiment further includes:
the second acquisition module is used for acquiring a training sample set and a test sample set;
the training module is used for training the training model by utilizing the training sample set so as to obtain a trained model;
the test module is used for testing the trained model by utilizing the test sample set so as to obtain the accuracy of the trained model;
and the selection module is used for taking the trained model with the accuracy meeting the requirement as the extraction model.
Further, the second obtaining module is further configured to: obtain a plurality of screenshot samples; label each screenshot sample with semantic tags; divide the plurality of screenshot samples so that one part is used as training samples and the other part as test samples; establish the training sample set based on the training samples and their semantic tags; and establish the test sample set based on the test samples and their semantic tags.
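The division above can be sketched as a shuffled split of (screenshot, tag) pairs. The 80/20 proportion is an assumption; the patent does not fix the ratio:

```python
import random

def split_samples(samples, tags, train_ratio=0.8, seed=0):
    """Shuffle (screenshot, tag) pairs, then divide them into a training set
    and a test set. train_ratio=0.8 is an assumed example proportion."""
    paired = list(zip(samples, tags))
    random.Random(seed).shuffle(paired)   # fixed seed for a reproducible split
    cut = int(len(paired) * train_ratio)
    return paired[:cut], paired[cut:]
```

Each returned element is a (sample, semantic tag) pair, so the training sample set and test sample set keep every sample paired with its tag.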
Further, the second obtaining module is further configured to: load a sample page in a plurality of different browsers; and call the interface corresponding to each browser to access the sample page displayed in that browser, capturing the displayed sample page region by region to obtain the plurality of screenshot samples.
Further, the second obtaining module is further configured to: in response to a tag labeling operation triggered by a user for a first screenshot sample, associate at least one semantic tag indicated by the tag labeling operation with the first screenshot sample; or take the first screenshot sample as input and execute an image multi-tag labeling model to obtain at least one semantic tag corresponding to the first screenshot sample.
Further, the second obtaining module is further configured to: perform image adjustment on a screenshot sample used as a training sample; and add the adjusted screenshot sample to the training sample set as a new training sample.
Further, the second obtaining module is further configured to: change the size of a screenshot sample; and/or change the shape of the screenshot sample; and/or crop 85%-90% of the area of the screenshot sample; and/or change the orientation of the screenshot sample; and/or change the color of the screenshot sample.
Further, the second obtaining module is further configured to: change the brightness of the screenshot sample; and/or change the saturation of the screenshot sample; and/or change the contrast of the screenshot sample.
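A few of the adjustments above (random crop keeping 85%-90% of the area, orientation flip, brightness change) can be sketched in NumPy. `area_frac` and `brightness` are assumed example values, and array operations stand in for whatever image library the real system uses:

```python
import numpy as np

def augment(img, area_frac=0.88, brightness=1.2, seed=0):
    """Derive extra training samples from one screenshot (H x W x 3 uint8 array).
    area_frac=0.88 and brightness=1.2 are assumed values within the ranges the text allows."""
    rng = np.random.default_rng(seed)
    h, w = img.shape[:2]
    side = area_frac ** 0.5                         # linear scale giving the desired area fraction
    ch, cw = int(h * side), int(w * side)
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    cropped = img[top:top + ch, left:left + cw]     # keep about 88% of the area
    flipped = img[:, ::-1]                          # change orientation (horizontal flip)
    brighter = np.clip(img.astype(np.float32) * brightness, 0, 255).astype(np.uint8)
    return cropped, flipped, brighter
```

Each returned array is added back to the training sample set as a new training sample carrying the same semantic tags as the original screenshot.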
Further, the training module is further configured to: preprocess a first training sample in the training sample set to obtain first sample data; take the first sample data as input of the training model and execute the training model to obtain a first result; optimize the parameters in the training model according to the difference between the first result and the numerical value of the semantic tag corresponding to the training sample; and then preprocess a second training sample in the training sample set, take the preprocessed second training sample as input of the training model, and execute the training model to obtain a second result, and so on, until the difference between the numerical values of the semantic tags corresponding to the training samples in the training sample set and the results obtained by executing the training model meets a preset condition, thereby obtaining the trained model.
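The execute-compare-optimize loop above can be illustrated schematically. A linear model trained by gradient descent stands in for the convolutional training model, purely to show the loop structure (run the model, measure the difference from the tag values, adjust the parameters, stop when a preset condition is met):

```python
import numpy as np

def train(samples, tag_values, lr=0.1, tol=1e-3, max_steps=10_000):
    """Schematic stand-in for the training loop; lr, tol and the linear model
    are assumptions, not the patent's CNN."""
    X = np.asarray(samples, dtype=float)
    y = np.asarray(tag_values, dtype=float)
    w = np.zeros(X.shape[1])
    for _ in range(max_steps):
        pred = X @ w                       # execute the (stand-in) training model
        diff = pred - y                    # difference between result and tag values
        if np.mean(diff ** 2) < tol:       # preset condition met -> trained model
            break
        w -= lr * (X.T @ diff) / len(y)    # optimize the parameters
    return w
```

On a consistent toy dataset the loop recovers the underlying weights once the mean squared difference falls below `tol`.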
Further, the test module is further configured to: take each test sample in the test sample set as input and execute the trained model to obtain an execution result corresponding to each test sample; and calculate, as the accuracy, the proportion of test samples whose execution results are consistent with the numerical values of their corresponding semantic tags to the total number of test samples.
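The accuracy computation above reduces to counting matching results:

```python
def accuracy(results, tag_values):
    """Proportion of test samples whose execution result equals the tag value."""
    matches = sum(r == t for r, t in zip(results, tag_values))
    return matches / len(tag_values)
```

The model whose accuracy is highest on the test sample set is then selected as the extraction model.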
What needs to be explained here is: the page compatibility testing device provided in the foregoing embodiment may implement the technical solution described in the foregoing method embodiment shown in fig. 1, and the specific implementation principle of each module or unit may refer to the corresponding content in the foregoing corresponding method embodiment, which is not repeated herein.
According to the technical scheme provided by the embodiment of the application, the abstract features of at least one screenshot to be tested of the page to be tested are extracted to identify whether the screenshots of the page are displayed normally in the browser, and thereby to determine whether the page to be tested is compatible with the browser. This realizes automatic testing of page compatibility that is unaffected by page adjustment, with low maintenance cost and high accuracy.
Fig. 9 is a schematic structural diagram of a page compatibility testing apparatus according to an embodiment of the present application. As shown in fig. 9, the apparatus provided in this embodiment includes:
the receiving module 701 is configured to receive at least one screenshot to be tested of a page to be tested displayed in a browser, which is uploaded by a client;
a second extraction module 702, configured to extract abstract features of each screenshot to be tested from the at least one screenshot to be tested, respectively;
a second determining module 703, configured to determine, according to the abstract features of each screenshot to be tested, compatibility of the page to be tested in the browser;
a feedback module 704, configured to feed the determination result back to the client.
Further, the second extraction module 702 is further configured to preprocess a first screenshot to be tested to obtain image data; and take the image data as input of an extraction model and execute the extraction model to obtain the abstract features of the first screenshot to be tested. The first screenshot to be tested is any one of the at least one screenshot to be tested.
Further, the second extraction module 702 is further configured to adjust the first screenshot to be tested to three-channel RGB image data with a set size.
Further, the extraction model may be a convolutional neural network model.
Further, the second extraction module 702 is further configured to convolve the image data with a first convolution kernel to obtain a first convolution result; pool the first convolution result and then convolve it with a second convolution kernel to obtain a second convolution result; pool the second convolution result and then convolve it with a third convolution kernel to obtain a third convolution result; convolve the third convolution result with a fourth convolution kernel to obtain a fourth convolution result; convolve the fourth convolution result with a fifth convolution kernel to obtain a fifth convolution result; pool the fifth convolution result and perform at least one fully connected operation on the pooled result to obtain a sixth fully connected result; and classify the sixth fully connected result to obtain the abstract features of the first screenshot to be tested.
Further, the second determining module 703 is further configured to obtain the semantic tags corresponding to the abstract features of each screenshot to be tested according to the correspondence between abstract features and semantic tags; and determine the compatibility of the page to be tested in the browser by judging whether the semantic tags corresponding to each screenshot to be tested belong to abnormal-class tags.
Further, the second determining module 703 is further configured to determine that the page to be tested is not compatible with the browser if a semantic tag corresponding to one of the screenshots to be tested belongs to an abnormal-class tag.
Further, the page compatibility testing device further includes:
the acquisition module is used for acquiring a training sample set and a test sample set;
the training module is used for training the training model by utilizing the training sample set so as to obtain a trained model;
the test module is used for testing the trained model by utilizing the test sample set so as to obtain the accuracy of the trained model;
and the selection module is used for taking the trained model with the accuracy meeting the requirement as the extraction model.
Further, the acquisition module is further configured to receive a plurality of screenshot samples uploaded by the client; label each screenshot sample with semantic tags; divide the plurality of screenshot samples so that one part is used as training samples and the other part as test samples; establish the training sample set based on the training samples and their semantic tags; and establish the test sample set based on the test samples and their semantic tags.
Further, the obtaining module is further configured to: after receiving, from the client, at least one semantic tag that the user specified for the first screenshot sample, associate the at least one semantic tag with the first screenshot sample; or
take the first screenshot sample as input and execute an image multi-tag labeling model to obtain at least one semantic tag corresponding to the first screenshot sample.
Further, the acquisition module is further configured to perform image adjustment on a screenshot sample used as a training sample, and add the adjusted screenshot sample to the training sample set as a new training sample.
Further, the obtaining module is further configured to: change the size of a screenshot sample; and/or change the shape of the screenshot sample; and/or crop 85%-90% of the area of the screenshot sample; and/or change the orientation of the screenshot sample; and/or change the color of the screenshot sample.
Further, the training module is further configured to preprocess a first training sample in the training sample set to obtain first sample data; take the first sample data as input of the training model and execute the training model to obtain a first result; optimize the parameters in the training model according to the difference between the first result and the numerical value of the semantic tag corresponding to the training sample; and then preprocess a second training sample in the training sample set, take the preprocessed second training sample as input of the training model, and execute the training model to obtain a second result, and so on, until the difference between the numerical values of the semantic tags corresponding to the training samples in the training sample set and the results obtained by executing the training model meets a preset condition, thereby obtaining the trained model.
Further, the test module is further configured to take each test sample in the test sample set as input and execute the trained model to obtain an execution result corresponding to each test sample; and calculate, as the accuracy, the proportion of test samples whose execution results are consistent with the numerical values of their corresponding semantic tags to the total number of test samples.
What needs to be explained here is: the page compatibility testing device provided in the foregoing embodiment may implement the technical solution described in the foregoing method embodiment shown in fig. 5, and the specific implementation principle of each module or unit may refer to the corresponding content in the foregoing corresponding method embodiment, which is not repeated herein.
According to the technical scheme provided by the embodiment of the application, the abstract features of at least one screenshot to be tested of the page to be tested are extracted to identify whether the screenshots of the page are displayed normally in the browser, and thereby to determine whether the page to be tested is compatible with the browser. This realizes automatic testing of page compatibility that is unaffected by page adjustment, with low maintenance cost and high accuracy.
Fig. 10 is a schematic structural diagram of a page compatibility testing apparatus according to an embodiment of the present application. As shown in fig. 10, the apparatus includes:
A loading module 801, configured to load a page to be tested in a browser;
the screenshot module 802 is configured to capture the page to be tested region by region to obtain at least one screenshot to be tested;
an uploading module 803, configured to upload the at least one screenshot to be tested to a server, so that the server determines the compatibility of the page to be tested in the browser;
the determination is based on the abstract features of each screenshot to be tested, which are respectively extracted from the at least one screenshot to be tested.
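The region-by-region capture can be sketched as slicing a full-page screenshot array into a grid. The 3x1 grid is an assumed example; the patent does not fix how the page is partitioned:

```python
import numpy as np

def split_regions(page_img, rows=3, cols=1):
    """Divide a full-page screenshot (H x W x 3 array) into region screenshots.
    rows=3, cols=1 is an assumed example partition."""
    h, w = page_img.shape[:2]
    return [
        page_img[i * h // rows:(i + 1) * h // rows,
                 j * w // cols:(j + 1) * w // cols]
        for i in range(rows)
        for j in range(cols)
    ]
```

Each region array is then uploaded to the server as one screenshot to be tested.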
Further, the device further comprises:
the output module is configured to display the determination result fed back by the server; and/or output the determination result fed back by the server semantically; and/or output an alarm prompt when the determination result is that the page to be tested is not compatible with the browser.
Further, the device further comprises:
the loading module 801 is further configured to load sample pages in a plurality of different browsers;
the screenshot module 802 is further configured to call the interface corresponding to each browser to access the sample page displayed in that browser, and capture the displayed sample page region by region to obtain the plurality of screenshot samples;
The uploading module 803 is further configured to upload the plurality of screenshot samples to a server, so that the server trains and tests the training model by using the plurality of screenshot samples to obtain an extraction model;
the abstract features of each screenshot to be tested are obtained by executing the extraction model by taking each screenshot to be tested as an input parameter.
Further, the uploading module 803 is further configured to, in response to a tag labeling operation triggered by a user for each screenshot sample, upload the semantic tags labeled by the user for each screenshot sample to the server, so that the server establishes a training sample set and a test sample set based on each screenshot sample and its corresponding semantic tags, and trains and tests the training model using the training sample set and the test sample set respectively to obtain the extraction model.
What needs to be explained here is: the page compatibility testing device provided in the foregoing embodiment may implement the technical solution described in the foregoing method embodiment shown in fig. 6, and the specific implementation principle of each module or unit may refer to the corresponding content in the foregoing corresponding method embodiment, which is not repeated herein.
According to the technical scheme provided by the embodiment of the application, the abstract features of at least one screenshot to be tested of the page to be tested are extracted to identify whether the screenshots of the page are displayed normally in the browser, and thereby to determine whether the page to be tested is compatible with the browser. This realizes automatic testing of page compatibility that is unaffected by page adjustment, with low maintenance cost and high accuracy.
Fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 11, the electronic device includes: a first memory 901 and a first processor 902, wherein,
the first memory 901 is configured to store a program;
the first processor 902 is coupled to the first memory 901, and is configured to execute the program stored in the first memory 901, for:
acquiring at least one screenshot to be tested of a page to be tested displayed in a browser;
extracting abstract features of each screenshot to be detected from the at least one screenshot to be detected respectively;
and determining the compatibility of the page to be tested in the browser according to the abstract characteristics of each screenshot to be tested.
According to the technical scheme provided by the embodiment of the application, the abstract features of at least one screenshot to be tested of the page to be tested are extracted to identify whether the screenshots of the page are displayed normally in the browser, and thereby to determine whether the page to be tested is compatible with the browser. This realizes automatic testing of page compatibility that is unaffected by page adjustment, with low maintenance cost and high accuracy.
The first memory 901 may be configured to store various other data to support operations on the cloud device. Examples of such data include instructions for any application or method operating on the cloud device. The first memory 901 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The first processor 902 may realize other functions in addition to the above functions when executing the program in the first memory 901, and the above description of the embodiments may be referred to specifically.
Further, as shown in fig. 11, the electronic device further includes: a first communication component 903, a first display 904, a first power supply component 905, a first audio component 906, and other components. Only some of the components are schematically shown in fig. 11, which does not mean that the electronic device only comprises the components shown in fig. 11.
Accordingly, the embodiments of the present application also provide a computer-readable storage medium storing a computer program, where the computer program when executed by a computer can implement the steps or functions of the page compatibility testing method provided in the foregoing embodiments.
Fig. 12 is a schematic structural diagram of a server device according to an embodiment of the present application. As shown in fig. 12, the server device includes: a second memory 1001 and a second processor 1002, wherein,
the second memory 1001 is configured to store a program;
the second processor 1002 is coupled to the second memory 1001, and is configured to execute the program stored in the second memory 1001, for:
Receiving at least one screenshot to be tested of a page to be tested, which is uploaded by a client and displayed in a browser;
extracting abstract features of each screenshot to be detected from the at least one screenshot to be detected respectively;
according to the abstract features of each screenshot to be tested, determining the compatibility of the page to be tested in the browser;
and feeding back the determined result to the client.
According to the technical scheme provided by the embodiment of the application, the abstract features of at least one screenshot to be tested of the page to be tested are extracted to identify whether the screenshots of the page are displayed normally in the browser, and thereby to determine whether the page to be tested is compatible with the browser. This realizes automatic testing of page compatibility that is unaffected by page adjustment, with low maintenance cost and high accuracy.
The second memory 1001 may be configured to store various other data to support operations on the cloud device. Examples of such data include instructions for any application or method operating on the cloud device. The second memory 1001 may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The second processor 1002 may realize other functions in addition to the above functions when executing the program in the second memory 1001, and the above description of the embodiments may be referred to specifically.
Further, as shown in fig. 12, the server device further includes: a second communication component 1003, a second display 1004, a second power component 1005, a second audio component 1006, and the like. Only some of the components are schematically shown in fig. 12, which does not mean that the server device only includes the components shown in fig. 12.
Accordingly, the embodiments of the present application also provide a computer-readable storage medium storing a computer program, where the computer program when executed by a computer can implement the steps or functions of the page compatibility testing method provided in the foregoing embodiments.
Fig. 13 is a schematic structural diagram of a client device according to another embodiment of the present application. As shown in fig. 13, the client device includes: a third memory 1101, and a third processor 1102, wherein,
the third memory 1101 is configured to store a program;
the third processor 1102 is coupled to the third memory 1101 for executing the program stored in the third memory 1101 for:
Loading a page to be tested in a browser;
capturing the page to be tested region by region to obtain at least one screenshot to be tested;
uploading the at least one screenshot to be tested to a server to determine the compatibility of the page to be tested in the browser by the server;
the determination is based on the abstract features of each screenshot to be tested, which are respectively extracted from the at least one screenshot to be tested.
According to the technical scheme provided by the embodiment of the application, the abstract features of at least one screenshot to be tested of the page to be tested are extracted to identify whether the screenshots of the page are displayed normally in the browser, and thereby to determine whether the page to be tested is compatible with the browser. This realizes automatic testing of page compatibility that is unaffected by page adjustment, with low maintenance cost and high accuracy.
The third memory 1101 described above may be configured to store various other data to support operations on the cloud device. Examples of such data include instructions for any application or method operating on the cloud device. The third memory 1101 may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The third processor 1102 may realize other functions in addition to the above functions when executing the program in the third memory 1101, and the above description of the embodiments may be referred to specifically.
Further, as shown in fig. 13, the client device further includes: a third communication component 1103, a third display 1104, a third power supply component 1105, a third audio component 1106, and the like. Only some of the components are schematically shown in fig. 13, which does not mean that the client device only comprises the components shown in fig. 13.
Accordingly, the embodiments of the present application also provide a computer-readable storage medium storing a computer program, where the computer program when executed by a computer can implement the steps or functions of the page compatibility testing method provided in the foregoing embodiments.
The display in fig. 11, 12 and 13 may include a screen, which may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation.
The power supply assembly in fig. 11, 12 and 13 provides power to the various components of the device to which the power supply assembly belongs. The power components may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the devices to which the power components pertain.
The audio component in fig. 11, 12 and 13 is configured to output and/or input an audio signal. For example, the audio component includes a Microphone (MIC) configured to receive external audio signals when the device to which the audio component belongs is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may be further stored in a memory or transmitted via a communication component. In some embodiments, the audio assembly further comprises a speaker for outputting audio signals.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (34)

1. A method for testing page compatibility, comprising:
acquiring at least one screenshot to be tested of a page to be tested displayed in a browser;
extracting abstract features of each screenshot to be tested from the at least one screenshot to be tested by using a convolutional neural network model;
according to the abstract features of each screenshot to be tested, determining the compatibility of the page to be tested in the browser;
wherein a first screenshot to be tested is any one of the at least one screenshot to be tested, and extracting the abstract features of the first screenshot to be tested from the first screenshot to be tested by using the convolutional neural network model comprises: convolving image data of the first screenshot to be tested with a first convolution kernel to obtain a first convolution result; performing a pooling operation on the first convolution result and then convolving the pooled result with a second convolution kernel to obtain a second convolution result; performing a pooling operation on the second convolution result and then convolving the pooled result with a third convolution kernel to obtain a third convolution result; convolving the third convolution result with a fourth convolution kernel to obtain a fourth convolution result; convolving the fourth convolution result with a fifth convolution kernel to obtain a fifth convolution result; performing a pooling operation on the fifth convolution result and performing at least one full-connection operation on the pooled fifth convolution result to obtain a sixth full-connection result; and classifying the sixth full-connection result to obtain the abstract features of the first screenshot to be tested.
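The feature-extraction pipeline recited in claim 1 (five convolutions interleaved with three pooling operations, then at least one full connection and a classification step) can be sketched as follows. This is a minimal single-channel toy with random weights; the 3×3 kernels, ReLU activations, 2×2 max pooling, 64×64 input, and four output classes are all illustrative assumptions — the patent fixes only the order of the operations, not these hyperparameters.

```python
import numpy as np

def conv2d(x, k):
    """Valid 2-D convolution of a single-channel image with one kernel,
    followed by ReLU (the activation is an assumption, not in the claim)."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return np.maximum(out, 0.0)

def maxpool(x, s=2):
    """Non-overlapping s x s max pooling (ragged edges truncated)."""
    H, W = (x.shape[0] // s) * s, (x.shape[1] // s) * s
    return x[:H, :W].reshape(H // s, s, W // s, s).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
img = rng.standard_normal((64, 64))            # stand-in for preprocessed screenshot data
k = [rng.standard_normal((3, 3)) * 0.1 for _ in range(5)]

x = conv2d(img, k[0])                          # first convolution result
x = conv2d(maxpool(x), k[1])                   # pool, then second convolution
x = conv2d(maxpool(x), k[2])                   # pool, then third convolution
x = conv2d(x, k[3])                            # fourth convolution (no pooling in between)
x = conv2d(x, k[4])                            # fifth convolution
x = maxpool(x).ravel()                         # final pooling, flattened for full connection
W_fc = rng.standard_normal((x.size, 4)) * 0.01 # one full-connection layer
probs = softmax(x @ W_fc)                      # classification over 4 hypothetical tags
```

The conv-pool layout (pooling after layers 1, 2, and 5 only) resembles AlexNet-style architectures; a production model would use learned multi-channel filters rather than random kernels.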
2. The method as recited in claim 1, further comprising:
preprocessing the first screenshot to be tested to obtain the image data;
and taking the image data as input of the convolutional neural network model, and executing the convolutional neural network model to obtain abstract features of the first screenshot to be tested.
3. The method of claim 2, wherein preprocessing the first screenshot to be tested comprises:
and adjusting the first screenshot to be tested into three-channel RGB image data with a set size.
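The preprocessing of claim 3 — adjusting a screenshot into three-channel RGB image data of a set size — might look like the sketch below. The 224×224 target size, nearest-neighbour resampling, and scaling to [0, 1] are assumptions; the patent says only "a set size".

```python
import numpy as np

def preprocess(screenshot, size=(224, 224)):
    """Resize a screenshot to a set size and expand it to 3-channel RGB.

    `size` and the nearest-neighbour scheme are illustrative assumptions.
    """
    img = np.asarray(screenshot, dtype=np.float64)
    if img.ndim == 2:                         # grayscale -> replicate into RGB
        img = np.stack([img] * 3, axis=-1)
    H, W, _ = img.shape
    rows = np.arange(size[0]) * H // size[0]  # nearest-neighbour row indices
    cols = np.arange(size[1]) * W // size[1]  # nearest-neighbour column indices
    resized = img[rows][:, cols]
    return resized / 255.0                    # scale pixel values to [0, 1]

data = preprocess(np.random.default_rng(1).integers(0, 256, (480, 640)))
# data.shape == (224, 224, 3)
```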
4. A method according to any one of claims 1 to 3, wherein determining the compatibility of the page under test in the browser based on the abstract features of the respective screenshot under test comprises:
acquiring semantic tags corresponding to the abstract features of each screenshot to be tested respectively according to the corresponding relation between the abstract features and the semantic tags;
and determining the compatibility of the page to be tested in the browser by judging whether the semantic tags corresponding to the respective screenshots to be tested belong to abnormal-class tags.
5. The method according to claim 4, wherein determining the compatibility of the page to be tested in the browser by determining whether the semantic tags corresponding to the respective screenshots to be tested belong to abnormal class tags comprises:
if any one of the semantic tags corresponding to the screenshots to be tested belongs to an abnormal class, determining that the page to be tested is not compatible with the browser.
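The decision rule of claims 4–5 reduces to a tag lookup plus an "any abnormal tag" check, as in this sketch. The tag names are hypothetical; the patent only distinguishes abnormal-class tags from the rest.

```python
# Hypothetical abnormal-class tag names, for illustration only.
ABNORMAL_TAGS = {"element_overlap", "text_truncated", "layout_misaligned"}

def page_is_compatible(screenshot_tags):
    """The page is compatible only if no screenshot carries an abnormal-class tag.

    `screenshot_tags` is one set of semantic tags per screenshot to be tested.
    """
    return not any(tag in ABNORMAL_TAGS
                   for tags in screenshot_tags
                   for tag in tags)

assert page_is_compatible([{"normal"}, {"normal"}])
assert not page_is_compatible([{"normal"}, {"element_overlap", "normal"}])
```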
6. A method according to any one of claims 1 to 3, further comprising:
acquiring a training sample set and a test sample set;
training the training model by using the training sample set to obtain a trained model;
testing the trained model by using the test sample set to obtain the accuracy of the trained model;
and taking the trained model with the accuracy meeting the requirement as the convolutional neural network model.
7. The method of claim 6, wherein the obtaining a training sample set and a test sample set comprises:
acquiring a plurality of screenshot samples;
labeling semantic tags for each screenshot sample;
separating the plurality of screenshot samples into one part used as training samples and another part used as test samples;
establishing the training sample set based on the training samples and semantic tags of the training samples;
and establishing the test sample set based on the test sample and the semantic tags of the test sample.
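The sample-set construction of claim 7 amounts to a labelled split, sketched below. The 80/20 split ratio and shuffling are assumptions; the patent says only "a part … and the other part".

```python
import random

def split_samples(labelled_samples, train_fraction=0.8, seed=42):
    """Separate labelled screenshot samples into a training set and a test set.

    `train_fraction` is an assumption; the patent does not fix the ratio.
    """
    samples = list(labelled_samples)
    random.Random(seed).shuffle(samples)
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]

# Each element pairs a screenshot sample with its semantic tag.
pairs = [(f"shot_{i}.png", "normal" if i % 4 else "element_overlap")
         for i in range(100)]
train_set, test_set = split_samples(pairs)
# len(train_set) == 80, len(test_set) == 20
```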
8. The method of claim 7, wherein the obtaining a plurality of screenshot samples comprises:
loading sample pages in a plurality of different browsers;
and calling an interface corresponding to each browser to access the sample page displayed in that browser, and capturing the sample page displayed in each browser region by region to obtain the plurality of screenshot samples.
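Claim 8's region-by-region capture could be realised by taking a full-page screenshot through each browser's interface (e.g. a WebDriver) and slicing it into regions; only the slicing step is sketched here, on an in-memory array. The fixed region height of 600 pixels is an assumption.

```python
import numpy as np

def slice_into_regions(full_page, region_height=600):
    """Cut a full-page screenshot into fixed-height regional screenshots.

    `region_height` is an assumption; the patent says only that the page is
    captured region by region.
    """
    H = full_page.shape[0]
    return [full_page[top:top + region_height]
            for top in range(0, H, region_height)]

page = np.zeros((2000, 1280, 3), dtype=np.uint8)  # stand-in full-page capture
regions = slice_into_regions(page)
# 4 regions, with heights 600, 600, 600, 200
```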
9. The method of claim 7, wherein labeling a semantic tag for a first screenshot sample of the plurality of screenshot samples comprises:
in response to a label labeling operation triggered by a user for the first screenshot sample, associating at least one semantic label designated by the label labeling operation with the first screenshot sample; or
taking the first screenshot sample as input and executing an image multi-label labeling model to obtain at least one semantic label corresponding to the first screenshot sample.
10. The method of any of claims 7 to 9, wherein the acquiring a training sample set further comprises:
performing image adjustment on a screenshot sample serving as a training sample;
and adding the screenshot sample with the adjusted image as a new training sample to the training sample set.
11. The method of claim 10, wherein performing image adjustment on a screenshot sample that is a training sample comprises:
changing the size of the screenshot sample; and/or
changing the shape of the screenshot sample; and/or
cropping a region covering 85%-90% of the area of the screenshot sample; and/or
changing the orientation of the screenshot sample; and/or
changing the color of the screenshot sample.
12. The method of claim 11, wherein changing the color of the screenshot samples comprises:
changing the brightness of the screenshot sample; and/or
changing the saturation of the screenshot sample; and/or
changing the contrast of the screenshot sample.
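The augmentations of claims 10–12 (resizing, cropping 85%–90% of the area, flipping, and brightness/contrast changes) can be sketched with plain array operations. The specific factors (1.2× brightness, 1.5× contrast, 87.5% crop area) are illustrative assumptions within the claimed ranges.

```python
import numpy as np

rng = np.random.default_rng(7)
sample = rng.integers(0, 256, (100, 100, 3)).astype(np.float64)  # toy screenshot

def crop_area_fraction(img, frac=0.875):
    """Crop the centered region covering `frac` of the original area
    (the patent specifies 85%-90%)."""
    H, W = img.shape[:2]
    s = frac ** 0.5                      # per-axis scale so area fraction is frac
    nh, nw = int(H * s), int(W * s)
    top, left = (H - nh) // 2, (W - nw) // 2
    return img[top:top + nh, left:left + nw]

flipped    = sample[:, ::-1]                              # change orientation (mirror)
cropped    = crop_area_fraction(sample)                   # ~86% of the area
brightened = np.clip(sample * 1.2, 0, 255)                # change brightness
contrasted = np.clip((sample - 128) * 1.5 + 128, 0, 255)  # change contrast
```

Each augmented array would then be added to the training sample set as a new training sample carrying the same semantic labels as the original.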
13. The method of claim 6, wherein training the training model with the training sample set to obtain a trained model comprises:
preprocessing a first training sample in the training sample set to obtain first sample data;
taking the first sample data as input of a training model, and executing the training model to obtain a first result;
according to the difference between the numerical value of the semantic label corresponding to the training sample and the first result, optimizing the parameters in the training model;
preprocessing a second training sample in the training sample set, taking the preprocessed second training sample as input of the training model, and executing the training model to obtain a second result, and so on, until the difference between the numeric values of the semantic labels corresponding to the training samples in the training sample set and the results obtained by executing the training model meets a preset condition, thereby obtaining the trained model.
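The iterate-until-converged loop of claim 13 — run the model on each sample, compare the result with the numeric value of the sample's semantic label, and adjust parameters until the difference meets a preset condition — is shown below on a deliberately tiny one-parameter "model" trained by gradient descent on squared error. Everything about the model itself is a stand-in; only the loop structure mirrors the claim.

```python
# (preprocessed input, numeric value of the semantic label) pairs
samples = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
w, lr, tol = 0.0, 0.1, 1e-6          # parameter, learning rate, preset condition

for _ in range(1000):
    worst = max(abs(w * x - y) for x, y in samples)
    if worst < tol:                  # difference meets the preset condition
        break
    for x, y in samples:             # optimise the parameter per sample
        w -= lr * 2 * (w * x - y) * x
# w converges to 2.0
```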
14. The method of claim 6, wherein testing the trained model with the set of test samples to obtain the accuracy of the trained model comprises:
taking each test sample in the test sample set as input, and executing the trained model to obtain an execution result corresponding to each test sample;
and calculating, as the accuracy, the proportion of the test samples in the test sample set whose execution results are consistent with the numeric values of their corresponding semantic tags to the total number of test samples.
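The accuracy computation of claim 14 is simply matches over total, as in this sketch:

```python
def accuracy(predictions, label_values):
    """Fraction of test samples whose execution result matches the numeric
    value of the corresponding semantic tag."""
    matches = sum(p == y for p, y in zip(predictions, label_values))
    return matches / len(label_values)

acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
# acc == 0.75
```

A trained model would be kept as the convolutional neural network model only if this accuracy meets the required threshold.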
15. A method for testing page compatibility, comprising:
receiving at least one screenshot to be tested of a page to be tested, which is uploaded by a client and displayed in a browser;
extracting abstract features of each screenshot to be tested from the at least one screenshot to be tested by using a convolutional neural network model;
according to the abstract features of each screenshot to be tested, determining the compatibility of the page to be tested in the browser;
feeding back a determination result to the client;
wherein a first screenshot to be tested is any one of the at least one screenshot to be tested, and extracting the abstract features of the first screenshot to be tested from the first screenshot to be tested by using the convolutional neural network model comprises: convolving image data of the first screenshot to be tested with a first convolution kernel to obtain a first convolution result; performing a pooling operation on the first convolution result and then convolving the pooled result with a second convolution kernel to obtain a second convolution result; performing a pooling operation on the second convolution result and then convolving the pooled result with a third convolution kernel to obtain a third convolution result; convolving the third convolution result with a fourth convolution kernel to obtain a fourth convolution result; convolving the fourth convolution result with a fifth convolution kernel to obtain a fifth convolution result; performing a pooling operation on the fifth convolution result and performing at least one full-connection operation on the pooled fifth convolution result to obtain a sixth full-connection result; and classifying the sixth full-connection result to obtain the abstract features of the first screenshot to be tested.
16. The method of claim 15, further comprising:
preprocessing the first screenshot to be tested to obtain the image data;
and taking the image data as input of the convolutional neural network model, and executing the convolutional neural network model to obtain abstract features of the first screenshot to be tested.
17. The method of claim 16, wherein preprocessing the first screenshot to be tested comprises:
and adjusting the first screenshot to be tested into three-channel RGB image data with a set size.
18. The method according to any one of claims 15 to 17, wherein determining compatibility of the page under test in the browser according to the abstract features of each screenshot under test comprises:
acquiring semantic tags corresponding to the abstract features of each screenshot to be tested according to the corresponding relation between the abstract features and the semantic tags;
and determining the compatibility of the page to be tested in the browser by judging whether the semantic tags corresponding to the respective screenshots to be tested belong to abnormal-class tags.
19. The method according to claim 18, wherein determining the compatibility of the page to be tested in the browser by determining whether the semantic tags corresponding to the respective screenshot to be tested belong to abnormal class tags comprises:
if any one of the semantic tags corresponding to the screenshots to be tested belongs to an abnormal class, determining that the page to be tested is not compatible with the browser.
20. The method according to any one of claims 15 to 17, further comprising:
acquiring a training sample set and a test sample set;
training the training model by using the training sample set to obtain a trained model;
testing the trained model by using the test sample set to obtain the accuracy of the trained model;
and taking the trained model with the accuracy meeting the requirement as the convolutional neural network model.
21. The method of claim 20, wherein the obtaining a training sample set and a test sample set comprises:
receiving a plurality of screenshot samples uploaded by a client;
labeling semantic tags for each screenshot sample;
separating the plurality of screenshot samples into one part used as training samples and another part used as test samples;
establishing the training sample set based on the training samples and semantic tags of the training samples;
and establishing the test sample set based on the test sample and the semantic tags of the test sample.
22. The method of claim 21, wherein labeling a semantic tag for a first screenshot sample of the plurality of screenshot samples comprises:
after receiving, from the client, at least one semantic tag designated by a user for the first screenshot sample, associating the at least one semantic tag with the first screenshot sample; or
taking the first screenshot sample as input and executing an image multi-label labeling model to obtain at least one semantic label corresponding to the first screenshot sample.
23. The method of claim 21, wherein the acquiring a training sample set further comprises:
performing image adjustment on a screenshot sample serving as a training sample;
and adding the screenshot sample with the adjusted image as a new training sample to the training sample set.
24. The method of claim 23, wherein performing image adjustment on the screenshot sample as a training sample comprises:
changing the size of the screenshot sample; and/or
changing the shape of the screenshot sample; and/or
cropping a region covering 85%-90% of the area of the screenshot sample; and/or
changing the orientation of the screenshot sample; and/or
changing the color of the screenshot sample.
25. The method of claim 20, wherein training the training model with the training sample set to obtain a trained model comprises:
preprocessing a first training sample in the training sample set to obtain first sample data;
taking the first sample data as input of a training model, and executing the training model to obtain a first result;
according to the difference between the numerical value of the semantic label corresponding to the training sample and the first result, optimizing the parameters in the training model;
preprocessing a second training sample in the training sample set, taking the preprocessed second training sample as input of the training model, and executing the training model to obtain a second result until the difference between the numerical value of the semantic label corresponding to the training sample in the training sample set and the result obtained by executing the training model meets a preset condition, thereby obtaining the trained model.
26. The method of claim 20, wherein testing the trained model with the set of test samples to obtain the accuracy of the trained model comprises:
taking each test sample in the test sample set as input, and executing the trained model to obtain an execution result corresponding to each test sample;
and calculating, as the accuracy, the proportion of the test samples in the test sample set whose execution results are consistent with the numeric values of their corresponding semantic tags to the total number of test samples.
27. A method for testing page compatibility, comprising:
loading a page to be tested in a browser;
capturing the page to be tested region by region to obtain at least one screenshot to be tested;
uploading the at least one screenshot to be tested to a server to determine the compatibility of the page to be tested in the browser by the server;
wherein the determination is based on abstract features of each screenshot to be tested, the abstract features being extracted from the at least one screenshot to be tested by using a convolutional neural network model; a first screenshot to be tested is any one of the at least one screenshot to be tested; and extracting the abstract features of the first screenshot to be tested from the first screenshot to be tested by using the convolutional neural network model comprises: convolving image data of the first screenshot to be tested with a first convolution kernel to obtain a first convolution result; performing a pooling operation on the first convolution result and then convolving the pooled result with a second convolution kernel to obtain a second convolution result; performing a pooling operation on the second convolution result and then convolving the pooled result with a third convolution kernel to obtain a third convolution result; convolving the third convolution result with a fourth convolution kernel to obtain a fourth convolution result; convolving the fourth convolution result with a fifth convolution kernel to obtain a fifth convolution result; performing a pooling operation on the fifth convolution result and performing at least one full-connection operation on the pooled fifth convolution result to obtain a sixth full-connection result; and classifying the sixth full-connection result to obtain the abstract features of the first screenshot to be tested.
28. The method as recited in claim 27, further comprising:
displaying a determination result fed back by the server; and/or
outputting the determination result fed back by the server in semantic form; and/or
outputting an alarm prompt when the determination result is that the page to be tested is not compatible with the browser.
29. The method according to claim 27 or 28, further comprising:
loading sample pages in a plurality of different browsers;
calling an interface corresponding to each browser to access the sample page displayed in each browser, and capturing the sample page displayed in each browser region by region to obtain a plurality of screenshot samples;
uploading the screenshot samples to a server to train and test a training model by the server by using the screenshot samples to obtain an extraction model;
the abstract features of each screenshot to be tested are obtained by executing the extraction model by taking each screenshot to be tested as an input parameter.
30. The method as recited in claim 29, further comprising:
and responding to label labeling operation triggered by a user for each screenshot sample, uploading semantic labels labeled by the user for each screenshot sample to the server, establishing a training sample set and a test sample set by the server based on each screenshot sample and the semantic labels corresponding to each screenshot sample, and training and testing the training model by using the training sample set and the test sample set to obtain the extraction model.
31. A page compatibility test system, comprising:
the client is used for loading the page to be tested in the browser; capturing the page to be tested region by region to obtain at least one screenshot to be tested; and uploading the at least one screenshot to be tested to a server;
the server side is used for receiving the at least one screenshot to be tested of the page to be tested displayed in the browser and uploaded by the client; extracting abstract features of each screenshot to be tested from the at least one screenshot to be tested by using a convolutional neural network model; determining the compatibility of the page to be tested in the browser according to the abstract features of each screenshot to be tested, and feeding back a determination result to the client;
wherein a first screenshot to be tested is any one of the at least one screenshot to be tested, and extracting the abstract features of the first screenshot to be tested from the first screenshot to be tested by using the convolutional neural network model comprises: convolving image data of the first screenshot to be tested with a first convolution kernel to obtain a first convolution result; performing a pooling operation on the first convolution result and then convolving the pooled result with a second convolution kernel to obtain a second convolution result; performing a pooling operation on the second convolution result and then convolving the pooled result with a third convolution kernel to obtain a third convolution result; convolving the third convolution result with a fourth convolution kernel to obtain a fourth convolution result; convolving the fourth convolution result with a fifth convolution kernel to obtain a fifth convolution result; performing a pooling operation on the fifth convolution result and performing at least one full-connection operation on the pooled fifth convolution result to obtain a sixth full-connection result; and classifying the sixth full-connection result to obtain the abstract features of the first screenshot to be tested.
32. An electronic device, comprising: a first memory and a first processor, wherein,
the first memory is used for storing programs;
the first processor is coupled to the first memory for executing the program stored in the first memory for:
acquiring at least one screenshot to be tested of a page to be tested displayed in a browser;
extracting abstract features of each screenshot to be tested from the at least one screenshot to be tested by using a convolutional neural network model;
according to the abstract features of each screenshot to be tested, determining the compatibility of the page to be tested in the browser;
wherein a first screenshot to be tested is any one of the at least one screenshot to be tested, and extracting the abstract features of the first screenshot to be tested from the first screenshot to be tested by using the convolutional neural network model comprises: convolving image data of the first screenshot to be tested with a first convolution kernel to obtain a first convolution result; performing a pooling operation on the first convolution result and then convolving the pooled result with a second convolution kernel to obtain a second convolution result; performing a pooling operation on the second convolution result and then convolving the pooled result with a third convolution kernel to obtain a third convolution result; convolving the third convolution result with a fourth convolution kernel to obtain a fourth convolution result; convolving the fourth convolution result with a fifth convolution kernel to obtain a fifth convolution result; performing a pooling operation on the fifth convolution result and performing at least one full-connection operation on the pooled fifth convolution result to obtain a sixth full-connection result; and classifying the sixth full-connection result to obtain the abstract features of the first screenshot to be tested.
33. A server device, comprising: a second memory and a second processor, wherein,
the second memory is used for storing programs;
the second processor is coupled with the second memory, and is configured to execute the program stored in the second memory, for:
receiving at least one screenshot to be tested of a page to be tested, which is uploaded by a client and displayed in a browser;
extracting abstract features of each screenshot to be tested from the at least one screenshot to be tested by using a convolutional neural network model;
according to the abstract features of each screenshot to be tested, determining the compatibility of the page to be tested in the browser;
feeding back a determination result to the client;
wherein a first screenshot to be tested is any one of the at least one screenshot to be tested, and extracting the abstract features of the first screenshot to be tested from the first screenshot to be tested by using the convolutional neural network model comprises: convolving image data of the first screenshot to be tested with a first convolution kernel to obtain a first convolution result; performing a pooling operation on the first convolution result and then convolving the pooled result with a second convolution kernel to obtain a second convolution result; performing a pooling operation on the second convolution result and then convolving the pooled result with a third convolution kernel to obtain a third convolution result; convolving the third convolution result with a fourth convolution kernel to obtain a fourth convolution result; convolving the fourth convolution result with a fifth convolution kernel to obtain a fifth convolution result; performing a pooling operation on the fifth convolution result and performing at least one full-connection operation on the pooled fifth convolution result to obtain a sixth full-connection result; and classifying the sixth full-connection result to obtain the abstract features of the first screenshot to be tested.
34. A client device, comprising: a third memory and a third processor, wherein,
the third memory is used for storing programs;
the third processor is coupled with the third memory, and is configured to execute the program stored in the third memory, for:
loading a page to be tested in a browser;
capturing the page to be tested region by region to obtain at least one screenshot to be tested;
uploading the at least one screenshot to be tested to a server to determine the compatibility of the page to be tested in the browser by the server;
wherein the determination is based on abstract features of each screenshot to be tested, the abstract features being extracted from the at least one screenshot to be tested by using a convolutional neural network model; a first screenshot to be tested is any one of the at least one screenshot to be tested; and extracting the abstract features of the first screenshot to be tested from the first screenshot to be tested by using the convolutional neural network model comprises: convolving image data of the first screenshot to be tested with a first convolution kernel to obtain a first convolution result; performing a pooling operation on the first convolution result and then convolving the pooled result with a second convolution kernel to obtain a second convolution result; performing a pooling operation on the second convolution result and then convolving the pooled result with a third convolution kernel to obtain a third convolution result; convolving the third convolution result with a fourth convolution kernel to obtain a fourth convolution result; convolving the fourth convolution result with a fifth convolution kernel to obtain a fifth convolution result; performing a pooling operation on the fifth convolution result and performing at least one full-connection operation on the pooled fifth convolution result to obtain a sixth full-connection result; and classifying the sixth full-connection result to obtain the abstract features of the first screenshot to be tested.
CN201810215223.8A 2018-03-15 2018-03-15 Page compatibility testing method, system and equipment Active CN110275820B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810215223.8A CN110275820B (en) 2018-03-15 2018-03-15 Page compatibility testing method, system and equipment

Publications (2)

Publication Number Publication Date
CN110275820A CN110275820A (en) 2019-09-24
CN110275820B true CN110275820B (en) 2023-11-21

Family

ID=67958108

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807007B (en) * 2019-09-30 2022-06-24 支付宝(杭州)信息技术有限公司 Target detection model training method, device and system and storage medium
CN110851349B (en) * 2019-10-10 2023-12-26 岳阳礼一科技股份有限公司 Page abnormity display detection method, terminal equipment and storage medium
CN113506291B (en) * 2021-07-29 2024-03-26 上海幻电信息科技有限公司 Compatibility testing method and device

Citations (5)

Publication number Priority date Publication date Assignee Title
CN106326091A (en) * 2015-06-24 2017-01-11 深圳市腾讯计算机系统有限公司 Browser webpage compatibility detection method and system
CN106407119A (en) * 2016-09-28 2017-02-15 浪潮软件集团有限公司 Browser compatibility testing method based on automatic testing
CN106681917A (en) * 2016-12-21 2017-05-17 南京大学 Method for automatically evaluating front ends on basis of neural networks
CN106886491A (en) * 2017-01-17 2017-06-23 博彦科技股份有限公司 Webpage compatibility test method and device
CN107729249A (en) * 2017-10-18 2018-02-23 北京奇虎科技有限公司 Browser compatibility method of testing, device, system and electronic equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8996988B2 (en) * 2009-10-19 2015-03-31 Browsera, LLC Automated application compatibility testing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Automated cross-browser compatibility testing; Ali Mesbah et al.; 2011 33rd International Conference on Software Engineering (ICSE); 2011-10-10; pp. 561-570 *
Automated detection of cross-browser compatibility of Web application interfaces; Wang Huanhuan et al.; Computer Science; 2015-11-15; pp. 453-458 *
Capture/replay-based cross-browser compatibility detection for Web applications; Wu Guoquan et al.; Journal of Computer Research and Development; 2017-03-15; pp. 623-632 *

Also Published As

Publication number Publication date
CN110275820A (en) 2019-09-24

Similar Documents

Publication Publication Date Title
CN111178456B (en) Abnormal index detection method and device, computer equipment and storage medium
US11605226B2 (en) Video data processing method and apparatus, and readable storage medium
CN110458107B (en) Method and device for image recognition
WO2020041399A1 (en) Image processing method and apparatus
CN105590099B (en) A kind of more people&#39;s Activity recognition methods based on improvement convolutional neural networks
CN110275820B (en) Page compatibility testing method, system and equipment
CN108229418B (en) Human body key point detection method and apparatus, electronic device, storage medium, and program
Alkhudaydi et al. An exploration of deep-learning based phenotypic analysis to detect spike regions in field conditions for UK bread wheat
CN114861836B (en) Model deployment method based on artificial intelligence platform and related equipment
CN108062377A (en) The foundation of label picture collection, definite method, apparatus, equipment and the medium of label
CN111709371B (en) Classification method, device, server and storage medium based on artificial intelligence
Mann et al. Automatic flower detection and phenology monitoring using time‐lapse cameras and deep learning
CN112598294A (en) Method, device, machine readable medium and equipment for establishing scoring card model on line
CN108401106B (en) Shooting parameter optimization method and device, terminal and storage medium
CN113673618A (en) Tobacco insect target detection method fused with attention model
CN107423304A (en) Term sorting technique and device
CN111401438B (en) Image sorting method, device and system
CN112995690A (en) Live content item identification method and device, electronic equipment and readable storage medium
CN112836724A (en) Object defect recognition model training method and device, electronic equipment and storage medium
CN110363245B (en) Online classroom highlight screening method, device and system
CN108074240B (en) Recognition method, recognition apparatus, computer-readable storage medium, and program product
CN107135402A (en) A kind of method and device for recognizing TV station&#39;s icon
CN113780335B (en) Small sample commodity image classification method, device, equipment and storage medium
CN113377640B (en) Method, medium, device and computing equipment for explaining model under business scene
CN115631374A (en) Control operation method, control detection model training method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant