CN112115043A - Image-based on-end intelligent page quality inspection method - Google Patents

Info

Publication number
CN112115043A
Authority
CN
China
Prior art keywords
page
resource
empty
elements
strip
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010807204.1A
Other languages
Chinese (zh)
Other versions
CN112115043B (en
Inventor
陈琴
周小群
邓水光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202010807204.1A (patent CN112115043B)
Publication of CN112115043A
Application granted
Publication of CN112115043B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3688 Test management for test execution, e.g. scheduling of test suites
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0641 Shopping interfaces

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • General Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Quality & Reliability (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image-based on-end intelligent page quality inspection method. The method first obtains the rectangular bounding-box coordinates of DOM elements, screenshots of normal and abnormal pages, and the bounding-box coordinates of the abnormal regions, and uses them to train page anomaly classification and detection models and an element semantic recognition model. It then obtains the position information of all DOM elements and a screenshot of the page to be inspected, pre-sorts candidate modules of each category according to the distribution characteristics of the DOM elements, and recognizes common modules with the element semantic recognition model. Empty pit problems and text overlap problems (overlapping or covered copy text) are detected directly with the trained detection models; empty screen problems are recognized from the number of valid SURF features; empty floor problems are identified from the distance between adjacent floor titles; and duplicate element problems are judged by computing the similarity of commodity resource positions. The invention can free testers from repetitive manual testing, and it markedly reduces cost and improves efficiency in both testing and online monitoring and inspection.

Description

Image-based on-end intelligent page quality inspection method
Technical Field
The invention belongs to the technical fields of software testing and computer vision, and particularly relates to an image-based on-end intelligent page quality inspection method.
Background
With the rapid development of the mobile internet, commodity information service systems have moved from the PC era to the wireless era. In e-commerce, marketing and promotion activities are varied and change frequently, so the publishing of H5 venue pages for these activities allows marketing rule makers (non-technical staff) to create or change configurations at any time according to operational and marketing needs. This flexibility raises the risk to online quality, and the online quality of these pages often directly affects user experience, the effect of the marketing campaign, and even transactions. Quality assurance is made much harder by how widely dispersed the pages are. In particular, during a major promotion period, hundreds or even thousands of H5 activity/venue pages go through thousands of changes and releases, and manual testing is extremely costly (communication among many people, switching between platforms, fragmented and scattered tasks). A general, low-cost scheme is therefore needed to effectively detect common serious problems that may occur on pages, such as empty screens, empty windows (some resource positions are missing), and "empty floors" (module content is missing), so as to efficiently ensure the online quality of the pages.
Disclosure of Invention
The invention aims to provide an image-based on-end intelligent page quality inspection method that addresses the shortcomings of the prior art.
The purpose of the invention is achieved by the following technical solution: an image-based on-end intelligent page quality inspection method comprising the following steps:
S1: Acquire DOM element position information and page screenshots of pages, where the page screenshots include normal-state and abnormal-state page screenshots. The abnormal-state screenshots include page screenshots containing empty pits and page screenshots containing text overlap problems (overlapping or covered copy text); the DOM element position information corresponding to the empty pit areas and the text overlap areas is additionally acquired.
S2: Train a strip module classification model, a resource position module classification model, an empty pit anomaly detection model, and a text overlap recognition model from the DOM element position information and page screenshots obtained in step S1.
The strip module classification model and the resource position module classification model are trained as follows: strip elements and resource position elements are screened and cropped according to the DOM element position information and page screenshots obtained in step S1. A deep residual network or an automated machine learning technique is trained on the strip element ROIs and their categories to obtain the strip module classification model, whose input is a strip element and whose output is the strip element category. Likewise, a deep residual network or an automated machine learning technique is trained on the resource position elements and their categories to obtain the resource position module classification model, whose input is a resource position element and whose output is the resource position element category.
The strip element categories include the elevator navigation module, floor title, bottom Tab navigation bar, and other; the resource position element categories include commodity resource position, card/coupon, store, and other.
The empty pit anomaly detection model is trained as follows: after S1 has been executed in batch, the page screenshots containing empty pits obtained in S1 and the bounding-box coordinate labels of the DOM elements corresponding to the empty pits first go through a round of manual review, in which samples whose label is not actually an empty pit are deleted and unlabeled empty pit areas are added. Using the reviewed screenshots and their labeled bounding boxes as sample data, an R-FCN object detection framework is trained to obtain the empty pit detection model, whose input is an image and whose output is whether the image contains empty pits.
The text overlap recognition model is trained as follows: after the page screenshots containing text overlap problems and the DOM element position information of the problem areas have been obtained in step S1, an R-FCN object detection framework is trained to obtain the text overlap detection model, whose input is a page screenshot and whose output is whether text overlap exists.
S3: Acquire the DOM element position information and a screenshot of the page to be inspected, and screen and crop the strip elements and resource position elements to be inspected.
S4: Classify the strip elements and resource position elements obtained in step S3 with the strip module classification model and the resource position module classification model obtained in step S2, respectively.
S5: Identify the different page problems, specifically:
(1) For the empty pit problem, use the empty pit anomaly detection model obtained in step S2 to detect whether the page screenshot obtained in step S3 contains empty pits.
(2) For the empty floor problem, determine whether the number of floor-title strip elements classified in step S4 is at least 2 and the vertical distance between two adjacent floor-title strip elements is smaller than a set empty floor threshold; if so, the page to be inspected has an empty floor, otherwise it does not.
(3) For the empty screen problem, take the middle area of the page screenshot obtained in step S3 and extract the SURF feature key points in that area; if the number of extracted key points is smaller than a set key point threshold, the page to be inspected has an empty screen problem, otherwise it does not.
(4) For the duplicate material problem, compare the pairwise similarity of the commodity resource position elements classified in step S4 to determine whether different commodity resource position elements are associated with the same target object information; a similarity greater than a set threshold is abnormal.
(5) For the text overlap problem, use the text overlap recognition model obtained in step S2 to detect whether the page screenshot obtained in step S3 contains text overlap.
Further, the strip module classification model and the resource position module classification model are used to identify the different categories of strip elements and resource position elements in the page and to set the corresponding interactive control positions. The positions and order of the interactive controls to be checked are configured; according to the configuration, a click on the specified interactive control position in the page is simulated, and the DOM element position information and page screenshot of the page after the click are acquired. If the specified interactive control position does not exist in the screenshot of the current screen, a scroll-down operation is performed to reach the next screen, the click on the specified interactive control position is simulated again, and the DOM element position information and page screenshot after the click are acquired.
Further, identifying the different categories of strip elements and resource position elements in the page with the strip module classification model and the resource position module classification model and setting the corresponding interactive control positions is specifically:
(a) the positions of the identified resource position elements of the commodity resource position, card/coupon, and store categories are used directly as the interactive control positions of those categories;
(b) OCR is applied to the identified strip elements of the elevator navigation module to obtain the positions of the interactive controls for reaching a given floor, and a template matching method is used to obtain the position of the arrow button control that expands the full contents of the elevator navigation module;
(c) OCR is applied to the identified strip elements of the bottom Tab navigation bar to obtain the interactive control positions for jumping to other column pages.
Further, the template matching method is the SURF feature extraction algorithm combined with FLANN.
Further, the DOM element position information includes the relative coordinates of the DOM element's rectangular bounding box in the page screenshot, the aspect ratio of the bounding box, the ratio of each side of the bounding box to the corresponding side of the screenshot, and the area of the bounding box.
Further, the screening condition for strip elements is w/h ∈ (5, 6) and W/w ∈ [1, 3); the screening conditions for resource position elements are h/w < 3, w/h < 3.5, and 15W < h·w < 150W, where w is the width of the element, h is the height of the element, and W is the width of the image containing the element.
Further, the empty floor threshold in step S5 is 10-20 pixels.
Further, the height and width of the middle area in step S5 are greater than half of the height and width, respectively, of the screenshot of the page to be inspected.
Further, the key point threshold in step S5 is 10-100.
Further, the pairwise similarity comparison in step S5 of the commodity resource position elements classified in step S4 is specifically: extract the HOG features of the commodity resource position elements classified in step S4, and compute the cosine similarity between the HOG features of every two commodity resource position elements.
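The pairwise comparison described in the last clause can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the HOG feature vectors have already been extracted by some image library and are passed in as plain lists of numbers. The function names `cosine_similarity` and `find_duplicate_pairs` and the default threshold of 0.9 are hypothetical; the text only specifies "a set threshold".

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)


def find_duplicate_pairs(features, threshold=0.9):
    """Return index pairs (i, j), i < j, whose similarity exceeds the threshold.

    `features` holds one precomputed HOG vector per commodity resource
    position element; the threshold value is an illustrative assumption.
    """
    return [
        (i, j)
        for (i, fa), (j, fb) in combinations(enumerate(features), 2)
        if cosine_similarity(fa, fb) > threshold
    ]
```

Any pair returned by `find_duplicate_pairs` would be reported as a candidate duplicate material problem; the comparison is quadratic in the number of elements, which is acceptable for the handful of resource positions visible in one screenshot.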
Beneficial effects of the invention: the invention is a general automated inspection solution for the common page quality problems of e-commerce activity venue pages, applicable both to new offline-store businesses and to traditional e-commerce businesses. Its ability to check common page anomalies imitates how a tester checks them visually. By detecting and identifying the positions of interactive controls and clicking them through a JavaScript script, it achieves view changes within a page, such as position switching and floating-layer pop-ups, as well as automatic jumps between pages, which saves testers' operating cost. The invention thus achieves automatic online page quality inspection: during periods of concentrated e-commerce activity with large-scale online page changes, it frees testers from repetitive manual testing and markedly reduces cost and improves efficiency in testing and in online monitoring and inspection.
Drawings
FIG. 1 is a schematic flow chart of the image-based intelligent page quality inspection method of the invention;
FIG. 2 is a schematic diagram of an application example of the image-based intelligent page quality inspection method;
FIG. 3 is a schematic diagram of the abnormal problems identified by the method, wherein (a) shows the empty pit problem; (b) shows the empty screen problem; (c) shows the empty floor problem; and (d) shows the duplicate material problem.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments that can be derived from the embodiments given herein by a person of ordinary skill in the art are intended to be within the scope of the present disclosure.
The invention is an image-based on-end intelligent page quality inspection method that classifies common page modules and identifies abnormal page states using several learned image classification models and image object detection models. The strip module classification model, the resource position module classification model, and the text overlap recognition model are image classification models, and the empty pit anomaly detection model is an image object detection model. Each learning-based model has a training stage and a testing stage (that is, an application stage). As shown in FIG. 1, the method comprises the following steps:
S1: Run a script on the mobile device to save the DOM element position information and screenshots of the page.
In the training stage of the learning-based models, a JavaScript script running on the mobile device saves the position information of all DOM elements within the currently displayed screen range of the H5 page, saves the corresponding screenshot, and controls page scrolling. The DOM element position information includes the relative coordinates of the DOM element's rectangular bounding box in the screenshot, the aspect ratio of the bounding box, the ratio of each side of the bounding box to the corresponding side of the screenshot, the area of the bounding box, and other features. The screenshots include normal-state page screenshots and abnormal-state page screenshots produced by deliberately modifying the HTML structure and CSS of page elements. The abnormal-state screenshots include page screenshots containing empty pits and images containing text overlap problems (overlapping or covered copy text); for these, the JavaScript script additionally saves the bounding-box coordinates of the DOM elements corresponding to the empty pits and the labeled bounding-box sample data of the DOM elements corresponding to the text overlap areas. An empty pit means that the content of a resource position is empty or that a resource position is missing. For example, the content of a resource position is empty when the main information in a commodity or store display area, such as the picture, is lost; a resource position is missing when an area uses a standard module that displays three resource positions in a row but only two are actually displayed.
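As a rough illustration of the position information described above, the sketch below derives the listed features from a single bounding box. It is written in Python for consistency with the other examples here, although the collection script in the source runs as JavaScript on the device. The `DomBox` type, the `position_features` function, and the dictionary keys are hypothetical names, and reading "relative coordinates" as coordinates normalized by the screenshot size is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomBox:
    """Bounding box of one DOM element, in screenshot pixels (hypothetical type)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def position_features(box, shot_w, shot_h):
    """Derive the position features of one bounding box inside a screenshot."""
    if min(box.w, box.h, shot_w, shot_h) <= 0:
        raise ValueError("sizes must be positive")
    return {
        # assumption: "relative coordinates" means normalized by screenshot size
        "rel_x": box.x / shot_w,
        "rel_y": box.y / shot_h,
        "aspect_ratio": box.w / box.h,
        "width_ratio": box.w / shot_w,
        "height_ratio": box.h / shot_h,
        "area": box.w * box.h,
    }
```

For example, `position_features(DomBox(0, 100, 375, 75), 750, 1500)` describes a box that spans half the screenshot width with an aspect ratio of 5.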
S2: After the image data of normal and abnormal page states has been labeled or accumulated, train the strip module classification model, the resource position module classification model, the empty pit anomaly detection model, and the text overlap recognition model.
The strip module classification model and the resource position module classification model are trained as follows:
(1) Using the screenshots output by S1 and the position information of all DOM elements within each screenshot, screen out the strip element ROIs (Regions of Interest) and resource position element ROIs that satisfy the following rule constraints. The screening condition for strip elements is w/h ∈ (5, 6) and W/w ∈ [1, 3); the screening conditions for resource position elements are h/w < 3, w/h < 3.5, and 15W < h·w < 150W, where w is the width of the element, h is the height of the element, and W is the width of the image containing the element.
(2) After the strip element ROIs from step (1) have been collected in batch, label them by element category. Common strip element categories include the elevator navigation module, floor title, bottom Tab navigation bar, and other. The resource position element ROIs are labeled in the same way; common resource position element categories include commodity resource position, card/coupon, store, and other. The elevator navigation module lets the user tap a category to jump to the corresponding floor further down the page; the "other" categories are divided according to the common page layouts and common modules of different businesses.
(3) Using the labeled batch data from step (2), train a mature deep residual network (ResNet) or an automated machine learning (AutoML) technique to obtain the strip module classification model and the resource position module classification model, respectively. The input of the strip module classification model is a strip element and its output is the strip element category; the input of the resource position module classification model is a resource position element and its output is the resource position element category.
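The rule-based screening in step (1) can be sketched as below. This is an illustrative reading, not a definitive implementation: it assumes the strip condition is w/h ∈ (5, 6) with W/w ∈ [1, 3), and the resource position conditions are h/w < 3, w/h < 3.5, and 15W < h·w < 150W, with w and h the element width and height and W the image width. The function names are hypothetical.

```python
def is_strip_element(w, h, page_w):
    """Strip candidate: w/h in (5, 6) and W/w in [1, 3)."""
    if w <= 0 or h <= 0:
        return False
    return 5 < w / h < 6 and 1 <= page_w / w < 3


def is_resource_position_element(w, h, page_w):
    """Resource position candidate: h/w < 3, w/h < 3.5, 15W < h*w < 150W."""
    if w <= 0 or h <= 0:
        return False
    return h / w < 3 and w / h < 3.5 and 15 * page_w < h * w < 150 * page_w


def screen_elements(boxes, page_w):
    """Split (w, h) boxes into strip candidates and resource position candidates."""
    strips = [b for b in boxes if is_strip_element(b[0], b[1], page_w)]
    resources = [
        b for b in boxes if is_resource_position_element(b[0], b[1], page_w)
    ]
    return strips, resources
```

On a 750-pixel-wide screenshot, a 700x125 box passes the strip rule, a 240x300 box passes the resource position rule, and a full-screen 750x1500 container passes neither, so only plausible module candidates reach the classifiers.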
The empty pit anomaly detection model checks whether a page contains empty pits or dropped pits, and it is trained as follows: after S1 has been executed in batch, the page screenshots containing empty pits obtained in S1 and the bounding-box coordinate labels of the DOM elements corresponding to the empty pits first go through a round of manual review, in which samples whose label is not actually an empty pit are deleted and unlabeled empty pit areas are added. Using the reviewed screenshots and their labeled bounding boxes as sample data, an R-FCN object detection framework is trained to obtain the empty pit detection model, whose input is an image and whose output is whether the image contains empty pits.
The text overlap recognition model is trained as follows: after S1 has been executed in batch, the bounding-box coordinate labels of the areas with text overlap problems obtained in S1 first go through a round of manual review, in which samples that are not actually text overlap are deleted and unlabeled text overlap areas are added. Using the reviewed images containing text overlap problems and their labeled bounding boxes as sample data, an R-FCN object detection framework is trained to obtain the text overlap detection model, whose input is an image and whose output is whether text overlap exists.
S3: After all models have been trained, in the application stage, a JavaScript script running on the mobile device saves the bounding-box coordinates of all DOM elements within the currently displayed screen range of the H5 page to be inspected, together with the screenshot.
S4: Pre-classify the modules of each category according to the position distribution of the DOM elements: using the image output by S3 and the bounding-box coordinates of all DOM elements within the image, screen out candidate strip element ROIs and candidate resource position element ROIs according to features such as each bounding box's aspect ratio, area, and relative position in the image.
S5: Classify the strip elements and resource position elements obtained in step S4 with the strip module classification model and the resource position module classification model obtained in step S2, respectively; that is, recognize the semantics of the common elements.
S6: Identify the different page problems, specifically:
(1) For the empty pit problem, run the empty pit anomaly detection model obtained in step S2 on the current screenshot obtained in step S3 to determine whether empty pits exist.
(2) For the empty floor problem, if the number of floor-title strip elements output by S5 is at least 2 and the vertical distance Δy between two adjacent floor-title strip element ROIs is smaller than a threshold, usually 10-20 pixels, the page is judged to have an "empty floor", meaning that the content of an entire resource position module is empty.
(3) For the empty screen problem, take a middle region ROI of the current screenshot from S3, whose height and width are greater than 1/2 of the height and width of the screenshot, and extract the SURF feature key points within it. If the number of extracted key points is smaller than a set threshold, the page is judged to have an empty screen problem. The threshold is set according to the actual scale of the middle region ROI and the specific business scenario, and is usually 10-100.
(4) For the duplicate material problem, take all commodity resource position element ROIs output by S5 and compare the similarity of ROIs in different regions: extract the HOG features of each ROI and compute the cosine similarity of the HOG features for every pair of ROIs as that pair's similarity. A similarity threshold is set to determine whether different commodity resource position elements are associated with the same target object information; a similarity greater than the threshold is abnormal.
(5) For the text overlap problem, run the text overlap recognition model obtained in step S2 on the current screenshot obtained in step S3 to determine whether text overlap exists.
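The two rule-based checks in items (2) and (3) can be sketched as follows. This is a hedged illustration under stated assumptions: Δy is taken to be the gap between the bottom edge of one floor title and the top edge of the next, the default thresholds (15 pixels, 50 key points) are arbitrary picks from the stated 10-20 and 10-100 ranges, and the centered crop with a default fraction of 0.6 is one possible choice of middle region. SURF key point extraction itself is not shown; the key point count is assumed to come from an external image library. All function names are hypothetical.

```python
def has_empty_floor(titles, threshold_px=15):
    """Check floor titles given as (top_y, height) pairs in pixels.

    Assumption: the vertical distance is measured from the bottom edge of
    one title to the top edge of the next title below it.
    """
    if len(titles) < 2:
        return False
    ordered = sorted(titles)
    for (top_a, height_a), (top_b, _) in zip(ordered, ordered[1:]):
        if top_b - (top_a + height_a) < threshold_px:
            return True
    return False


def middle_region(shot_w, shot_h, fraction=0.6):
    """Centered crop box (left, top, right, bottom) of the screenshot.

    `fraction` must exceed 0.5 so that the region is larger than half of
    the screenshot in each dimension.
    """
    if not 0.5 < fraction <= 1.0:
        raise ValueError("fraction must be in (0.5, 1.0]")
    crop_w, crop_h = round(shot_w * fraction), round(shot_h * fraction)
    left, top = (shot_w - crop_w) // 2, (shot_h - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


def is_empty_screen(keypoint_count, keypoint_threshold=50):
    """Empty screen if the middle region yields too few SURF key points."""
    return keypoint_count < keypoint_threshold
```

In use, the screenshot would be cropped to `middle_region(...)`, key points would be extracted from the crop, and their count would be passed to `is_empty_screen`; `has_empty_floor` runs directly on the floor-title boxes produced by the strip classifier.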
If any of the abnormal problems shown in FIG. 3 is identified in S6, page quality alarm information is output. In addition, the user can configure in the system an operation path along which pages should be automatically inspected. For example, to click any store control on the current activity page, jump to the store landing page, and continue inspecting there, the user configures "store" in the system as the control category on which the script performs the next click operation.
Interactive operation points are identified according to the category of the position at which the system's JavaScript script is to perform the next interaction. Areas such as commodity resource positions, cards/coupons, and store resource positions are identified with the element semantic recognition model and used directly as interactive operation points of those categories. The ROIs of the elevator and the bottom Tab navigation bar are identified with the element semantic recognition model, and OCR and template matching are then applied within them to locate the points for direct click interaction with the module and for jumping to other main columns. Specifically:
(1) The elevator navigation module is identified with the strip module classification model of S5, and OCR (optical character recognition) is applied within the module to detect and recognize the control for a given floor. The arrow button that expands the full contents of the elevator navigation module is identified with a template matching method, such as the SURF (Speeded-Up Robust Features) feature extraction algorithm combined with FLANN (Fast Library for Approximate Nearest Neighbors).
(2) The bottom Tab navigation bar module is identified with the strip module classification model of S5. OCR is applied within it to capture the positions of key texts, which gives the positions of the controls that trigger an interaction when clicked, that is, the interactive control positions; the interaction is usually a jump to another column page.
(3) The positions of elements such as commodity resource positions, card/coupon resource positions, and store resource positions identified by the resource position module classification model of S5 can be output directly as commodity, card/coupon, and store interactive controls.
For all interactive control positions identified by category in the screen page, the system acts according to the categories configured by the user: it either clicks any one of the interactive operation points of the configured category or traverses all of them. Examples are the control for a floor in the elevator navigation module or a card ticket control that the user wishes to exercise. If no interactive operation point of the specified category exists in the screenshot, the system by default swipes down by a fixed step to reach the next screen and then repeats step S3.
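The selection rule in the preceding paragraph can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `InteractionPoint` type, the category labels, the action tuples, and the default `swipe_step` of 800 pixels are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InteractionPoint:
    category: str  # hypothetical label, e.g. "card_ticket" or "elevator_floor"
    x: int         # click position in screenshot pixels
    y: int


def plan_actions(points, category, traverse=False, swipe_step=800):
    """Return the actions for one screen given the user-configured category.

    If the category is present, click one matching point (the first) or,
    when traverse is True, every matching point. Otherwise fall back to a
    swipe-down of swipe_step pixels so the next screen can be inspected.
    """
    matches = [p for p in points if p.category == category]
    if not matches:
        return [("swipe_down", swipe_step)]
    targets = matches if traverse else matches[:1]
    return [("click", (p.x, p.y)) for p in targets]
```

After a `swipe_down` action the caller would re-acquire the screenshot and DOM positions (step S3) and call `plan_actions` again on the new screen.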
After a page is changed, the user can directly trigger the system to inspect the changed page automatically. The page interaction mode is either user-configured or the default swipe-down, and the inspection results are reported to the user in real time.
Fig. 2 is a schematic diagram of an application example of the present invention, in which the input is a screenshot of a typical mobile e-commerce venue page. DOM elements within the screenshot range are first pre-screened by features such as aspect ratio, area, and relative position, yielding two types of ROI (region of interest): strip elements and resource position elements. These ROIs are then identified by the trained strip module classification model and the trained resource position module classification model, giving the commodity resource position ROI, card ticket resource position ROI, elevator ROI, floor title ROI, bottom Tab navigation bar ROI, and ROIs of other types. Once the floor title ROIs are obtained, it can further be judged whether an empty-floor problem exists; once the commodity resource position ROIs are obtained, it can further be judged whether a duplicate-commodity problem exists. Likewise, once the elevator ROI and the bottom Tab navigation bar ROI are obtained, interactive operation points can be detected within them by OCR and template matching.
It will be apparent to those skilled in the art that modifications may be made to the embodiments described in the foregoing examples, or equivalents may be substituted for elements thereof. All modifications, equivalents and the like which come within the spirit and principle of the invention are intended to be included within the scope of the invention.

Claims (10)

1. An image-based on-end intelligent page quality inspection method, characterized by comprising the following steps:
S1: acquiring DOM element position information and page screenshots of pages, wherein the page screenshots comprise normal-state page screenshots, abnormal-state page screenshots and the like; the abnormal-state page screenshots comprise page screenshots containing empty pits and page screenshots containing the text overlap coverage problem; and additionally acquiring the DOM element position information corresponding to the empty-pit areas and the text overlap coverage areas.
S2: training a strip module classification model, a resource position module classification model, an empty-pit anomaly detection model, and a text overlap coverage identification model, respectively, according to the DOM element position information and the page screenshots obtained in step S1.
S3: acquiring DOM element position information and a screenshot of a page to be inspected, and screening and cropping out the strip elements to be inspected, the resource position elements to be inspected and the like.
S4: classifying the strip elements to be inspected and the resource position elements to be inspected obtained in step S3 using, respectively, the strip module classification model and the resource position module classification model obtained in step S2.
S5: identifying different page problems, including the empty-pit type, the empty-floor type, the blank-screen type, the repeated-material type, and the text overlap type.
2. The image-based on-end intelligent page quality inspection method according to claim 1, wherein the DOM element position information includes the relative coordinates of the rectangular bounding box of the DOM element in the page screenshot, the aspect ratio of the rectangular bounding box, the ratio of each side of the rectangular bounding box to the corresponding side of the screenshot, and the area of the rectangular bounding box.
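The position information listed in claim 2 can be derived from a bounding box and the screenshot size, for instance as sketched below. The function name, the dictionary keys, and the choice to express the relative coordinates as fractions of the screenshot size are assumptions made for illustration; the claim does not fix a representation.

```python
def bbox_features(left, top, width, height, shot_w, shot_h):
    """Describe one DOM element's bounding box relative to its screenshot.

    left/top/width/height are in screenshot pixels; shot_w/shot_h are the
    screenshot dimensions. All values are assumed to be positive.
    """
    return {
        # corners as fractions of the screenshot size (assumed convention)
        "rel_box": (
            left / shot_w,
            top / shot_h,
            (left + width) / shot_w,
            (top + height) / shot_h,
        ),
        "aspect_ratio": width / height,
        "width_ratio": width / shot_w,    # element width : screenshot width
        "height_ratio": height / shot_h,  # element height : screenshot height
        "area": width * height,
    }
```

For example, `bbox_features(0, 100, 375, 75, 750, 1500)` describes a half-width banner with an aspect ratio of 5.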
3. The image-based on-end intelligent page quality inspection method according to claim 1, wherein in step S2, the training process of the strip module classification model and the resource position module classification model is specifically: screening and cropping out strip elements and resource position elements according to the DOM element position information and the page screenshots obtained in step S1; training a deep residual network, or applying automated machine learning, on the strip elements and their corresponding categories to obtain the strip module classification model, whose input is a strip element and whose output is the category of that strip element; training a deep residual network, or applying automated machine learning, on the resource position elements and their corresponding categories to obtain the resource position module classification model, whose input is a resource position element and whose output is the category of that resource position element;
the training process of the empty-pit anomaly detection model is specifically: after S1 has been executed in batch, first performing a round of manual review on the page screenshots containing empty pits obtained in S1 and on the bounding box coordinate labels of the DOM elements corresponding to the empty pits, i.e. deleting samples whose labels are not in fact empty pits and adding empty-pit areas that were not labeled; after the reviewed page screenshots containing empty pits and their labeled bounding box sample data are obtained, training with the R-FCN object detection framework to obtain the empty-pit detection model, whose input is an image and whose output is whether the image contains empty pits;
the training process of the text overlap coverage identification model is specifically: after the page screenshots containing the text overlap coverage problem and the DOM element position information of the problem areas are obtained according to step S1, training with the R-FCN object detection framework to obtain the text overlap coverage detection model, whose input is a page screenshot and whose output is whether text overlap coverage exists.
4. The image-based on-end intelligent page quality inspection method according to claim 3, wherein the screening conditions for the strip elements are w/h ∈ (5, 6) and W/w ∈ [1, 3); the screening conditions for the resource position elements are h/w < 3, w/h < 3.5, and 15W < h·w < 150W; where w is the width of the element, h is the height of the element, and W is the width of the image in which the element is located.
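The screening conditions can be expressed as two predicates, sketched below. The sketch reads the conditions as w/h ∈ (5, 6) and W/w ∈ [1, 3) for strip elements, and h/w < 3, w/h < 3.5, and 15W < h·w < 150W for resource position elements; that reading, the function names, and the `"other"` label are assumptions of this illustration.

```python
def is_strip_element(w, h, W):
    """Strip element: aspect ratio in (5, 6), image-to-element width ratio in [1, 3)."""
    return 5 < w / h < 6 and 1 <= W / w < 3


def is_resource_position_element(w, h, W):
    """Resource position element: bounded aspect ratio and bounded area."""
    return h / w < 3 and w / h < 3.5 and 15 * W < h * w < 150 * W


def prescreen(w, h, W):
    """Assign an element to 'strip', 'resource_position', or 'other'."""
    if w <= 0 or h <= 0 or W <= 0:
        return "other"
    if is_strip_element(w, h, W):
        return "strip"
    if is_resource_position_element(w, h, W):
        return "resource_position"
    return "other"
```

Under this reading the two classes cannot overlap, because a strip element has w/h > 5 while a resource position element requires w/h < 3.5.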
5. The image-based on-end intelligent page quality inspection method according to claim 3, wherein the categories of the strip elements include elevator navigation module, floor title, bottom Tab navigation bar, and other; the categories of the resource position elements include commodity resource position, card ticket, shop, and other.
6. The image-based on-end intelligent page quality inspection method according to claim 1, wherein step S5 specifically comprises:
(1) for the empty-pit page problem, detecting whether empty pits exist in the page to be inspected obtained in step S3 using the empty-pit anomaly detection model obtained in step S2;
(2) for the empty-floor page problem, judging whether the number of floor title strip elements classified in step S4 is greater than or equal to 2 and the vertical distance between two adjacent floor title strip elements is less than a set empty-floor threshold; if so, the page to be inspected has an empty floor, otherwise it does not;
(3) for the blank-screen page problem, taking the middle area of the screenshot of the page to be inspected obtained in step S3 and extracting SURF feature keypoints in that area; if the number of extracted SURF feature keypoints is less than a set keypoint threshold, judging that the page to be inspected has the blank-screen problem, otherwise judging that it does not;
(4) for the repeated-material problem, comparing the similarity of every two commodity resource position elements classified in step S4, and judging whether there is an anomaly in which different commodity resource position elements are associated with the same target object information, the anomaly being indicated by a similarity greater than a set threshold;
(5) for the text overlap coverage page problem, detecting whether text overlap coverage exists in the screenshot of the page to be inspected obtained in step S3 using the text overlap coverage identification model obtained in step S2.
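The empty-floor rule in item (2) is simple enough to sketch directly. In the illustration below, each floor title is given as a hypothetical `(top, height)` pair in pixels, and the vertical distance is assumed to be measured from the bottom edge of one title to the top edge of the next; the claim itself does not specify which edges are used.

```python
def has_empty_floor(title_boxes, gap_threshold):
    """Return True if two adjacent floor titles are closer than gap_threshold.

    title_boxes: iterable of (top, height) pairs for elements classified as
    floor titles. Fewer than two titles can never indicate an empty floor.
    """
    ordered = sorted(title_boxes)  # sort by top coordinate
    if len(ordered) < 2:
        return False
    for (top_a, height_a), (top_b, _) in zip(ordered, ordered[1:]):
        gap = top_b - (top_a + height_a)  # space left for floor content
        if gap < gap_threshold:
            return True
    return False
```

A small gap means that one floor title is followed almost immediately by the next, so the floor between them has no visible content.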
7. The image-based on-end intelligent page quality inspection method according to claim 6, wherein the height and the width of the middle area in step S5 are respectively greater than half of the height and half of the width of the screenshot of the page to be inspected.
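One way to pick such a middle area and apply the keypoint threshold is sketched below. The centered placement, the default `fraction` of 0.6, and the function names are assumptions; SURF keypoint extraction itself is left to an image library and is represented here only by its resulting count.

```python
def central_region(shot_w, shot_h, fraction=0.6):
    """Return (left, top, right, bottom) of a centered crop.

    fraction must exceed 0.5 so that the crop is wider and taller than half
    of the screenshot, as the claim requires.
    """
    if not 0.5 < fraction <= 1.0:
        raise ValueError("fraction must be in (0.5, 1.0]")
    # Guard against rounding on tiny images: stay strictly above one half.
    region_w = max(round(shot_w * fraction), shot_w // 2 + 1)
    region_h = max(round(shot_h * fraction), shot_h // 2 + 1)
    left = (shot_w - region_w) // 2
    top = (shot_h - region_h) // 2
    return left, top, left + region_w, top + region_h


def is_blank_screen(num_keypoints, keypoint_threshold):
    """Blank screen if the middle area yields fewer keypoints than the threshold."""
    return num_keypoints < keypoint_threshold
```

The crop returned by `central_region` would be passed to the keypoint extractor, and the number of keypoints it finds would then be passed to `is_blank_screen`.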
8. The image-based on-end intelligent page quality inspection method according to claim 6, wherein, in step S5, comparing the similarity of every two commodity resource position elements classified in step S4 is specifically: extracting the HOG features of the commodity resource position elements classified in step S4, and calculating the cosine similarity between the HOG features of every two commodity resource position elements.
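The pairwise comparison can be sketched as follows, assuming the HOG feature vectors have already been extracted by an image library and are supplied as equal-length lists of numbers. The function names and the default threshold of 0.95 are illustrative assumptions, not values taken from the patent.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def find_duplicate_pairs(features, threshold=0.95):
    """Return index pairs (i, j), i < j, whose similarity exceeds threshold."""
    return [
        (i, j)
        for (i, fa), (j, fb) in combinations(enumerate(features), 2)
        if cosine_similarity(fa, fb) > threshold
    ]
```

Each returned pair identifies two commodity resource position elements that look nearly identical and are therefore candidates for the repeated-material problem.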
9. The image-based on-end intelligent page quality inspection method according to claim 1, wherein the strip module classification model and the resource position module classification model are used to identify strip elements and resource position elements of different categories in the page, and the corresponding interactive control positions are set; the interactive control positions to be inspected and their order are configured, a click on the specified interactive control position in the page is simulated according to the configuration, and the DOM element position information and the page screenshot of the page after the click are acquired; and if the specified interactive control position does not exist in the screenshot of the current screen, a swipe-down operation is executed to reach the next screen, the simulated click on the specified interactive control position is then attempted again, and the DOM element position information and the page screenshot of the page after the click are acquired.
10. The image-based on-end intelligent page quality inspection method according to claim 9, wherein using the strip module classification model and the resource position module classification model to identify strip elements and resource position elements of different categories in the page and setting the corresponding interactive control positions is specifically:
(a) the positions of the identified resource position elements of the commodity resource position, card ticket, and shop categories are used directly as the interactive control positions of the commodity resource position, card ticket, and shop categories;
(b) OCR is applied to the identified strip elements of the elevator navigation module to obtain the positions of the interactive controls for reaching a given floor in the elevator navigation module, and template matching is used to obtain the position of the interactive control of the arrow button that expands the full contents of the elevator navigation module;
(c) OCR is applied to the identified strip elements of the bottom Tab navigation bar to obtain the interactive control positions for jumping to other column pages.
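The mapping from classified elements to interactive control positions in items (a) to (c) can be sketched as below. The category labels, the `(left, top, right, bottom)` box format, and the `text_boxes` input, which stands in for OCR output, are assumptions of this illustration; arrow-button template matching is omitted.

```python
DIRECT_CATEGORIES = {"commodity", "card_ticket", "shop"}  # item (a)
OCR_CATEGORIES = {"elevator", "bottom_tab"}               # items (b) and (c)


def center(box):
    """Center point of a (left, top, right, bottom) box, in pixels."""
    left, top, right, bottom = box
    return ((left + right) // 2, (top + bottom) // 2)


def derive_control_positions(elements, text_boxes):
    """Turn classified elements into (label, click_point) pairs.

    elements: list of (category, box) from the two classification models.
    text_boxes: dict mapping an element's index to a list of (text, box)
    pairs, i.e. hypothetical OCR results found inside that strip element.
    """
    controls = []
    for index, (category, box) in enumerate(elements):
        if category in DIRECT_CATEGORIES:
            # the element itself is the clickable control
            controls.append((category, center(box)))
        elif category in OCR_CATEGORIES:
            # each recognized text region inside the strip is a control
            for text, text_box in text_boxes.get(index, []):
                controls.append((f"{category}:{text}", center(text_box)))
    return controls
```

Elements of any other category produce no controls, which matches the claim's restriction to the listed categories.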
CN202010807204.1A 2020-08-12 2020-08-12 Image-based on-end intelligent page quality inspection method Active CN112115043B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010807204.1A CN112115043B (en) 2020-08-12 2020-08-12 Image-based on-end intelligent page quality inspection method

Publications (2)

Publication Number Publication Date
CN112115043A true CN112115043A (en) 2020-12-22
CN112115043B CN112115043B (en) 2021-10-08

Family

ID=73804087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010807204.1A Active CN112115043B (en) 2020-08-12 2020-08-12 Image-based on-end intelligent page quality inspection method

Country Status (1)

Country Link
CN (1) CN112115043B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104077221A (en) * 2014-06-27 2014-10-01 百度在线网络技术(北京)有限公司 Style test method and device for front-end page
US20180157386A1 (en) * 2016-12-05 2018-06-07 Jiawen Su System and Method for detection, exploration, and interaction of graphic application interface
US20190213116A1 (en) * 2018-01-10 2019-07-11 Accenture Global Solutions Limited Generation of automated testing scripts by converting manual test cases
US20200004667A1 (en) * 2018-06-29 2020-01-02 Wipro Limited Method and system of performing automated exploratory testing of software applications
CN109086203A (en) * 2018-07-20 2018-12-25 百度在线网络技术(北京)有限公司 The detection method and device of the page
CN110968822A (en) * 2018-09-30 2020-04-07 阿里巴巴集团控股有限公司 Page detection method and device, electronic equipment and storage medium
CN110705596A (en) * 2019-09-04 2020-01-17 北京三快在线科技有限公司 White screen detection method and device, electronic equipment and storage medium
CN110674442A (en) * 2019-09-17 2020-01-10 中国银联股份有限公司 Page monitoring method, device, equipment and computer readable storage medium
CN110955590A (en) * 2019-10-15 2020-04-03 北京海益同展信息科技有限公司 Interface detection method, image processing method, device, electronic equipment and storage medium
CN111078552A (en) * 2019-12-16 2020-04-28 腾讯科技(深圳)有限公司 Method and device for detecting page display abnormity and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李旬 等: "基于异常特征的社交网页检测技术研究", 《信息网络安全》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114816610A (en) * 2021-01-29 2022-07-29 华为技术有限公司 Page classification method, page classification device and terminal equipment
WO2022160958A1 (en) * 2021-01-29 2022-08-04 华为技术有限公司 Page classification method, page classification apparatus, and terminal device
CN114816610B (en) * 2021-01-29 2024-02-02 华为技术有限公司 Page classification method, page classification device and terminal equipment

Also Published As

Publication number Publication date
CN112115043B (en) 2021-10-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant