CN111611503B - Page processing method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN111611503B
CN111611503B CN202010462451.2A CN202010462451A CN111611503B CN 111611503 B CN111611503 B CN 111611503B CN 202010462451 A CN202010462451 A CN 202010462451A CN 111611503 B CN111611503 B CN 111611503B
Authority
CN
China
Prior art keywords
skeleton
page
candidate
region
cluster set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010462451.2A
Other languages
Chinese (zh)
Other versions
CN111611503A (en)
Inventor
王亚楠
尹飞
葛鹏
薛大伟
刘兰英
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010462451.2A
Publication of CN111611503A
Application granted
Publication of CN111611503B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/954 Navigation, e.g. using categorised browsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a page processing method and device, and relates to the technical field of page processing. A specific embodiment comprises the following steps: acquiring a plurality of candidate pages and determining display interfaces of the candidate pages; extracting a skeleton structure diagram of each display interface, and identifying skeleton regions in the skeleton structure diagram; clustering the skeleton regions to generate at least one cluster set; and selecting at least one skeleton region from each cluster set of the at least one cluster set, and taking the candidate page corresponding to the at least one skeleton region as a target page. By identifying the skeleton of each page and clustering the skeleton regions, the method and device keep the detailed content of a page from interfering with the selection of target pages. In addition, the task of screening pages is reduced to the task of screening skeleton regions, so that a target page is selected for each type of skeleton region, the selected target pages comprehensively cover the various page characteristics, and the recall rate for different page types is improved.

Description

Page processing method and device, electronic equipment and storage medium
Technical Field
The embodiments of the present application relate to the technical field of computers, specifically to the technical field of webpage processing, and more particularly to a page processing method and device, electronic equipment, and a storage medium.
Background
With the development of internet technology, more and more users rely on the internet for work and leisure. Visual content on the internet is typically presented to the user in the form of pages, such as landing pages.
Testing in landing-page-related projects has found page presentation problems caused by data flow, customer operations, rendering, and the like, while the number of landing pages on a commercial platform is on the order of millions. At this scale, how to assure the quality of all pages at a controllable cost is a problem that commercial-platform landing page testing needs to solve.
Disclosure of Invention
A page processing method, a page processing device, electronic equipment and a storage medium are provided.
According to a first aspect, there is provided a page processing method, comprising: acquiring a plurality of candidate pages and determining display interfaces of the candidate pages; extracting a skeleton structure diagram of each display interface, and identifying skeleton regions in the skeleton structure diagram; clustering the skeleton regions to generate at least one cluster set; and selecting at least one skeleton region from each cluster set of the at least one cluster set, and taking the candidate page corresponding to the at least one skeleton region as a target page.
According to a second aspect, there is provided a page processing apparatus, comprising: an acquisition unit configured to acquire a plurality of candidate pages and determine display interfaces of the plurality of candidate pages; an extraction unit configured to extract a skeleton structure diagram of each display interface and identify skeleton regions in the skeleton structure diagram; a clustering unit configured to cluster the skeleton regions to generate at least one cluster set; and a selecting unit configured to select at least one skeleton region from each cluster set of the at least one cluster set, and take the candidate page corresponding to the at least one skeleton region as a target page.
According to a third aspect, there is provided an electronic device comprising: one or more processors; and a storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any embodiment of the page processing method.
According to a fourth aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of any embodiment of the page processing method.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings, in which:
FIG. 1 is an exemplary system architecture diagram in which some embodiments of the present application may be applied;
FIG. 2a is a flow chart of one embodiment of a method of processing a page according to the present application;
FIG. 2b is a schematic diagram of a skeleton region of a method of processing a page according to the present application;
FIG. 3 is a schematic diagram of an application scenario of a page processing method according to the present application;
FIG. 4 is a flow chart of yet another embodiment of a method of processing a page according to the present application;
FIG. 5 is a schematic diagram of one embodiment of a processing device for pages according to the present application;
FIG. 6 is a block diagram of an electronic device for implementing a method of processing a page according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of the page processing method or the page processing apparatus of the present application may be applied.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as video-type applications, live applications, instant messaging tools, mailbox clients, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with display screens, including but not limited to smartphones, tablets, electronic book readers, laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (e.g., for providing distributed services) or as a single piece of software or software module. No specific limitation is imposed here.
The server 105 may be a server providing various services, such as a background server providing support for the terminal devices 101, 102, 103. The background server may analyze and process the received data such as multiple candidate pages, and feed back the processing result (for example, the target page) to the terminal device.
It should be noted that, the processing method of the page provided in the embodiment of the present application may be executed by the server 105 or the terminal devices 101, 102, 103, and correspondingly, the processing apparatus of the page may be set in the server 105 or the terminal devices 101, 102, 103.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2a, a flow 200 of one embodiment of a method of processing a page according to the present application is shown. The page processing method comprises the following steps:
Step 201, a plurality of candidate pages are acquired, and display interfaces of the plurality of candidate pages are determined.
In this embodiment, the execution body on which the page processing method runs (for example, the server or a terminal device shown in FIG. 1) may acquire a plurality of candidate pages, and determine a display interface of each candidate page in the plurality of candidate pages.
In practice, the execution body may acquire the plurality of candidate pages in various manners. For example, the execution body may directly acquire a plurality of prestored candidate pages from the local electronic device or from other electronic devices. The execution body may also select a plurality of candidate pages from a designated candidate page set in real time.
The execution body may determine the display interfaces of the candidate pages in various manners. For example, the execution body may directly obtain prestored display interfaces of the plurality of candidate pages. Alternatively, it may take a screenshot of the entire rendered candidate page and use the screenshot as the display interface.
Step 202, extracting a skeleton structure diagram of a display interface, and identifying a skeleton region in the skeleton structure diagram.
In this embodiment, the execution body may extract a skeleton structure diagram of the display interface and identify the skeleton regions in that diagram. The skeleton structure diagram is a structure diagram formed by the regions in which content such as text and pictures is located in a page. That is, the skeleton structure diagram may indicate which regions of the page consist of text, which regions consist of pictures, and how those regions are arranged and combined.
Different areas of a page can display different content, and the skeleton structure diagram of the page correspondingly contains a separate skeleton region for each of those areas. In practice, each skeleton region is an equivalence class in the skeleton structure diagram. The execution body may identify the skeleton regions in various manners. For example, the execution body may divide the skeleton structure diagram into at least one equivalence class and use the at least one equivalence class as the identification result. As another example, the execution body may identify each skeleton region based on the gray levels of the pixels in the skeleton structure diagram, treating pixels with similar gray levels (for example, a gray-level difference smaller than a preset threshold) as pixels of the same skeleton region. In addition, the execution body may perform edge detection on the skeleton structure diagram to obtain the different skeleton regions enclosed by different edges.
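The gray-level approach described above can be sketched as a flood fill over the skeleton structure diagram. This is a minimal illustration only, assuming the diagram is available as a non-empty 2D list of gray levels; the function name `label_skeleton_regions`, the 4-neighbour connectivity, and the default threshold of 10 are assumptions for the example and do not come from the application.

```python
from collections import deque

def label_skeleton_regions(gray, threshold=10):
    """Label 4-connected regions whose neighbouring pixels differ by less than `threshold`.

    `gray` is a non-empty 2D list of gray levels (hypothetical input format).
    Returns (labels, region_count), where labels[r][c] is the region index of pixel (r, c).
    """
    rows, cols = len(gray), len(gray[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # pixel already belongs to a region
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1
                            and abs(gray[ny][nx] - gray[y][x]) < threshold):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label

# A toy diagram with a light upper block and a dark lower block.
skeleton = [
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [50, 50, 50, 50],
    [50, 50, 50, 50],
]
labels, count = label_skeleton_regions(skeleton)
print(count)  # 2
```

On this toy input the two blocks are labelled as two separate regions, mirroring the upper and lower skeleton regions of a page.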
As shown in FIG. 2b, the left image shows the display interface of one candidate page, and the right image shows an upper skeleton region and a lower skeleton region in the skeleton structure diagram of that display interface. The skeleton regions include text elements and image elements.
Step 203, clustering the skeleton areas to generate at least one cluster set.
In this embodiment, the execution body may cluster the skeleton regions, and the clustering result is at least one cluster set. Specifically, the execution body may perform the clustering in various ways, for example with an image clustering algorithm or a K-means clustering algorithm. In practice, the execution body may also perform the clustering as follows: encode the skeleton structure of each candidate page to obtain an encoding feature corresponding to that skeleton structure, and cluster the encoding features to generate a plurality of first-level cluster sets. The execution body may then extract a plurality of elements from each candidate page in a first-level cluster set and obtain the element features of those elements. Finally, the execution body may perform second-level clustering on the pages in the first-level cluster set according to the element features of the elements in each page, so as to generate the plurality of cluster sets. The elements comprise the text and images in the skeleton structure diagram, that is, two element types: text elements and image elements.
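The two-level clustering described above can be illustrated with a small sketch. For brevity, exact-match grouping stands in for a real clustering algorithm such as K-means, and the page representation, `encode_skeleton`, and `element_features` are hypothetical choices made for this example rather than the encoding used by the application.

```python
from collections import defaultdict

def encode_skeleton(page):
    # Hypothetical encoding feature: the kinds of the regions from top to bottom.
    return tuple(region["kind"] for region in page["regions"])

def element_features(page):
    # Hypothetical element feature: total counts of text and image elements.
    texts = sum(region["texts"] for region in page["regions"])
    images = sum(region["images"] for region in page["regions"])
    return (texts, images)

def two_level_cluster(pages):
    """Group pages by skeleton encoding first, then by element features."""
    first_level = defaultdict(list)
    for page in pages:
        first_level[encode_skeleton(page)].append(page)
    clusters = defaultdict(list)
    for code, members in first_level.items():
        # Second level: split each first-level cluster set by element features.
        for page in members:
            clusters[(code, element_features(page))].append(page["id"])
    return dict(clusters)

pages = [
    {"id": "p1", "regions": [{"kind": "image", "texts": 0, "images": 1},
                             {"kind": "text", "texts": 3, "images": 0}]},
    {"id": "p2", "regions": [{"kind": "image", "texts": 0, "images": 1},
                             {"kind": "text", "texts": 3, "images": 0}]},
    {"id": "p3", "regions": [{"kind": "image", "texts": 0, "images": 2},
                             {"kind": "text", "texts": 3, "images": 0}]},
    {"id": "p4", "regions": [{"kind": "text", "texts": 5, "images": 0}]},
]
clusters = two_level_cluster(pages)
print(sorted(clusters.values()))  # [['p1', 'p2'], ['p3'], ['p4']]
```

Pages p1, p2, and p3 share a first-level cluster set because their skeleton encodings match; the second level then separates p3 because its element features differ.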
Step 204, selecting at least one skeleton region from each cluster set of at least one cluster set, and taking a candidate page corresponding to the at least one skeleton region as a target page.
In this embodiment, the execution body may select at least one skeleton region from each of the generated cluster sets, and use the candidate page corresponding to each selected skeleton region as a target page. The target pages may be used for page testing to verify the quality of the candidate pages. Specifically, the candidate page corresponding to a skeleton region is the candidate page for which the display interface containing that skeleton region was determined.
In practice, the execution body may select the at least one skeleton region in various ways. For example, the execution body may randomly select a skeleton region in each cluster set.
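Random selection per cluster set can be sketched as follows. The data layout (each skeleton region carrying a `page_id` field) and the function name `select_target_pages` are assumptions made for illustration.

```python
import random

def select_target_pages(cluster_sets, rng):
    """Pick one skeleton region at random from each cluster set and
    return the ids of the candidate pages those regions belong to."""
    targets = []
    for cluster in cluster_sets:
        region = rng.choice(cluster)
        # A page may be chosen through more than one cluster set; keep it once.
        if region["page_id"] not in targets:
            targets.append(region["page_id"])
    return targets

cluster_sets = [
    [{"page_id": "a"}, {"page_id": "b"}],
    [{"page_id": "c"}],
]
# A seeded generator keeps the sketch reproducible between runs.
print(select_target_pages(cluster_sets, random.Random(0)))
```

Because every cluster set contributes at least one region, every type of skeleton region is represented among the target pages.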
By identifying the skeleton of each page and clustering the skeleton regions, the method provided by this embodiment keeps the detailed content of a page from interfering with the selection of target pages. In addition, this embodiment reduces the task of screening pages to the task of screening skeleton regions, so that a target page is selected for each type of skeleton region, the selected target pages comprehensively cover the various page characteristics, and the recall rate for different page types is improved.
In some optional implementations of the present embodiment, selecting at least one skeleton region from each of the at least one cluster set in step 204 may include: for each cluster set of the at least one cluster set, responsive to determining that a first candidate page of the cluster set includes a skeleton region of a second candidate page based on features of the skeleton region, selecting the skeleton region of the first candidate page as the skeleton region of the at least one skeleton region.
In these optional implementations, when the execution body selects skeleton regions in any cluster set, if it determines that one candidate page of the cluster set includes the skeleton regions of another candidate page, that is, the skeleton regions of the one candidate page are identical to or a superset of the skeleton regions of the other candidate page, the skeleton regions of the one candidate page may be selected as at least part of the selected at least one skeleton region. In practice, the execution body may take the one candidate page as the first candidate page and the other candidate page as the second candidate page.
The number of skeleton regions in the first candidate page may be equal to or greater than the number of skeleton regions in the second candidate page. For example, in one cluster set, the skeleton regions of one candidate page include skeleton regions No. 1, No. 2, and No. 3, and the skeleton regions of another candidate page include skeleton regions No. 1 and No. 3.
In practice, the execution body may determine which skeleton regions a candidate page includes based on the features of the skeleton regions. For example, the features of a skeleton region may include its size and element type. If a skeleton region of one candidate page matches a skeleton region of another candidate page in both size and element type, the two skeleton regions may be considered the same skeleton region, that is, both candidate pages include that skeleton region.
These implementations select the skeleton regions of a candidate page when that page completely includes the skeleton regions of another candidate page, so that, when the number of selected pages is limited, as many skeleton regions as possible are selected, which improves the richness of the skeleton regions in the finally determined target pages.
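The inclusion-based selection can be sketched with set containment. The region feature used here (width, height, element type) follows the size and element type mentioned above, while the dictionary layout and the names `region_key` and `select_covering_pages` are hypothetical.

```python
def region_key(region):
    # Two regions count as the same when size and element type both match.
    return (region["width"], region["height"], region["element_type"])

def select_covering_pages(cluster):
    """cluster maps page id -> list of regions. Keep only pages whose set of
    regions is not already included in the regions of a kept page."""
    keyed = {pid: frozenset(region_key(r) for r in regions)
             for pid, regions in cluster.items()}
    selected = []
    # Visit pages with the most regions first so that a page including
    # another page's regions is kept before the included page is examined.
    for pid in sorted(keyed, key=lambda p: len(keyed[p]), reverse=True):
        if not any(keyed[pid] <= keyed[kept] for kept in selected):
            selected.append(pid)
    return selected

r1 = {"width": 300, "height": 100, "element_type": "image"}
r2 = {"width": 300, "height": 60, "element_type": "text"}
r3 = {"width": 300, "height": 40, "element_type": "text"}
r4 = {"width": 150, "height": 150, "element_type": "image"}
cluster = {"A": [r1, r2, r3], "B": [r1, r3], "C": [r4]}
print(select_covering_pages(cluster))  # ['A', 'C']
```

Page B is skipped because page A already includes regions No. 1 and No. 3, whereas page C contributes a region that no selected page contains.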
In some optional implementations of the present embodiment, selecting at least one skeleton region from each of the plurality of cluster sets in step 204 may include: for each cluster set of the plurality of cluster sets, in response to determining, based on the features of the skeleton regions, that the cluster set contains at least two candidate pages whose skeleton regions are the same but arranged in different orders, selecting the skeleton regions of the at least two candidate pages as skeleton regions of the at least one skeleton region.
In these alternative implementations, if the cluster set contains at least two candidate pages whose skeleton regions are the same but arranged in different orders, the skeleton regions of the at least two candidate pages are selected as at least part of the selected at least one skeleton region. Skeleton regions being the same means that the features of the skeleton regions are the same.
Specifically, the arrangement order of the skeleton regions may refer to a default ordering of the skeleton regions in the skeleton structure diagram, for example one in which the sequence numbers of the skeleton regions increase from top to bottom and from left to right.
For example, three candidate pages may each contain skeleton regions No. 1, No. 2, and No. 3 in a different order, such as No. 1, No. 2, No. 3 for one page and No. 1, No. 3, No. 2 for another. The execution body may select the skeleton regions of each of the three candidate pages as skeleton regions of the at least one skeleton region.
These implementations select skeleton regions arranged in different orders, so that the characteristics of pages whose skeleton regions are ordered differently are captured and more comprehensive page testing can be performed.
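A minimal sketch of the order-based selection follows. It assumes each page is described by an ordered list of region identifiers, keeps one page per distinct ordering (a deduplication step added for this example), and uses the hypothetical name `select_order_variants`.

```python
from collections import defaultdict

def select_order_variants(cluster):
    """cluster maps page id -> ordered list of region identifiers.
    Return pages that share the same regions but arrange them differently."""
    by_region_set = defaultdict(dict)
    for pid, order in cluster.items():
        # Keep the first page seen for each distinct ordering of a region set.
        by_region_set[frozenset(order)].setdefault(tuple(order), pid)
    selected = []
    for orderings in by_region_set.values():
        if len(orderings) >= 2:  # same regions, at least two different orders
            selected.extend(orderings.values())
    return selected

cluster = {
    "p1": [1, 2, 3],
    "p2": [1, 3, 2],
    "p3": [1, 2, 3],  # same order as p1, so it adds nothing new
    "p4": [4],
}
print(select_order_variants(cluster))  # ['p1', 'p2']
```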
In some optional application scenarios of the foregoing implementations, the features of a skeleton region may be obtained as follows: extracting the elements of the skeleton region; and determining the element features of the elements in the skeleton region as the features of the skeleton region, wherein the element features comprise at least one of: element type, element size, element number, and element order.
In these optional application scenarios, the execution body may use the features of the elements in a skeleton region as the features of that skeleton region, which refines the skeleton region features and makes them more comprehensive and accurate.
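As an illustration, the four element features listed above can be collected into a single feature record for a skeleton region. The `Element` class and the `region_features` function are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Element:
    kind: str    # "text" or "image"
    width: int
    height: int

def region_features(elements):
    """Derive skeleton region features from the elements the region contains."""
    return {
        "types": tuple(sorted({e.kind for e in elements})),      # element type
        "sizes": tuple((e.width, e.height) for e in elements),   # element size
        "count": len(elements),                                  # element number
        "order": tuple(e.kind for e in elements),                # element order
    }

elements = [Element("image", 300, 120), Element("text", 300, 40), Element("text", 300, 40)]
features = region_features(elements)
print(features["order"])  # ('image', 'text', 'text')
```

Two skeleton regions can then be compared by comparing their feature records.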
With continued reference to FIG. 3, FIG. 3 is a schematic diagram of an application scenario of the page processing method according to the present embodiment. In the application scenario of FIG. 3, the execution body 301 acquires a plurality of candidate pages 302, for example 50,000 candidate pages, and determines the display interfaces 303 of the plurality of candidate pages 302. The execution body 301 extracts the skeleton structure diagram 304 of each display interface 303, and identifies the skeleton regions 305 in the skeleton structure diagram 304. The execution body 301 clusters the skeleton regions 305 to generate at least one cluster set 306. The execution body 301 selects at least one skeleton region from each cluster set of the at least one cluster set 306, and takes the candidate pages corresponding to the at least one skeleton region as target pages 307, where the number of target pages may be 500.
With further reference to FIG. 4, a flow 400 of yet another embodiment of a method of processing a page is shown. The process 400 includes the steps of:
Step 401, acquiring a candidate page set, and determining an attribute state of a page attribute of a candidate page in the candidate page set.
In this embodiment, an execution body (e.g., a server or a terminal device shown in fig. 1) on which the processing method of a page is executed may acquire a candidate page set composed of candidate pages, and determine an attribute state of a page attribute of a candidate page in the candidate page set.
In practice, the page attributes may include at least one of: account attributes, page types, and page characteristics. Specifically, an account may refer to the account that is logged in on the page, that is, the page is triggered by a user who is logged into that account. The account attributes may include at least one of: the consumption level, account activity, and account grade corresponding to the account. The page types may include a home page, a detail page, a list page, and the like. The page characteristics may include: page quality score, page popularity (search and/or click popularity), page update frequency, and the like.
The attribute state may refer to an attribute value. For example, the attribute value may be "detail page" for the page type, or a specific consumption amount for an account attribute. In addition, when the attribute value is numerical, the attribute states may also correspond to at least two numerical ranges into which the attribute values are divided. For example, consumption above an upper threshold may correspond to the attribute state "high consumption", consumption between a lower threshold and the upper threshold to "medium consumption", and consumption below the lower threshold to "low consumption".
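Dividing a numerical attribute value into ranges can be sketched as follows. The threshold values of 1,000 and 10,000 yuan are assumptions chosen only for illustration, as is the function name `consumption_state`.

```python
def consumption_state(amount, low=1_000, high=10_000):
    """Map a consumption amount (in yuan) to an attribute state.
    The default thresholds are illustrative assumptions."""
    if amount > high:
        return "high consumption"
    if amount >= low:
        return "medium consumption"
    return "low consumption"

for amount in (20_000, 5_000, 300):
    print(amount, consumption_state(amount))
```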
Step 402, for each attribute state in at least two determined attribute states of the page attribute, selecting a candidate page corresponding to the attribute state from the candidate page set.
In this embodiment, the execution body may, for each of at least one page attribute, determine in the candidate page set the candidate pages corresponding to each of at least two determined attribute states of that page attribute. Specifically, a candidate page corresponds to an attribute state when the page attribute of the candidate page is in that attribute state.
Step 403, using the selected candidate pages as a plurality of candidate pages, and determining display interfaces of the plurality of candidate pages.
In this embodiment, the execution body may use all the selected candidate pages as the plurality of candidate pages, and determine display interfaces of the candidate pages.
Step 404, extracting a skeleton structure diagram of the display interface, and identifying a skeleton region in the skeleton structure diagram.
In this embodiment, the execution body may extract a skeleton structure of the display interface. The execution body may identify a skeleton region in the skeleton structure diagram. The skeleton structure diagram refers to a structure diagram formed by areas where contents such as characters and pictures in a page are located.
Step 405, clustering the skeleton region to generate at least one cluster set.
In this embodiment, the execution body may cluster each skeleton region, and the clustering result is a plurality of cluster sets. Specifically, the execution subject may perform clustering in various manners, such as an image clustering algorithm, a k-means clustering algorithm, and the like.
Step 406, selecting at least one skeleton region from each cluster set of the at least one cluster set, and taking a candidate page corresponding to the at least one skeleton region as a target page.
In this embodiment, the executing body may select at least one skeleton area from each of the generated multiple cluster sets, and use a candidate page corresponding to the selected skeleton area as the target page. The target page may be used to conduct page testing to verify the quality of the candidate page. Specifically, the candidate page corresponding to the skeleton region may refer to a candidate page to which the display interface where the skeleton region is located belongs.
This embodiment selects candidate pages corresponding to at least two determined attribute states, so that pages in a variety of attribute states are selected, thereby improving the recall rate for pages of different types.
In some alternative implementations of the present embodiment, step 402 may include: selecting, from the candidate page set, a candidate page corresponding to each determined attribute state of each page attribute.
In these optional implementations, the execution body may select, for each page attribute of the candidate pages, a candidate page corresponding to each determined attribute state, so as to ensure that pages in all attribute states are selected, thereby further improving recall rates for selecting pages of different types.
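Selecting a candidate page for each determined attribute state of each page attribute resembles stratified sampling. The sketch below takes the first page seen for each state; the page dictionaries, attribute names, and the function `select_by_attribute_states` are hypothetical.

```python
def select_by_attribute_states(candidate_pages, attributes):
    """For every attribute, keep the first page seen in each attribute state."""
    selected = {}
    for attr in attributes:
        seen_states = set()
        for page in candidate_pages:
            state = page[attr]
            if state not in seen_states:
                seen_states.add(state)
                # Keyed by page id so a page chosen for two states appears once.
                selected[page["id"]] = page
    return list(selected.values())

candidate_pages = [
    {"id": "a", "page_type": "home", "consumption": "high"},
    {"id": "b", "page_type": "home", "consumption": "low"},
    {"id": "c", "page_type": "detail", "consumption": "high"},
    {"id": "d", "page_type": "detail", "consumption": "low"},
]
chosen = select_by_attribute_states(candidate_pages, ["page_type", "consumption"])
print([page["id"] for page in chosen])  # ['a', 'c', 'b']
```

Every state of both attributes is covered by three pages instead of four, and the selected pages then serve as the plurality of candidate pages for the later steps.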
With further reference to FIG. 5, as an implementation of the methods shown in the foregoing figures, the present application provides an embodiment of a page processing apparatus. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2a and, in addition to the features described below, may include the same or corresponding features and effects as that method embodiment. The apparatus can be applied to various electronic devices.
As shown in fig. 5, the processing apparatus 500 of the page of the present embodiment includes: an acquisition unit 501, an extraction unit 502, a clustering unit 503, and a selection unit 504. Wherein, the acquiring unit 501 is configured to acquire a plurality of candidate pages and determine display interfaces of the plurality of candidate pages; an extraction unit 502 configured to extract a skeleton structure diagram of the display interface, and identify a skeleton region in the skeleton structure diagram; a clustering unit 503 configured to cluster the skeleton regions to generate at least one cluster set; the selecting unit 504 is configured to select at least one skeleton region from each cluster set of the at least one cluster set, and take a candidate page corresponding to the at least one skeleton region as a target page.
In this embodiment, the specific processing of the acquiring unit 501, the extracting unit 502, the clustering unit 503 and the selecting unit 504 of the processing device 500 of the page and the technical effects thereof may refer to the related descriptions of the step 201, the step 202, the step 203 and the step 204 in the corresponding embodiment of fig. 2a, and are not repeated herein.
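How the four units cooperate can be sketched as a simple pipeline object. The class name `PageProcessingApparatus`, the callable-based design, and the toy stand-in functions are assumptions for illustration and say nothing about how the units are actually implemented.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class PageProcessingApparatus:
    acquire: Callable   # acquisition unit: source -> display interfaces
    extract: Callable   # extraction unit: interfaces -> skeleton regions
    cluster: Callable   # clustering unit: regions -> cluster sets
    select: Callable    # selecting unit: cluster sets -> target pages

    def run(self, source):
        interfaces = self.acquire(source)
        regions = self.extract(interfaces)
        cluster_sets = self.cluster(regions)
        return self.select(cluster_sets)

def group_by_region(regions):
    # Toy clustering: regions with the same label fall into one cluster set.
    groups = {}
    for page_id, region in regions:
        groups.setdefault(region, []).append(page_id)
    return groups

apparatus = PageProcessingApparatus(
    # Toy stand-ins: a "display interface" is just a layout string here.
    acquire=lambda src: list(src.items()),
    extract=lambda ifs: [(pid, r) for pid, layout in ifs for r in layout.split("/")],
    cluster=group_by_region,
    select=lambda groups: sorted({members[0] for members in groups.values()}),
)
source = {"p1": "image/text", "p2": "image/text", "p3": "text"}
print(apparatus.run(source))  # ['p1']
```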
In some optional implementations of the present embodiment, the acquiring unit is further configured to perform acquiring the plurality of candidate pages as follows: acquiring a candidate page set, and determining the attribute state of the page attribute of the candidate page in the candidate page set; for each of at least two determined attribute states of the page attribute, selecting a candidate page corresponding to the attribute state from the candidate page set; and taking the selected candidate pages as a plurality of candidate pages.
In some optional implementations of this embodiment, the acquiring unit is further configured to perform, for each of the at least two determined attribute states of the page attribute, the selecting of a candidate page corresponding to the attribute state from the candidate page set as follows: selecting, from the candidate page set, a candidate page corresponding to each determined attribute state of each page attribute.
In some optional implementations of the present embodiment, the selecting unit is further configured to perform selecting the at least one skeleton region from each of the at least one cluster set as follows: for each cluster set of the at least one cluster set, responsive to determining that a first candidate page of the cluster set includes a skeleton region of a second candidate page based on features of the skeleton region, selecting the skeleton region of the first candidate page as the skeleton region of the at least one skeleton region.
In some optional implementations of this embodiment, the selection unit is further configured to select the at least one skeleton region from each cluster set of the at least one cluster set as follows: for each cluster set of the at least one cluster set, in response to determining, based on the features of the skeleton regions, that the cluster set contains at least two candidate pages whose skeleton regions are the same but are arranged in a different order, selecting the skeleton regions of the at least two candidate pages as skeleton regions of the at least one skeleton region.
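The order-sensitive selection may be illustrated as follows, assuming that each candidate page in a cluster set is summarized by an ordered tuple of skeleton region labels. Pages are grouped by region content regardless of order, and one page is kept per distinct arrangement whenever a group contains at least two arrangements. The function name `select_reordered_pages` and the labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def select_reordered_pages(cluster: Dict[str, Tuple[str, ...]]) -> List[str]:
    """Select pages that share the same regions but arrange them differently."""
    # Maps order-independent content -> {arrangement: first page seen with it}.
    by_content: Dict[Tuple[str, ...], Dict[Tuple[str, ...], str]] = defaultdict(dict)
    for page, regions in cluster.items():
        by_content[tuple(sorted(regions))].setdefault(tuple(regions), page)

    selected: List[str] = []
    for arrangements in by_content.values():
        if len(arrangements) >= 2:  # same regions, at least two orders
            selected.extend(arrangements.values())
    return selected


if __name__ == "__main__":
    cluster = {
        "page_a": ("banner", "list", "footer"),
        "page_b": ("list", "banner", "footer"),
        "page_c": ("banner", "list", "footer"),
        "page_d": ("banner",),
    }
    print(select_reordered_pages(cluster))
```

Pages that repeat an arrangement already seen are skipped, since they would add no new layout to the target pages; pages that are not part of any reordered group are left to other selection rules.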
In some optional implementations of this embodiment, the features of the skeleton region are obtained as follows: for the skeleton region, extracting the elements of the skeleton region; and determining element features of the elements in the skeleton region as the features of the skeleton region, wherein the element features comprise at least one of: element type, element size, element number, and element order.
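By way of example only, the four element features named above can be combined into one hashable value per skeleton region. The classes `Element` and `RegionFeature` and the function `region_feature` are hypothetical names introduced for this sketch, and the pixel sizes are illustrative.

```python
from dataclasses import dataclass
from typing import List, NamedTuple, Tuple


@dataclass(frozen=True)
class Element:
    kind: str    # element type, e.g. "image" or "text"
    width: int   # element size in pixels (illustrative unit)
    height: int


class RegionFeature(NamedTuple):
    types: Tuple[str, ...]              # element types, in display order
    sizes: Tuple[Tuple[int, int], ...]  # element sizes, in display order
    count: int                          # number of elements


def region_feature(elements: List[Element]) -> RegionFeature:
    """Build a feature from element type, size, number, and order."""
    return RegionFeature(
        types=tuple(e.kind for e in elements),
        sizes=tuple((e.width, e.height) for e in elements),
        count=len(elements),
    )


if __name__ == "__main__":
    region = [Element("image", 100, 50), Element("text", 100, 20)]
    print(region_feature(region))
```

Because the tuples preserve the order of the elements, two regions containing the same elements in a different order produce different features, whereas identical regions produce equal features that can serve as dictionary keys during clustering.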
According to embodiments of the present application, an electronic device and a readable storage medium are also provided.
Fig. 6 is a block diagram of an electronic device for the page processing method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 6, the electronic device includes: one or more processors 601, a memory 602, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories and multiple types of memory. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 601 is taken as an example in fig. 6.
Memory 602 is a non-transitory computer-readable storage medium provided herein. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method for processing pages provided herein. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to execute the processing method of the page provided by the present application.
The memory 602 is used as a non-transitory computer readable storage medium, and may be used to store a non-transitory software program, a non-transitory computer executable program, and modules, such as program instructions/modules (e.g., the acquisition unit 501, the extraction unit 502, the clustering unit 503, and the selection unit 504 shown in fig. 5) corresponding to the page processing method in the embodiment of the present application. The processor 601 executes various functional applications of the server and data processing, i.e., implements the processing method of the page in the above-described method embodiment, by running non-transitory software programs, instructions, and modules stored in the memory 602.
The memory 602 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created through use of the electronic device for page processing, and the like. In addition, the memory 602 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 602 may optionally include memory located remotely from the processor 601, and such remote memory may be connected to the electronic device for page processing via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for the page processing method may further include an input device 603 and an output device 604. The processor 601, the memory 602, the input device 603, and the output device 604 may be connected by a bus or in other manners; connection by a bus is taken as an example in fig. 6.
The input device 603 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for page processing; examples of the input device include a touch screen, keypad, mouse, trackpad, touchpad, pointer stick, one or more mouse buttons, trackball, joystick, and the like. The output device 604 may include a display device, auxiliary lighting devices (e.g., LEDs), tactile feedback devices (e.g., vibration motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor, which may, for example, be described as: a processor including an acquisition unit, an extraction unit, a clustering unit, and a selection unit. In some cases, the names of these units do not constitute a limitation on the units themselves; for example, the clustering unit may also be described as "a unit that clusters skeleton regions to generate at least one cluster set".
As another aspect, the present application also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments; or may be present alone without being fitted into the device. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: acquiring a plurality of candidate pages and determining display interfaces of the candidate pages; extracting a skeleton structure diagram of a display interface, and identifying a skeleton region in the skeleton structure diagram; clustering the skeleton areas to generate at least one cluster set; and selecting at least one skeleton region from each cluster set of at least one cluster set, and taking a candidate page corresponding to the at least one skeleton region as a target page.
The foregoing description covers only the preferred embodiments of the present application and explains the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the invention referred to in this application is not limited to the specific combinations of the features described above, but is intended to cover other embodiments formed by any combination of the features described above or their equivalents without departing from the spirit of the invention. One example is a technical solution formed by interchanging the above features with technical features having similar functions disclosed in (but not limited to) the present application.

Claims (14)

1. A method of processing a page, the method comprising:
acquiring a plurality of candidate pages, and determining display interfaces of the plurality of candidate pages;
extracting a skeleton structure diagram of the display interface, and identifying a skeleton region in the skeleton structure diagram;
clustering the skeleton areas to generate at least one cluster set;
selecting at least one skeleton region from each cluster set of the at least one cluster set, and taking a candidate page corresponding to the at least one skeleton region as a target page for page testing.
2. The method of claim 1, wherein the acquiring a plurality of candidate pages comprises:
acquiring a candidate page set, and determining the attribute state of the page attribute of a candidate page in the candidate page set;
for each attribute state in at least two determined attribute states of the page attribute, selecting a candidate page corresponding to the attribute state from the candidate page set;
and taking the selected candidate pages as the plurality of candidate pages.
3. The method of claim 2, wherein, for each of the at least two determined attribute states of the page attribute, selecting a candidate page from the candidate page set corresponding to the attribute state, comprises:
and selecting a candidate page corresponding to each determined attribute state of each page attribute from the candidate page set.
4. The method of claim 1, wherein the selecting at least one skeleton region from each cluster set of the at least one cluster set comprises:
for each cluster set of the at least one cluster set, responsive to determining that a first candidate page of the cluster set includes a skeleton region of a second candidate page based on features of the skeleton region, selecting the skeleton region of the first candidate page as the skeleton region of the at least one skeleton region.
5. The method of claim 1, wherein the selecting at least one skeleton region from each cluster set of the at least one cluster set comprises:
for each cluster set of the at least one cluster set, in response to determining, based on the features of the skeleton regions, that the cluster set contains at least two candidate pages whose skeleton regions are the same but are arranged in a different order, selecting the skeleton regions of the at least two candidate pages as skeleton regions of the at least one skeleton region.
6. The method according to claim 4 or 5, wherein the features of the skeleton region are obtained by:
extracting elements of a skeleton region for the skeleton region;
determining an element feature of an element in the skeleton region as a feature of the skeleton region, wherein the element feature comprises at least one of: element type, element size, element number, and element order.
7. A processing apparatus for a page, the apparatus comprising:
an acquisition unit configured to acquire a plurality of candidate pages, and determine display interfaces of the plurality of candidate pages;
an extraction unit for extracting a skeleton structure diagram of the display interface and identifying a skeleton region in the skeleton structure diagram;
a clustering unit configured to cluster the skeleton regions to generate at least one cluster set;
a selection unit configured to select at least one skeleton region from each cluster set of the at least one cluster set, and take a candidate page corresponding to the at least one skeleton region as a target page for page testing.
8. The apparatus of claim 7, wherein the acquisition unit is further configured to acquire the plurality of candidate pages as follows:
acquiring a candidate page set, and determining the attribute state of the page attribute of a candidate page in the candidate page set;
for each attribute state in at least two determined attribute states of the page attribute, selecting a candidate page corresponding to the attribute state from the candidate page set;
and taking the selected candidate pages as the plurality of candidate pages.
9. The apparatus of claim 8, wherein the acquisition unit is further configured to select, for each of the at least two determined attribute states of the page attribute, a candidate page corresponding to the attribute state from the candidate page set as follows:
selecting a candidate page corresponding to each determined attribute state of each page attribute from the candidate page set.
10. The apparatus of claim 7, wherein the selection unit is further configured to perform the selecting at least one skeleton region from each of the at least one set of clusters as follows:
for each cluster set of the at least one cluster set, responsive to determining that a first candidate page of the cluster set includes a skeleton region of a second candidate page based on features of the skeleton region, selecting the skeleton region of the first candidate page as the skeleton region of the at least one skeleton region.
11. The apparatus of claim 7, wherein the selection unit is further configured to perform the selecting at least one skeleton region from each of the at least one set of clusters as follows:
for each cluster set of the at least one cluster set, in response to determining, based on the features of the skeleton regions, that the cluster set contains at least two candidate pages whose skeleton regions are the same but are arranged in a different order, selecting the skeleton regions of the at least two candidate pages as skeleton regions of the at least one skeleton region.
12. The apparatus according to claim 10 or 11, wherein the features of the skeleton region are obtained by:
extracting elements of a skeleton region for the skeleton region;
determining an element feature of an element in the skeleton region as a feature of the skeleton region, wherein the element feature comprises at least one of: element type, element size, element number, and element order.
13. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-6.
14. A computer readable storage medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of any of claims 1-6.
CN202010462451.2A 2020-05-27 2020-05-27 Page processing method and device, electronic equipment and storage medium Active CN111611503B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010462451.2A CN111611503B (en) 2020-05-27 2020-05-27 Page processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111611503A CN111611503A (en) 2020-09-01
CN111611503B true CN111611503B (en) 2023-07-14

Family

ID=72200738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010462451.2A Active CN111611503B (en) 2020-05-27 2020-05-27 Page processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111611503B (en)

Also Published As

Publication number Publication date
CN111611503A (en) 2020-09-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant