CN113538450B - Method and device for generating image

Method and device for generating image

Info

Publication number
CN113538450B
CN113538450B CN202010315358.9A
Authority
CN
China
Prior art keywords
rectangular frame
target image
image
preset
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010315358.9A
Other languages
Chinese (zh)
Other versions
CN113538450A (en)
Inventor
焦阳
杨羿
王建国
李一
陈晓冬
刘林
贺翔
朱延峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010315358.9A priority Critical patent/CN113538450B/en
Priority to EP21163538.8A priority patent/EP3828766A3/en
Priority to US17/207,564 priority patent/US11810333B2/en
Priority to KR1020210037804A priority patent/KR102648760B1/en
Priority to JP2021052215A priority patent/JP7213291B2/en
Publication of CN113538450A publication Critical patent/CN113538450A/en
Application granted granted Critical
Publication of CN113538450B publication Critical patent/CN113538450B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T7/11 Region-based segmentation
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G06F16/951 Indexing; Web crawling techniques
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/203 Drawing of straight lines or curves
    • G06T7/136 Segmentation; Edge detection involving thresholding
    • G06T7/187 Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06V10/10 Image acquisition
    • G06V30/15 Cutting or merging image elements, e.g. region growing, watershed or clustering-based techniques
    • G06V30/18076 Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections, by analysing connectivity, e.g. edge linking, connected component analysis or slices
    • G06T2210/12 Bounding box
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Processing Or Creating Images (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method and a device for generating an image, and relates to the field of computer vision. The specific implementation scheme is as follows: acquiring a screenshot of a web page preloaded by a terminal as a source image; identifying connected domains in the source image, and generating a first circumscribed rectangular frame outside the outline of each connected domain; if the distance between the connected domains is smaller than a preset distance threshold, merging the connected domains, and generating a second circumscribed rectangular frame outside the outline of the merged connected domain; and generating a target image based on the nesting relationship between the first circumscribed rectangular frame and the second circumscribed rectangular frame and the pictures in the first circumscribed rectangular frame. By identifying and merging the connected domains in the source image, the first and second circumscribed rectangular frames are generated; the nesting relationship between them characterizes the spatial relationship between the materials in the web page, so that the spatial relationship between the materials in the source image can be represented in the generated target image.

Description

Method and device for generating image
Technical Field
The present application relates to the technical field of computers, and in particular to the field of computer vision, and discloses a method and a device for generating an image.
Background
Text and pictures in HTML (Hypertext Markup Language) web pages often contain important information and are very valuable material. In the related art, the materials to be extracted are determined programmatically based on the files of the HTML web page, and the files corresponding to the materials to be extracted are then downloaded from the server to obtain the materials in the HTML web page.
Disclosure of Invention
The embodiment of the application provides a method, a device, equipment and a storage medium for generating an image.
According to a first aspect, there is provided a method for generating an image, the method comprising: acquiring a screenshot of a web page preloaded by a terminal as a source image; identifying connected domains in the source image, and generating a first circumscribed rectangular frame outside the outline of each connected domain; if the distance between the connected domains is smaller than a preset distance threshold, merging the connected domains, and generating a second circumscribed rectangular frame outside the outline of the merged connected domain; and generating a target image based on the nesting relationship between the first circumscribed rectangular frame and the second circumscribed rectangular frame and the pictures in the first circumscribed rectangular frame.
According to a second aspect, there is provided an apparatus for generating an image, the apparatus comprising: an image acquisition module configured to acquire a screenshot of a web page preloaded by the terminal as a source image; a first generation module configured to identify connected domains in the source image and generate a first circumscribed rectangular frame outside the outline of each connected domain; a second generation module configured to merge the connected domains if the distance between the connected domains is smaller than a preset distance threshold, and generate a second circumscribed rectangular frame outside the outline of the merged connected domain; and an image generation module configured to generate a target image based on the nesting relationship between the first circumscribed rectangular frame and the second circumscribed rectangular frame and the pictures in the first circumscribed rectangular frame.
The embodiments of the present application solve the problem that, in the related art, the spatial relationship between the materials extracted from a web page cannot be represented. By identifying and merging the connected domains in the source image, the first and second circumscribed rectangular frames are generated; the nesting relationship between them characterizes the spatial relationship between the materials in the web page, so that the spatial relationship between the materials in the source image can be represented in the generated target image.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for better understanding of the present solution and do not constitute a limitation of the present application. Wherein:
FIG. 1 is an exemplary system architecture diagram in which embodiments of the present application may be applied;
FIG. 2 is a schematic diagram of a first embodiment of a method for generating an image according to an embodiment of the present application;
FIG. 3 is a schematic illustration of an application scenario of a method for generating an image according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a second embodiment of a method for generating an image according to an embodiment of the present application;
FIG. 5 is a block diagram of an apparatus for generating an image according to an embodiment of the present application;
FIG. 6 is a block diagram of an electronic device for implementing the method for generating an image according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application are described below with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding and should be considered merely exemplary. Those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Likewise, descriptions of well-known functions and constructions are omitted from the following description for clarity and conciseness.
Fig. 1 illustrates an exemplary system architecture 100 to which a method for generating an image or an apparatus for generating an image of an embodiment of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired or wireless communication links, or fiber optic cables.
The user may interact with the server 105 through the network 104 using the terminal devices 101, 102, 103 to receive or send data, etc., for example, the user may input a web page address that the user wants to browse into the terminal device, the terminal device obtains data from the server 105 through the network 104, then generates a web page by the terminal after steps of parsing, rendering, etc. based on the obtained data, and finally presents the web page to the user.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices capable of data interaction with a server and with other terminals, including but not limited to smartphones, tablet computers, and desktop computers. When they are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, for providing distributed services) or as a single piece of software or software module. This is not specifically limited herein.
The server 105 may be a server providing data processing services, for example, sending corresponding data to the terminal devices according to access requests sent by the terminal devices 101, 102, 103, for the terminal devices to generate web pages to be accessed.
It should be noted that the method for generating an image provided by the embodiments of the present application may be performed by the server 105, and accordingly the apparatus for generating an image may be provided in the server 105. In that case, the server 105 acquires, through the network 104, the web page information preloaded on the terminal devices 101, 102, 103, then generates a source image from the acquired web page information and extracts material from the source image. The method may also be performed by a terminal device, and accordingly the apparatus may be provided in that terminal device; the terminal device is communicatively connected to other terminal devices through the network 104, acquires the web page information preloaded on those devices, then generates a source image from the acquired web page information and extracts material from it. This is not limited herein.
With continued reference to fig. 2 (a), which shows a flowchart of a first embodiment of a method for generating an image according to the present disclosure, the method comprises the following steps:
step S201, acquiring a screenshot of a webpage preloaded by a terminal as a source image.
In this embodiment, the objects to be extracted are the materials contained in the web page presented by the terminal, including text materials and picture materials. The source image contains not only the materials in the web page but also the spatial relationships between those materials.
As an example, the source image may be generated as follows: the execution body obtains the network address of the web page preloaded on the terminal, accesses that address to obtain the same web page, takes a screenshot of the obtained web page, and uses the screenshot as the source image. For example, the execution body may perform these steps with a screenshot tool, or it may directly receive, through the network, a screenshot of the web page sent by the terminal device; the present application is not limited in this regard.
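As a minimal sketch of this acquisition step, the following Python snippet captures a screenshot of a preloaded web page with the Selenium WebDriver API. The patent does not prescribe a particular screenshot tool, and the headless-browser setup, window size, and URL here are illustrative assumptions.
```python
# Hedged sketch: capture a screenshot of a preloaded web page to use as the
# source image. Selenium is one possible tool; the patent names none.
from selenium import webdriver

def capture_source_image(url: str, out_path: str = "source.png") -> str:
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")               # render without a visible window
    options.add_argument("--window-size=1280,2000")  # illustrative viewport size
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)                   # load (preload) the web page
        driver.save_screenshot(out_path)  # the screenshot becomes the source image
    finally:
        driver.quit()
    return out_path

# capture_source_image("https://example.com")  # illustrative URL
```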
Step S202, identifying connected domains in the source image, and generating a first circumscribed rectangular frame outside the outline of each connected domain.
A connected domain, also called a connected region, is an image region composed of foreground pixels that have the same pixel value and are adjacent in position. Each connected domain in an image can be identified through connected domain analysis, and a circumscribed rectangular frame can be generated outside its outline. Connected domain analysis is a conventional technique in the field of image processing; for example, the Two-Pass algorithm or OCR (Optical Character Recognition) algorithms can realize this function, and the present application is not limited in this regard.
In this embodiment, the first circumscribed rectangular frame is used to mark the smallest connected domains in the source image. For example, the execution body (such as the terminal shown in fig. 1) recognizes the connected domains in the source image through an OCR algorithm. If a passage of text exists in the source image, the region of each line of text is recognized as one connected domain, and a first circumscribed rectangular frame is generated outside the outline of each such region; the execution body thus recognizes multiple connected domains in the text image and generates multiple first circumscribed rectangular frames.
It should be noted that the recognition granularity of the smallest connected domains may be adjusted according to actual requirements. For example, in the above example, the whole passage of text may instead be recognized as one connected domain, and accordingly one first circumscribed rectangular frame is generated for the region of that passage in the source image. The present application is not limited in this regard.
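The following Python sketch illustrates step S202 with OpenCV's connected component analysis; the binarization threshold and the minimum component area are illustrative assumptions, and OCR-based detection (as in the example above) could be substituted.
```python
# Sketch of step S202: binarize the source image, label connected domains, and
# generate a first circumscribed rectangular frame (bounding box) for each.
import cv2

def first_bounding_boxes(image_path: str, min_area: int = 20):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Treat pixels darker than an (assumed) white page background as foreground.
    _, binary = cv2.threshold(gray, 250, 255, cv2.THRESH_BINARY_INV)
    num, _, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)
    boxes = []
    for i in range(1, num):          # label 0 is the background
        x, y, w, h, area = stats[i]  # stats row: left, top, width, height, area
        if area >= min_area:         # drop speckle noise
            boxes.append((x, y, w, h))
    return boxes
```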
Step S203, if the distance between the connected domains is smaller than a preset distance threshold, merging the connected domains, and generating a second circumscribed rectangular frame outside the outline of the merged connected domain.
In this embodiment, the second circumscribed rectangular frame is used to characterize circumscribed rectangular frames that have a nesting relationship. By merging the connected domains, second circumscribed rectangular frames with multi-layer nesting relationships can be obtained. The distance between connected domains characterizes the positional relationship, in the source image, between the materials contained in different connected domains.
Further description is given with reference to fig. 2 (b), which shows a specific example of this embodiment. Connected domains 1, 2, 3, 4 (201, 202, 203, 204 in fig. 2 (b)) correspond to first circumscribed rectangular frames a, b, c, d (205, 206, 207, 208 in fig. 2 (b)), respectively. The execution body performs step S203: it merges connected domains 1 and 2 to obtain connected domain 5 (209 in fig. 2 (b)), and merges connected domains 3 and 4 to obtain connected domain 6 (210 in fig. 2 (b)); it then generates a second circumscribed rectangular frame e (211 in fig. 2 (b)) outside the outline of connected domain 5, and a second circumscribed rectangular frame f (212 in fig. 2 (b)) outside the outline of connected domain 6. If the distance between connected domains 5 and 6 is still smaller than the preset distance threshold, the execution body continues to merge them to obtain connected domain 7 (213 in fig. 2 (b)), and generates a second circumscribed rectangular frame g (214 in fig. 2 (b)) outside its outline. Finally, second circumscribed rectangular frames e, f and g are obtained, where frame g contains frames e and f, frame e contains first circumscribed rectangular frames a and b, and frame f contains first circumscribed rectangular frames c and d. These containment relationships constitute the nesting relationship between the first and second circumscribed rectangular frames, and can be used to characterize the spatial relationship between the materials in the connected domains.
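A plain-Python sketch of the merging rule in step S203 follows; the gap measure and the threshold value are illustrative assumptions, since the patent only requires that connected domains closer than a preset distance threshold be merged.
```python
# Sketch of step S203: repeatedly merge rectangles whose gap is below a preset
# distance threshold; each merge yields a second circumscribed rectangular
# frame enclosing the merged connected domain.

def box_gap(a, b):
    """Gap in pixels between two (x, y, w, h) boxes; 0 if they touch or overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    dx = max(bx - (ax + aw), ax - (bx + bw), 0)
    dy = max(by - (ay + ah), ay - (by + bh), 0)
    return max(dx, dy)

def merge_boxes(boxes, dist_threshold=10):
    boxes = list(boxes)
    merged = True
    while merged:  # keep merging until no pair is close enough
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if box_gap(boxes[i], boxes[j]) < dist_threshold:
                    ax, ay, aw, ah = boxes[i]
                    bx, by, bw, bh = boxes[j]
                    x, y = min(ax, bx), min(ay, by)
                    w = max(ax + aw, bx + bw) - x
                    h = max(ay + ah, by + bh) - y
                    boxes[i] = (x, y, w, h)  # circumscribes the merged domain
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes
```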
Step S204, generating a target image based on the nesting relationship between the first circumscribed rectangular frame and the second circumscribed rectangular frame and the pictures in the first circumscribed rectangular frame.
In this embodiment, the pictures in the first circumscribed rectangular frames are the materials to be extracted from the source image and serve as the basic elements for generating the target image, while the nesting relationship between the first and second circumscribed rectangular frames characterizes the spatial relationship between the materials in the source image.
The execution body combines the pictures in the first circumscribed rectangular frames according to the nesting relationship obtained in step S203; the resulting image is the target image.
In one specific example, the target image may be generated as follows: the execution body may represent the first and second circumscribed rectangular frames as rect structures, each rect storing the coordinates of the upper-left corner of a circumscribed rectangular frame in the source image together with the frame's length and width, so that each rect represents one first or second circumscribed rectangular frame. The execution body then takes the rect containing the largest number of other rects as the root node and builds a rect tree according to the nesting relationship between the first and second circumscribed rectangular frames, where each node represents one first or second circumscribed rectangular frame and the bottom-level nodes represent the first circumscribed rectangular frames in the source image. Finally, the execution body combines the pictures in the first circumscribed rectangular frames according to this tree structure to obtain the target image.
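The rect tree described above might be built as in the following sketch. The node structure and the use of rectangle area to order insertion are assumptions; under full nesting, the largest rectangle coincides with the rect that contains the most other rects.
```python
# Sketch of the rect tree: each node stores a rectangle (x, y, w, h) with the
# upper-left corner plus width and height; children are rectangles nested
# inside their parent. Assumes every rect is nested under one outermost rect.

class RectNode:
    def __init__(self, rect):
        self.rect = rect  # (x, y, w, h)
        self.children = []

def contains(outer, inner):
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh

def build_rect_tree(rects):
    # Insert larger rectangles first so containers exist before their contents.
    nodes = [RectNode(r) for r in sorted(rects, key=lambda r: r[2] * r[3], reverse=True)]
    root = nodes[0]
    for node in nodes[1:]:
        parent = root
        while True:  # descend to the smallest rectangle still containing this one
            nxt = next((c for c in parent.children if contains(c.rect, node.rect)), None)
            if nxt is None:
                break
            parent = nxt
        parent.children.append(node)
    return root  # leaves correspond to first circumscribed rectangular frames
```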
With continued reference to fig. 3, fig. 3 shows a schematic diagram of an application scenario of the method for generating an image according to the present disclosure. In this application scenario, the execution body 306 may be a terminal device or a server. It obtains a screenshot 301 of a web page preloaded on a terminal 305 through the network, identifies the connected domains in the screenshot and obtains the first circumscribed rectangular frames (302 in fig. 3), then merges the connected domains whose distance is smaller than the preset distance threshold and obtains the second circumscribed rectangular frames (303 in fig. 3), and finally combines the pictures in the first circumscribed rectangular frames into a target image 304 based on the nesting relationship between the first and second circumscribed rectangular frames.
In the method for generating an image of this embodiment, the first and second circumscribed rectangular frames are generated by identifying and merging the connected domains in the source image; the nesting relationship between them characterizes the spatial relationship between the materials in the web page, so that the spatial relationship between the materials in the source image can be represented in the generated target image.
With continued reference to fig. 4, which shows a flowchart of a second embodiment of a method for generating an image according to the present disclosure, the method comprises the following steps:
step S401, acquiring a screenshot of a webpage preloaded by a terminal as a source image. This step corresponds to the aforementioned step S201, and will not be described here again.
Step S402, identifying connected domains in the source image, and generating a first circumscribed rectangular frame outside the outline of each connected domain. This step corresponds to the aforementioned step S202, and will not be described here again.
Step S403, if the distance between the connected domains is smaller than the preset distance threshold, merging the connected domains, and generating a second circumscribed rectangular frame outside the outline of the merged connected domain. This step corresponds to the aforementioned step S203, and will not be described here again.
Step S404, deleting the first circumscribed rectangular frame if the definition of the picture in the first circumscribed rectangular frame is smaller than a preset definition threshold. This avoids extracting low-definition materials from the source image and ensures the quality of the generated target image.
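The patent does not specify how definition is measured; variance of the Laplacian is one common sharpness proxy, used in the hedged sketch below with an assumed threshold value.
```python
# Sketch of step S404: keep a first-rectangle picture only if its sharpness
# (variance of the Laplacian, an assumed metric) meets a preset threshold.
import cv2

def is_sharp(crop, threshold: float = 100.0) -> bool:
    gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var() >= threshold
```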
Step S405, deleting the first circumscribed rectangular frames located in preset regions of the source image based on the position of each first circumscribed rectangular frame in the source image.
In this embodiment, the preset regions are the regions of the source image where the less important materials are located, for example the bottom and top regions; in general, a web page places texts or pictures of lower importance (for example, advertisements) in these two regions. A first circumscribed rectangular frame marks the position and area of a material to be extracted, so that the execution body can extract the image in that area from the source image, i.e., complete the step of extracting material from the source image. Therefore, deleting a first circumscribed rectangular frame means that the image within it is not extracted.
By deleting the first circumscribed rectangular frames in the preset regions, the execution body filters out low-value materials, reduces the amount of computation, and prevents the generated target image from containing low-value materials.
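A minimal sketch of this positional filter follows; the top and bottom bands and the band-height ratio are illustrative assumptions standing in for the preset regions.
```python
# Sketch of step S405: drop first circumscribed rectangular frames that lie
# entirely inside preset low-importance regions (here, top and bottom bands).

def filter_by_position(boxes, image_height, band_ratio=0.1):
    top = image_height * band_ratio
    bottom = image_height * (1 - band_ratio)
    kept = []
    for x, y, w, h in boxes:
        if y + h <= top or y >= bottom:  # entirely inside a preset region
            continue                     # deleted: its picture is not extracted
        kept.append((x, y, w, h))
    return kept
```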
Step S406, recognizing the pictures in the first circumscribed rectangular frames, and obtaining a recognition result corresponding to the picture content in each first circumscribed rectangular frame.
In this embodiment, the pictures in the first circumscribed rectangular frames include text-material pictures and image-material pictures, some of which may be low-value materials: for example, a text-material picture may be an advertising slogan in the web page, and an image-material picture may be a logo image or a key image in the web page; such materials contain little useful information and are therefore of low value. By recognizing the picture in each first circumscribed rectangular frame, a recognition result corresponding to the picture content is obtained, which can be used to decide whether that picture needs to be filtered out. For example, the execution body may input the source image into a convolutional neural network model to obtain the recognition result for the picture in each first circumscribed rectangular frame, where the result may be of various types such as text, logo image, advertisement, or key image.
Step S407, deleting the first circumscribed rectangular frames meeting preset conditions based on the recognition results. The preset conditions can be set according to actual requirements, so as to remove unwanted materials and retain valuable ones.
In this embodiment, the materials to be extracted include text materials and image materials. For example, the preset conditions may specify logo images, key images and advertisements; if a recognition result is one of these three types, the execution body deletes the corresponding first circumscribed rectangular frame, so that the content of that image region is not included when the target image is subsequently generated. This filters the content extracted from the source image and prevents low-value materials from being added to the generated target image.
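Steps S406 and S407 might be wired together as in the sketch below, where classify_crop is a hypothetical stand-in for the convolutional neural network mentioned above, and the label set used as the preset condition is an illustrative example.
```python
# Sketch of steps S406-S407: obtain a recognition result for the picture in
# each first circumscribed rectangular frame, then delete frames whose result
# meets the preset condition. classify_crop is hypothetical.

FILTERED_LABELS = {"logo", "key", "advertisement"}  # example preset condition

def filter_by_content(boxes, crops, classify_crop):
    kept, removed = [], []
    for box, crop in zip(boxes, crops):
        label = classify_crop(crop)  # e.g. "text", "logo", "advertisement", "key"
        (removed if label in FILTERED_LABELS else kept).append((box, label))
    return kept, removed  # removed frames can still be stored elsewhere (S407's option)
```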
In some optional implementations of this embodiment, before deleting the first circumscribed rectangular frames meeting the preset conditions, the method may further include: storing the pictures in the first circumscribed rectangular frames in preset locations based on their recognition results. In practical application scenarios, some pictures in the source image, although not needed for the target image, can serve other purposes: for example, logo images in the source image can be used for business data analysis of the web page, and key images can be used for analyzing the web page's interactive functions. The execution body may therefore store the recognized logo images and key images in their corresponding storage locations to facilitate subsequent applications.
Step S408, combining the pictures in the first circumscribed rectangular frames into an initial target image based on the nesting relationship between the first and second circumscribed rectangular frames. This differs from the target-image generation in step S204 in that, in this embodiment, the image combined from the pictures in the first circumscribed rectangular frames serves as an initial target image, and the target image is obtained after the subsequent steps.
Step S409, determining the core areas in the initial target image, where a core area is an area of the initial target image that includes a preset target.
In this embodiment, the preset target represents material containing key information in the initial target image, and includes at least one of the following: an image containing a face, and dense text. As an example, the execution body may use a saliency detection algorithm to identify, in the initial target image, the areas where face images and dense text are located, i.e., the core areas of the initial target image. Note that the number of core areas may be one or more, determined by the number of face-image areas and text-dense areas in the initial target image.
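As one concrete stand-in for the face half of this detection (the patent only mentions a saliency detection algorithm in general), the sketch below uses OpenCV's stock Haar cascade; the cascade choice and its parameters are assumptions.
```python
# Sketch of step S409: locate face areas in the initial target image as
# candidate core areas, using OpenCV's bundled frontal-face Haar cascade.
import cv2

def face_core_areas(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    # Each detection (x, y, w, h) marks one core area; there may be several.
    return [tuple(f) for f in cascade.detectMultiScale(gray, 1.1, 5)]
```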
Step S410, dividing the initial target image based on a preset cutting proportion and a preset size to obtain the pictures of the segmented core areas.
In this embodiment, the execution body may preset the cutting proportion and size according to actual requirements, divide the initial target image into multiple pictures of uniform cutting proportion and size, and delete the pictures outside the core areas, thereby obtaining the pictures of the segmented core areas. For example, when the initial target image includes multiple text-dense areas and multiple face-image areas, the execution body may obtain multiple core-area pictures from the division, and the other pictures not in core areas may be deleted after the division.
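The division step might look like the following sketch, which tiles the initial target image at a fixed preset size and keeps only the tiles that intersect a core area; the tile size, the overlap test, and discarding the right and bottom remainders are illustrative assumptions.
```python
# Sketch of step S410: tile the initial target image into equal-size crops and
# keep only tiles that intersect a core area; the rest are deleted.

def overlaps(a, b):
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def split_core_tiles(image, core_areas, tile_w=300, tile_h=300):
    h, w = image.shape[:2]
    tiles = []
    for y in range(0, h - tile_h + 1, tile_h):
        for x in range(0, w - tile_w + 1, tile_w):
            if any(overlaps((x, y, tile_w, tile_h), r) for r in core_areas):
                tiles.append(image[y:y + tile_h, x:x + tile_w])
    return tiles
```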
Step S411, aggregating the pictures of the segmented core areas based on their feature information to obtain the target image. The feature information includes at least one of the following: the size, aspect ratio, and composition attribute of a picture.
In this embodiment, the composition attribute of a picture is either text or image, and characterizes whether the material contained in the picture is text or an image.
Based on the feature information of the pictures of the segmented core areas obtained in step S410, the execution body may aggregate these pictures according to preset rules to obtain the target image. For example, core-area pictures of the same size whose composition attribute is text can be spliced together, so that the text in the two areas is aggregated into one continuous passage, preserving continuity between text materials. As another example, multiple core-area pictures whose composition attribute is image and which share the same aspect ratio and size may be aggregated in one area, to highlight the contrast and association between the image materials.
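The aggregation rule might be sketched as follows: pictures are grouped by composition attribute, width, and rounded aspect ratio, and each group is stacked into one canvas. The grouping key and the vertical layout are assumptions; the patent only requires aggregation based on the listed feature information.
```python
# Sketch of step S411: group segmented core-area pictures by their feature
# information and stack each group into one region of the target image.
import numpy as np

def aggregate(pictures, attrs):
    """pictures: HxWx3 arrays; attrs: 'text' or 'image' per picture."""
    groups = {}
    for pic, attr in zip(pictures, attrs):
        h, w = pic.shape[:2]
        key = (attr, w, round(w / h, 1))  # composition attribute, size, aspect
        groups.setdefault(key, []).append(pic)
    # e.g. same-size text crops stack so their text reads as one passage
    return [np.vstack(group) for group in groups.values()]
```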
As can be seen from fig. 4, compared with the first embodiment shown in fig. 2, the second embodiment adds the steps of generating an initial target image according to the nesting relationship, identifying its core areas, then segmenting and aggregating the initial target image, and filtering the material extracted from the source image according to preset rules. Segmenting and aggregating the initial target image allows important materials to be further extracted from it, and filtering the extracted material according to preset rules removes the low-value materials in the source image and keeps them out of the target image, thereby improving the quality of the materials contained in the generated target image.
Fig. 5 shows a block diagram of an apparatus for generating an image according to the method disclosed herein. The apparatus includes: an image acquisition module 501 configured to acquire a screenshot of a web page preloaded by a terminal as a source image; a first generation module 502 configured to identify connected domains in the source image and generate a first circumscribed rectangular frame outside the outline of each connected domain; a second generation module 503 configured to merge the connected domains if the distance between the connected domains is smaller than a preset distance threshold, and generate a second circumscribed rectangular frame outside the outline of the merged connected domain; and an image generation module 504 configured to generate a target image based on the nesting relationship between the first and second circumscribed rectangular frames and the pictures in the first circumscribed rectangular frames.
In this embodiment, the image generation module 504 includes: an initial image module configured to combine the pictures in the first circumscribed rectangular frames into an initial target image based on the nesting relationship between the first and second circumscribed rectangular frames; a region identification module configured to determine the core areas of the initial target image, where a core area is an area of the initial target image that includes a preset target; an image segmentation module configured to divide the initial target image based on a preset cutting proportion and a preset size to obtain the pictures of the segmented core areas; and an image aggregation module configured to aggregate the pictures of the segmented core areas based on their feature information to obtain the target image, where the feature information includes at least one of the following: the size, aspect ratio, and composition attribute of a picture.
In this embodiment, the apparatus further includes a picture recognition module configured to perform, before the initial target image is generated, the following steps: recognizing the pictures in the first circumscribed rectangular frames to obtain recognition results corresponding to their content; and deleting the first circumscribed rectangular frames meeting preset conditions based on the recognition results.
In this embodiment, the picture recognition module is further configured to: before deleting the first circumscribed rectangular frames meeting the preset conditions, store the pictures in the first circumscribed rectangular frames in preset locations based on their recognition results.
In this embodiment, the apparatus further includes a position detection module configured to: before the initial target image is generated, delete the first circumscribed rectangular frames located in preset regions of the source image based on their positions in the source image.
In this embodiment, the apparatus further includes a sharpness detection module configured to: before the initial target image is generated, delete a first circumscribed rectangular frame if the definition of the picture in it is smaller than a preset definition threshold.
According to embodiments of the present application, an electronic device and a readable storage medium are also provided.
Fig. 6 is a block diagram of an electronic device for implementing the method for generating an image according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 6, the electronic device includes: one or more processors 601, a memory 602, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device (such as a display device coupled to an interface). In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 601 is illustrated in fig. 6.
The memory 602 is a non-transitory computer-readable storage medium provided herein. The memory stores instructions executable by the at least one processor, so that the at least one processor performs the method for generating an image provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the method for generating an image provided herein.
The memory 602, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for generating an image in the embodiments of the present application (e.g., the image acquisition module 501, the first generation module 502, the second generation module 503, and the image generation module 504 shown in fig. 5). The processor 601 performs the various functional applications of the server and data processing, i.e., implements the method for generating an image in the above method embodiments, by running the non-transitory software programs, instructions, and modules stored in the memory 602.
The memory 602 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data created according to the use of the electronic device for generating an image, and the like. In addition, the memory 602 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 602 may optionally include memory located remotely from the processor 601, and such remote memory may be connected through a network to the electronic device for generating an image. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for implementing the method for generating an image may further include: an input device 603 and an output device 604. The processor 601, the memory 602, the input device 603 and the output device 604 may be connected by a bus or in other ways; connection by a bus is taken as an example in fig. 6.
The input device 603 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for generating an image; examples include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a trackball, and a joystick. The output device 604 may include a display device, auxiliary lighting devices (e.g., LEDs), tactile feedback devices (e.g., vibration motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, and which can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solution of the embodiments of the present application, the first and second circumscribed rectangular frames are generated by identifying and merging the connected domains in the source image; the nesting relationship between them characterizes the spatial relationship between the materials in the web page, so that the spatial relationship between the materials in the source image can be represented in the generated target image.
It should be understood that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application are intended to be included within the scope of the present application.

Claims (12)

1. A method for generating an image, comprising:
acquiring a screenshot of a webpage preloaded by a terminal as a source image;
identifying connected domains in the source image, and generating a first circumscribed rectangular frame outside the outline of each connected domain;
if the distance between the connected domains is smaller than a preset distance threshold, merging the connected domains, and generating a second circumscribed rectangular frame outside the outline of the merged connected domain;
combining the pictures in each first circumscribed rectangular frame into an initial target image based on the nesting relationship between the first circumscribed rectangular frame and the second circumscribed rectangular frame;
determining a core area in the initial target image, wherein the core area in the initial target image is an area comprising a preset target in the initial target image;
dividing the initial target image based on a preset cutting proportion and a preset size to obtain pictures of the segmented core areas;
based on the characteristic information of the pictures of the segmented core areas, aggregating the pictures of the segmented core areas to obtain the target image, wherein the characteristic information at least comprises one of the following: size, aspect ratio, and composition properties of the picture.
2. The method of claim 1, wherein prior to generating the initial target image, the method further comprises:
identifying the picture in the first circumscribed rectangular frame, and obtaining a recognition result corresponding to the picture content in the first circumscribed rectangular frame;
and deleting the first circumscribed rectangular frame meeting the preset condition based on the recognition result.
3. The method of claim 2, wherein before deleting the first circumscribed rectangular frame meeting the preset condition, the method further comprises:
storing the picture in the first circumscribed rectangular frame corresponding to the recognition result into a preset position based on the recognition result.
4. The method of claim 2, wherein prior to generating the initial target image, the method further comprises:
and deleting the first circumscribed rectangular frame in a preset area in the source image based on the position of the first circumscribed rectangular frame in the source image.
5. The method of one of claims 1 to 4, wherein, prior to generating the initial target image, the method further comprises:
and deleting the first circumscribed rectangular frame if the definition of the picture in the first circumscribed rectangular frame is smaller than a preset definition threshold.
6. An apparatus for generating an image, comprising:
an image acquisition module configured to acquire a screenshot of a web page preloaded by a terminal as a source image;
a first generation module configured to identify connected domains in the source image and generate a first circumscribed rectangular frame outside the outline of each connected domain;
a second generation module configured to merge the connected domains if the distance between the connected domains is smaller than a preset distance threshold, and generate a second circumscribed rectangular frame outside the outline of the merged connected domain;
an image generation module configured to combine the pictures in each first circumscribed rectangular frame into an initial target image based on the nesting relationship between the first circumscribed rectangular frame and the second circumscribed rectangular frame; determine a core area in the initial target image, wherein the core area in the initial target image is an area comprising a preset target in the initial target image; divide the initial target image based on a preset cutting proportion and a preset size to obtain pictures of the segmented core areas; and aggregate the pictures of the segmented core areas based on the characteristic information of the pictures of the segmented core areas to obtain the target image, wherein the characteristic information at least comprises one of the following: size, aspect ratio, and composition properties of the picture.
7. The apparatus of claim 6, wherein the apparatus further comprises a picture recognition module configured to perform, prior to generating the initial target image, the following steps:
identifying the picture in the first circumscribed rectangular frame, and obtaining a recognition result corresponding to the picture content in the first circumscribed rectangular frame;
and deleting the first circumscribed rectangular frame meeting the preset condition based on the recognition result.
8. The apparatus of claim 7, wherein the picture recognition module is further configured to:
store, before deleting the first circumscribed rectangular frame meeting the preset condition, the picture in the first circumscribed rectangular frame corresponding to the recognition result into a preset position based on the recognition result.
9. The apparatus of claim 7, wherein the apparatus further comprises a position detection module configured to:
and deleting the first circumscribed rectangle frame in a preset area in the source image based on the position of the first circumscribed rectangle frame in the source image before the initial target image is generated.
10. The apparatus according to one of claims 6 to 9, wherein the apparatus further comprises a sharpness detection module configured to:
and before the initial target image is generated, deleting the first circumscribed rectangular frame if the definition of the picture in the first circumscribed rectangular frame is smaller than a preset definition threshold.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-5.
CN202010315358.9A 2020-04-21 2020-04-21 Method and device for generating image Active CN113538450B (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
CN202010315358.9A CN113538450B (en) 2020-04-21 2020-04-21 Method and device for generating image
EP21163538.8A EP3828766A3 (en) 2020-04-21 2021-03-18 Method, apparatus, sotrage medium and program for generating image
US17/207,564 US11810333B2 (en) 2020-04-21 2021-03-19 Method and apparatus for generating image of webpage content
KR1020210037804A KR102648760B1 (en) 2020-04-21 2021-03-24 Method and apparatus for generating images
JP2021052215A JP7213291B2 (en) 2020-04-21 2021-03-25 Method and apparatus for generating images

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010315358.9A CN113538450B (en) 2020-04-21 2020-04-21 Method and device for generating image

Publications (2)

Publication Number Publication Date
CN113538450A CN113538450A (en) 2021-10-22
CN113538450B true CN113538450B (en) 2023-07-21

Family

ID=75108280

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010315358.9A Active CN113538450B (en) 2020-04-21 2020-04-21 Method and device for generating image

Country Status (5)

Country Link
US (1) US11810333B2 (en)
EP (1) EP3828766A3 (en)
JP (1) JP7213291B2 (en)
KR (1) KR102648760B1 (en)
CN (1) CN113538450B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102686542B1 (en) 2021-11-22 2024-07-19 주식회사 신세계아이앤씨 A banner production management system that can automatically produce and manage product banners
CN114549573A (en) * 2022-02-22 2022-05-27 南京航空航天大学 Dense cable segmentation method and system
CN114943113B (en) * 2022-07-26 2022-11-01 江西少科智能建造科技有限公司 Method, system, storage medium and device for arranging diffusers in polygonal rooms

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107748888A (en) * 2017-10-13 2018-03-02 众安信息技术服务有限公司 A kind of image text row detection method and device
CN109711508A (en) * 2017-10-25 2019-05-03 北京京东尚科信息技术有限公司 Image processing method and device
CN109951654A (en) * 2019-03-06 2019-06-28 腾讯科技(深圳)有限公司 A kind of method of Video Composition, the method for model training and relevant apparatus
WO2020000879A1 (en) * 2018-06-27 2020-01-02 北京字节跳动网络技术有限公司 Image recognition method and apparatus

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130061132A1 (en) * 2010-05-19 2013-03-07 Li-Wei Zheng System and method for web page segmentation using adaptive threshold computation
WO2012055067A1 (en) * 2010-10-26 2012-05-03 Hewlett-Packard Development Company, L.P. Extraction of content from a web page
CN102567300B (en) * 2011-12-29 2013-11-27 方正国际软件有限公司 Picture document processing method and device
JP5794154B2 (en) 2012-01-23 2015-10-14 富士通株式会社 Image processing program, image processing method, and image processing apparatus
US9251580B2 (en) * 2013-08-23 2016-02-02 Cimpress Schweiz Gmbh Methods and systems for automated selection of regions of an image for secondary finishing and generation of mask image of same
CN103885712B (en) * 2014-03-21 2017-08-15 小米科技有限责任公司 Webpage method of adjustment, device and electronic equipment
CN104951741A (en) * 2014-03-31 2015-09-30 阿里巴巴集团控股有限公司 Character recognition method and device thereof
US20190065589A1 (en) 2016-03-25 2019-02-28 Quad Analytix Llc Systems and methods for multi-modal automated categorization
CN110334706B (en) * 2017-06-30 2021-06-01 清华大学深圳研究生院 Image target identification method and device
CN108446697B (en) * 2018-03-06 2019-11-12 平安科技(深圳)有限公司 Image processing method, electronic device and storage medium
CN109325201A (en) * 2018-08-15 2019-02-12 北京百度网讯科技有限公司 Generation method, device, equipment and the storage medium of entity relationship data
CN110555839A (en) * 2019-09-06 2019-12-10 腾讯云计算(北京)有限责任公司 Defect detection and identification method and device, computer equipment and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107748888A (en) * 2017-10-13 2018-03-02 众安信息技术服务有限公司 A kind of image text row detection method and device
CN109711508A (en) * 2017-10-25 2019-05-03 北京京东尚科信息技术有限公司 Image processing method and device
WO2020000879A1 (en) * 2018-06-27 2020-01-02 北京字节跳动网络技术有限公司 Image recognition method and apparatus
CN109951654A (en) * 2019-03-06 2019-06-28 腾讯科技(深圳)有限公司 A kind of method of Video Composition, the method for model training and relevant apparatus

Also Published As

Publication number Publication date
JP2021152901A (en) 2021-09-30
CN113538450A (en) 2021-10-22
KR20210040305A (en) 2021-04-13
KR102648760B1 (en) 2024-03-15
EP3828766A2 (en) 2021-06-02
JP7213291B2 (en) 2023-01-26
US11810333B2 (en) 2023-11-07
EP3828766A3 (en) 2021-10-06
US20210264614A1 (en) 2021-08-26

Similar Documents

Publication Publication Date Title
CN113538450B (en) Method and device for generating image
CN111709878B (en) Face super-resolution implementation method and device, electronic equipment and storage medium
US20210350541A1 (en) Portrait extracting method and apparatus, and storage medium
CN114550177A (en) Image processing method, text recognition method and text recognition device
EP4080469A2 (en) Method and apparatus of recognizing text, device, storage medium and smart dictionary pen
CN113408251B (en) Layout document processing method and device, electronic equipment and readable storage medium
CN110032419A (en) The methods of exhibiting and display systems of threedimensional model
CN114117128A (en) Method, system and equipment for video annotation
CN113923474B (en) Video frame processing method, device, electronic equipment and storage medium
JP2022536320A (en) Object identification method and device, electronic device and storage medium
CN113837194B (en) Image processing method, image processing apparatus, electronic device, and storage medium
CN108734718B (en) Processing method, device, storage medium and equipment for image segmentation
CN112508005B (en) Method, apparatus, device and storage medium for processing image
US10963690B2 (en) Method for identifying main picture in web page
CN113780297A (en) Image processing method, device, equipment and storage medium
CN116259064B (en) Table structure identification method, training method and training device for table structure identification model
CN112528610A (en) Data labeling method and device, electronic equipment and storage medium
CN114882313B (en) Method, device, electronic equipment and storage medium for generating image annotation information
CN115719444A (en) Image quality determination method, device, electronic equipment and medium
CN113038184B (en) Data processing method, device, equipment and storage medium
CN115082298A (en) Image generation method, image generation device, electronic device, and storage medium
CN113221742B (en) Video split screen line determining method, device, electronic equipment, medium and program product
CN113051504B (en) Document preview method, device, apparatus, storage medium and program product
CN113033333A (en) Entity word recognition method and device, electronic equipment and storage medium
US20230119741A1 (en) Picture annotation method, apparatus, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant