WO2024051632A1 - Image processing method, apparatus, medium and device - Google Patents

Image processing method, apparatus, medium and device

Info

Publication number
WO2024051632A1
WO2024051632A1, PCT/CN2023/116675, CN2023116675W
Authority
WO
WIPO (PCT)
Prior art keywords
image
elements
size
target
background
Prior art date
Application number
PCT/CN2023/116675
Other languages
English (en)
French (fr)
Inventor
詹科
刘银星
张政
吕晶晶
王维珍
阮涛
Original Assignee
北京沃东天骏信息技术有限公司 (Beijing Wodong Tianjun Information Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wodong Tianjun Information Technology Co., Ltd. (北京沃东天骏信息技术有限公司)
Publication of WO2024051632A1

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/10 Image acquisition
    • G06V10/16 Image acquisition using multiple overlapping images; Image stitching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/141 Image acquisition using multiple overlapping images; Image stitching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/15 Cutting or merging image elements, e.g. region growing, watershed or clustering-based techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/18 Extraction of features or characteristics of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/19007 Matching; Proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147 Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173 Classification techniques

Definitions

  • the embodiments of the present application relate to the field of image processing technology, for example to an image processing method, apparatus, medium and device.
  • the current approach to expanding an image to a new size mainly crops the original image: two scaling ratios are first computed from the current width and height and the target width and height, the image is scaled proportionally by the larger ratio so that the new image has redundant space in width or height relative to the target size, and the less important parts of the top, bottom, left and right borders are then identified and cropped away to obtain the cropped target image.
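This crop-based approach can be sketched as follows; this is only an illustration of the prior-art scheme described above, and the function name and return convention are hypothetical:

```python
def crop_resize_plan(img_w, img_h, target_w, target_h):
    """Plan a crop-based resize: scale by the larger of the two
    width/height ratios, then crop the surplus in the other dimension."""
    scale = max(target_w / img_w, target_h / img_h)
    new_w, new_h = round(img_w * scale), round(img_h * scale)
    # Redundant space relative to the target size, to be cropped away
    # from whichever borders are judged least important.
    return new_w, new_h, new_w - target_w, new_h - target_h
```

For a 500x500 image and a 1000x1200 target, the larger ratio is 2.4, giving a 1200x1200 intermediate image with 200 surplus pixels of width to crop.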
  • this application provides an image processing method, apparatus, medium and device to improve the coordination of visual effects during display without losing key information of the image during image fission processing.
  • an image processing method which method includes:
  • based on the size difference between the target image size and the base image size and the deformation type of each image element, the processing method of each image element is determined, and each image element is processed based on the determined processing method to obtain processed image elements;
  • the processed image elements are spliced to obtain the target image.
  • an image processing device which device includes:
  • a base image determination module configured to determine, based on the target image size, a base image that matches the target image size;
  • an image element extraction module configured to extract image elements from the base image;
  • an image element processing module configured to determine the processing method of each image element based on the size difference between the target image size and the base image size and the deformation type of each image element, and to process each image element based on the determined processing method to obtain processed image elements;
  • a target image generation module configured to splice the processed image elements to obtain the target image.
  • an electronic device including:
  • the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor can perform the image processing method described in any embodiment of the present application.
  • a computer-readable storage medium stores computer instructions which, when executed by a processor, implement the image processing method of any embodiment of the present application.
  • Figure 1 is a schematic flow chart of an image processing method provided by an embodiment of the present application.
  • Figure 2 is a schematic flow chart of another image processing method provided by an embodiment of the present application.
  • Figure 3 is a schematic flow chart of another image processing method provided by an embodiment of the present application.
  • Figure 4 is a schematic structural diagram of an image processing device provided by an embodiment of the present application.
  • Figure 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • Figure 1 is a flow chart of an image processing method provided by an embodiment of the present application. This embodiment is applicable to scenarios where the image size needs to be modified.
  • the method can be executed by an image processing device, which can be implemented in software and/or hardware and configured on electronic computing equipment; the method includes the following steps:
  • Step 110: Based on the target image size, determine a base image that matches the target image size.
  • Step 120: Extract image elements in the base image.
  • Step 130: Based on the size difference between the target image size and the base image size, and the deformation type of each image element, determine the processing method of each image element, and process each image element based on the determined processing method to obtain processed image elements.
  • Step 140: Splice the processed image elements to obtain the target image.
  • the target image can be understood as an image that needs to be displayed at a preset display location.
  • the target image can be an advertising image displayed in a preset advertising promotion position in the current interface for promotion; of course, the target image can also be an announcement image displayed in a preset announcement position.
  • this embodiment places no restrictions on the image type or display location of the target image. On that basis, different display locations have different display sizes, which leads to different target image sizes when the same base image is displayed in different locations; for example, a poster image displayed on the exterior wall of a shopping mall and the same poster image played on a bus display screen have different target image sizes.
  • the target image size may be determined based on the display size of the target image during display.
  • the target image size is preset to be consistent with the display size of the display location.
  • the base image can be understood as the image before target-image fission. In other words: to match different display positions, different image ratios are set in advance for the same base image; for any base image, if the image ratio of the base image is consistent with the size ratio of the display position, the image size of the base image is determined accordingly and the base image is directly displayed as the target image in the display position.
  • conversely, if the image ratio of the base image is inconsistent with the size ratio of the display position, the image ratio closest to the size ratio of the display position is selected, and image fission processing is applied to the base image based on the size difference between the two ratios to obtain the target image corresponding to the base image, which is then displayed in the placement.
  • the image ratio of the base image may include but is not limited to 3:1, 2:1, 1:1, 1.2:1 and 0.5:1.
  • Image fission processing includes image scaling processing and image stretching processing in width and/or height.
  • the display size of the display position is obtained, the target image size of the target image to be displayed is determined based on the display size, and the base image that matches the target image size is determined based on the image ratio corresponding to the target image size.
  • matching can be understood as follows: the image ratio corresponding to the target image size and the image ratio of the base image are equal, or the difference between the image ratio corresponding to the target image size and the image ratio of the base image is within a preset ratio range.
  • for example, the image ratio corresponding to the target image size may be determined to be 1:1.3. It is then determined whether any of the preset image ratios of the base image is consistent with this ratio; when none is, the preset image ratio closest to the image ratio of the target image size is selected. In this embodiment the closest image ratio is 1:1.2, so the base image corresponding to that image ratio is the base image that matches the target image size.
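The closest-ratio selection can be sketched as below; the preset list is assumed for illustration, combining the ratios listed earlier with the 1:1.2 ratio from this example:

```python
# Preset base-image ratios as (width, height) pairs; this particular
# set is an assumption for illustration, not fixed by the application.
PRESET_RATIOS = [(3, 1), (2, 1), (1, 1), (1.2, 1), (1, 1.2), (0.5, 1)]

def closest_preset_ratio(target_w, target_h, presets=PRESET_RATIOS):
    """Return the preset ratio whose width/height value is closest
    to the target's width/height value."""
    target = target_w / target_h
    return min(presets, key=lambda wh: abs(wh[0] / wh[1] - target))
```

For a target of ratio 1:1.3 (e.g. 1000x1300), the closest preset in this set is 1:1.2, matching the example in the text.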
  • the target image is obtained after performing image fission processing on the base image.
  • the image element can be understood as the image content in the basic image, and different image contents belong to different element types.
  • image elements in advertising images can include logo elements, product elements, copywriting elements, face elements, human body elements, etc.
  • image elements in announcement images can include official seal elements, text elements, and other elements.
  • each image element in the basic image can be determined based on the selection instructions for each image element triggered by the user respectively.
  • the base image can also be input into each image element extraction model separately to obtain the image element extraction result output by each model.
  • in the process of identifying image elements, both the element type and the element position of each image element need to be identified.
  • a selection instruction triggered by the user is received to determine the corresponding selection result. For example, if the user frame-selects a product element and inputs the corresponding element type as product element, then the element selected by the user is determined to be a product element, and the frame-selected region is the position of the product element.
  • each element recognition model includes but is not limited to a logo element recognition model, a product element recognition model, a copywriting element recognition model, a face element recognition model, a human body element recognition model and other element recognition models.
  • the basic images are input into each element recognition model respectively, and the recognition results output by each element recognition model are obtained respectively.
  • taking the product element recognition model as an example, the base image is input into the product element recognition model to obtain the element recognition result output by the product element recognition model.
  • the element recognition result can be a classification result; that is, when the output result is 1, it means that the base image contains a product element, and the position of the product element is output at the same time, for example the image pixels covered by the product element in the base image.
  • model training is first performed on each element recognition model.
  • the training method of any element recognition model includes: obtaining the background image and element data, performing enhancement processing on the element data to obtain multiple enhanced element data, and setting the enhanced element data in the background image to obtain training samples.
  • an image synthesis program can be used to synthesize advertising images containing products and use them as training samples for the product element recognition model. For example, select a batch of pictures as background images, then select a batch of products and paste them at random positions on the pictures; the paste coordinates and categories are known and serve as label information. During pasting, Gaussian blur, salt-and-pepper noise, image flipping, random cropping, scaling, and color channel replacement are randomly applied. These data augmentation methods increase the diversity of the data set and help improve the accuracy of model recognition.
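The sample-synthesis step can be sketched with NumPy as below. The label format is hypothetical, and only a random flip is shown out of the augmentations listed above:

```python
import random
import numpy as np

def synthesize_sample(background, product, rng=None):
    """Paste a product patch onto a background at a random position,
    recording the paste coordinates and category as label information."""
    rng = rng or random.Random(0)
    bh, bw = background.shape[:2]
    ph, pw = product.shape[:2]
    x, y = rng.randint(0, bw - pw), rng.randint(0, bh - ph)
    patch = product[:, ::-1] if rng.random() < 0.5 else product  # random flip
    sample = background.copy()
    sample[y:y + ph, x:x + pw] = patch
    return sample, {"category": "product", "bbox": (x, y, x + pw, y + ph)}
```

Each synthesized sample carries its bounding box and category, so no manual annotation is needed for these training images.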
  • the product element recognition model to be trained is iteratively trained based on the sample label to obtain the trained product element recognition model.
  • the logo element recognition model, face element recognition model, human body element recognition model and other element recognition models similar to image recognition can be trained using the above method.
  • the element recognition model similar to image recognition can be a network structure module, such as a convolutional neural network, a multi-layer perceptron, etc., which is not limited.
  • the element recognition model can be a YOLOv5 model.
  • for copywriting elements, a Chinese corpus can be downloaded from the Internet and combined with an existing copywriting library. For example, select a batch of pictures as background images, randomly select copy from the copy library, write it onto the background image at random using Python, and add rotation, tilt, projective transformation and Gaussian blur to increase sample richness, thereby improving the recognition accuracy of the model.
  • the element recognition model similar to text recognition can be a network structure module, such as a convolutional neural network, a multi-layer perceptron, etc., which is not limited.
  • the element identification model can be the EAST model.
  • each image element in the basic image is identified, each image element is extracted, and fission processing is performed on the image element.
  • based on the size difference between the target image size and the base image size and the deformation type of each image element, the processing method of each image element is determined, and each image element is processed based on the determined processing method to obtain the processed image elements.
  • the size difference between the target image size and the base image size includes a scaling difference and/or an aspect ratio difference;
  • the deformation types of image elements include non-deformable, slightly deformable and deformable;
  • the processing methods for non-deformable image elements include proportional scaling;
  • the processing methods for slightly deformable image elements include proportional scaling and stretching within a preset deformation range;
  • the processing methods for deformable image elements include proportional scaling and stretching in arbitrary proportions.
  • the scaling difference can be understood as a difference that can be eliminated by enlarging or reducing the image length and image width of the current image by the same ratio. For example, if the target image size is 1000mm*1000mm and the base image size is 500mm*500mm, the size difference between the target image size and the base image size can be eliminated by proportional enlargement.
  • the aspect ratio difference can be understood as a difference that can be eliminated by stretching the image length or the image width of the current image. For example, if the target image size is 1000mm*1200mm and the base image size is 1000mm*1000mm, the size difference between the target image size and the base image size can be eliminated by stretching.
  • if the target image size is 1000mm*1200mm and the base image size is 500mm*500mm, the size difference between the target image size and the base image size can be eliminated by first scaling proportionally and then stretching.
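The scale-then-stretch decomposition in this example can be written as a short helper; the function name and return convention are illustrative:

```python
def decompose_size_difference(base_w, base_h, target_w, target_h):
    """Split the size difference into a proportional scaling factor plus
    residual stretch factors for width and height."""
    scale = min(target_w / base_w, target_h / base_h)   # scaling difference
    stretch_w = target_w / (base_w * scale)             # aspect ratio difference
    stretch_h = target_h / (base_h * scale)
    return scale, stretch_w, stretch_h
```

For the 500mm*500mm base and 1000mm*1200mm target above, this yields a proportional scale of 2 followed by a 1.2x stretch in height.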
  • non-deformable image elements are those whose shape ratio must maintain a specific proportion and cannot change during fission processing, for example logo elements, face elements and human body elements in advertising images; for instance, a logo must keep its preset shape. Slightly deformable image elements are those whose shape ratio must stay within a preset range during fission processing and cannot change beyond that range, for example product elements in advertising images; for instance, a product's shape can be widened or stretched within a preset range. Deformable image elements are those whose shape ratio is unrestricted during fission processing and can change arbitrarily, for example copywriting elements in advertising images; for instance, the text size and font of the copy can be changed arbitrarily.
  • for example, when the base image is enlarged proportionally, because image elements of every deformation type support proportional scaling, each image element in the base image can be directly enlarged by the same factor (a factor of 2 in the example above) to obtain the processed image elements.
  • when the base image is stretched in width, if the base image includes non-deformable and slightly deformable image elements, the non-deformable image elements are not stretched; for slightly deformable image elements, if the size difference is within the preset deformation range they are stretched directly based on the size difference, and if it is not, they are stretched only within the preset deformation range; arbitrarily deformable image elements are stretched directly based on the size difference, yielding the processed image elements.
  • for example, if the target image size is 1000mm*1200mm and the base image size is 500mm*500mm, non-deformable image elements are not stretched, slightly deformable image elements are stretched in width within the preset stretching range, and deformable image elements are stretched in width based on the size difference, thereby obtaining the processed image elements.
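The per-deformation-type stretch rule can be sketched as follows. The limit for slightly deformable elements is an assumed placeholder, since the application leaves the preset deformation range unspecified:

```python
DEFORM_LIMITS = {
    "non_deformable": 1.0,       # proportional scaling only, never stretched
    "slightly_deformable": 1.1,  # assumed preset deformation range
    "deformable": float("inf"),  # may be stretched in arbitrary proportions
}

def applied_stretch(deform_type, requested):
    """Clamp the requested stretch factor to the element's deformation limit."""
    return min(requested, DEFORM_LIMITS[deform_type])
```

With a requested 1.2x stretch, a logo stays unstretched, a product is stretched only to its preset limit, and copywriting takes the full stretch.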
  • the processed image elements are spliced to obtain the target image.
  • at least one layout distribution of the target image can be determined, and the processed image elements are spliced based on the layout distribution to obtain a target image corresponding to each layout distribution.
  • the layout distribution can be interpreted as the layout relationship of each image element in the image.
  • the layout distribution may include, but is not limited to, layouts such as picture above with text below, text above with picture below, text on the left with picture on the right, picture on the left with text on the right, and text centered with the product on both sides.
  • in the case where the base image does not include an image background, after the spliced images corresponding to the at least one layout distribution are generated, the rationality probability of each spliced image is calculated, and the spliced images that meet the rationality requirement are determined based on a preset rationality threshold.
  • the qualifying spliced images are displayed to the user, and the target image corresponding to the base image is generated based on the user's instruction selecting a target image among them; alternatively, the rationality probabilities of the spliced images can be calculated and the spliced image with the largest probability value used directly as the target image.
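The threshold-then-select logic can be sketched as below; the scores, threshold value and tuple format are assumed inputs for illustration:

```python
def select_target(spliced, threshold=0.5, user_choice=None):
    """Filter spliced images by a rationality threshold; return the user's
    pick if it qualifies, otherwise the highest-probability image."""
    qualified = [(img, p) for img, p in spliced if p >= threshold]
    pool = qualified or spliced        # fall back if none qualify
    if user_choice is not None and any(img == user_choice for img, _ in pool):
        return user_choice
    return max(pool, key=lambda item: item[1])[0]
```

This covers both variants described above: user selection among qualifying composites, and automatic selection of the highest-scoring one.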
  • the spliced target image can also be determined based on other methods, which is not limited in this embodiment.
  • the technical solution of this embodiment determines a base image that matches the target image size based on the target image size; extracts the image elements in the base image; determines the processing method of each image element based on the size difference between the target image size and the base image size and the deformation type of each image element, and processes the image elements based on the determined processing methods to obtain processed image elements; and splices the processed image elements to obtain the target image.
  • Figure 2 is a flow chart of another image processing method provided by an embodiment of the present application.
  • the embodiment of the present application can be combined with various options in the above embodiments.
  • extracting image elements in the basic image includes:
  • the image elements to be extracted are updated based on the image element recognition results, the layout type of the basic image is determined based on the updated positional relationship of the image elements to be extracted, and the image elements to be extracted are extracted based on the layout type.
  • the method in the embodiment of this application includes the following steps:
  • Step 210: Based on the target image size, determine a base image that matches the target image size.
  • Step 220: Input the base image into multiple element recognition models separately, and obtain the image element recognition results output by each element recognition model, where the image element recognition results include the positions of the image elements.
  • Step 230: Update the image elements to be extracted based on the image element recognition results, determine the layout type of the base image based on the updated positional relationships of the image elements to be extracted, and extract the image elements to be extracted based on the layout type.
  • Step 240: Based on the size difference between the target image size and the base image size, and the deformation type of each image element, determine the processing method of each image element, and process each image element based on the determined processing method to obtain processed image elements.
  • Step 250: Splice the processed image elements to obtain the target image.
  • for example, a logo in the form of text may be recognized both as text and as a logo, and a product image that contains text may be recognized both as a product image and as text, so the positions recognized for the same image content overlap.
  • the image elements to be extracted are therefore updated, the layout type of the base image is determined based on the updated positional relationships of the image elements to be extracted, and the image elements to be extracted are extracted based on the layout type, which improves the recognition accuracy of each image element.
  • the method of determining the layout type of the base image may include: determining multiple image elements with a positional overlap relationship based on the position of each image element; determining the subordination relationship of these overlapping image elements based on the priority of each image element, and updating multiple image elements with a subordination relationship into a single independent image element; and determining the layout type of the base image based on the positional relationships between the independent image elements.
  • the position of each image element output by the element recognition model is determined, and based on the position of each image element, multiple image elements having a position overlapping relationship are determined.
  • positional overlap can be understood as having at least two image elements arranged on the same pixel point in the basic image.
  • the priority of each image element with a positional overlap relationship is determined; based on these priorities, the subordination relationship of the overlapping image elements is determined, the image elements in a subordination relationship are updated into a single independent image element, and the layout type of the base image is then determined from the positional relationships between the independent image elements.
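One way to sketch this overlap-and-priority merge; the bounding-box format and the priority ordering are assumptions, since the application does not fix either:

```python
PRIORITY = {"product": 3, "human": 2, "copy": 1}  # assumed ordering

def boxes_overlap(a, b):
    """True when two (x0, y0, x1, y1) boxes share at least one pixel."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def merge_overlapping(elements):
    """Keep the highest-priority element of each overlapping group as the
    independent element; overlapping lower-priority elements are subsumed."""
    independent = []
    for elem in sorted(elements, key=lambda e: -PRIORITY[e["type"]]):
        if not any(boxes_overlap(elem["bbox"], kept["bbox"])
                   for kept in independent):
            independent.append(elem)
    return independent
```

In the example above, a copywriting element and a human body element overlapping a product element would be subsumed into a single independent product element, while non-overlapping elements remain independent.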
  • the layout types include picture above with text below, text above with picture below, text on the left with picture on the right, picture on the left with text on the right, and text centered with products on both sides.
  • the image elements in the basic image include first copywriting elements, logo elements, second copywriting elements, product elements and human body elements.
  • based on the position of each image element, it is determined that the product element, the human body element and the second copywriting element have a positional overlap relationship, and they are updated into a single independent product element. The positional relationships among the independent product element, the first copywriting element and the logo element are then determined, and the layout type of the base image is determined from these positional relationships, where the layout type is determined by the relative positions of copywriting elements and image elements such as face elements, product elements, human body elements and logo elements.
  • the base image can be divided into regions based on the layout type. Taking the example above, the base image is divided into two parts, namely the upper image area and the lower copywriting area; graphic elements are extracted from the image area, and copywriting elements are extracted from the copywriting area below. When text is contained within the image elements, copywriting elements are not extracted from the image area, to avoid repeated extraction of image elements.
  • the technical solution of this embodiment determines a base image matching the target image size based on the target image size; inputs the base image into multiple element recognition models and obtains the image element recognition results output by each model, where the results include the position of each image element; updates the image elements to be extracted based on the recognition results, determines the layout type of the base image based on the updated positional relationships, and extracts the image elements to be extracted based on the layout type; determines the processing method of each image element based on the size difference between the target image size and the base image size and the deformation type of each element, and processes each element based on the determined processing method to obtain processed image elements; and splices the processed image elements to obtain the target image.
  • By updating the image elements to be extracted when recognized elements in the basic image overlap in position, and determining the layout type of the basic image from the updated positional relationships, the above solution improves the accuracy of image element recognition and thereby the coordination of the visual effect of the spliced target image during display.
  • FIG. 3 is a flow chart of another image processing method provided by an embodiment of the present application.
  • the embodiment of the present application can be combined with various options in the above embodiments.
  • Optionally, in the embodiment of the present application, after the processed image elements are spliced, the method further includes:
  • when a background extension area exists in the spliced image, obtaining the image edge or background edge adjacent to the background extension area, and obtaining the derived background corresponding to the background extension area based on the color data of the image edge or background edge.
  • the method in the embodiment of this application includes the following steps:
  • Step 310 Based on the target image size, determine a basic image that matches the target image size.
  • Step 320 Extract image elements in the basic image.
  • Step 330 Based on the size difference between the target image size and the basic image size, and the deformation type of each image element, determine the processing method of each image element, and process the image element based on the determined processing method to obtain the processed image element.
  • Step 340: Splice the processed image elements; when a background extension area exists in the spliced image, obtain the image edge or background edge adjacent to the background extension area and obtain the derived background corresponding to the extension area based on that edge's color data, thereby obtaining the target image.
  • The above embodiments introduced the technical solution of directly splicing the processed elements to obtain the target image when the basic image does not include an image background; it is not repeated here.
  • When the basic image includes an image background, the image background undergoes fission processing so that the processed background adapts to the image background of the target image, improving the coordination of the visual effect during display.
  • Optionally, if the image ratio of the spliced image differs from that of the base image and the background elements are stretchable, the background area can be directly scaled and/or stretched based on the size difference between the spliced image and the base image to obtain a target background matching the spliced image size.
  • The background extension area can lie in any direction (up, down, left, or right) of the basic image, which is not limited in this embodiment.
  • the image edge or background edge adjacent to the background extension area in the spliced image is obtained, and a derived background corresponding to the background extension area is obtained based on the color data of the image edge or background edge.
  • Exemplarily, any pixel in the image edge or background edge can be selected and its color data used as the derived background for the background extension area; optionally, the average color data of the pixels within a preset range of the adjacent image edge or background edge can be used as the derived background; optionally, the color data of each pixel within that preset range can be determined, and the intermediate color between the largest and smallest color values used as the derived background for the extension area.
  • the above method of determining the derived background is only an optional embodiment, and the actual derived background generation method can also be determined based on the data of image edges or background edges adjacent to the background extension area, which is not limited in this embodiment.
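The three edge-color strategies above can be sketched as small helper functions. This is a hedged illustration, not part of the application: colors are modeled as (R, G, B) tuples, `edge_pixels` stands in for the pixels of the image edge or background edge adjacent to the extension area, and all function names are assumptions.

```python
def derive_from_single_pixel(edge_pixels):
    """Strategy 1: reuse the color of any one edge pixel (here, the first)."""
    return edge_pixels[0]

def derive_from_average(edge_pixels):
    """Strategy 2: average the colors of the edge pixels in the preset range."""
    n = len(edge_pixels)
    return tuple(sum(p[c] for p in edge_pixels) // n for c in range(3))

def derive_from_midpoint(edge_pixels):
    """Strategy 3: midpoint between the largest and smallest channel values."""
    return tuple(
        (max(p[c] for p in edge_pixels) + min(p[c] for p in edge_pixels)) // 2
        for c in range(3)
    )
```

Each function returns a single fill color that can then be painted over the extension area.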
  • Optionally, if the image ratio of the spliced image differs from that of the base image and the background contains non-stretchable elements that are repeated and arbitrarily arranged, the background of the base image can be obtained and cropped based on the target image size to obtain a target background conforming to the target image size; the processed image elements are then spliced onto the target background to obtain the target image.
  • Exemplarily, the size of the background extension area in the spliced image is determined, a background region of the same size is cropped from the base image and spliced at the position of the extension area to obtain a target background of the target image size, and the processed image elements are then spliced onto the target background to obtain the target image.
  • The technical solution of this embodiment includes: determining, based on the target image size, a base image that matches it; extracting the image elements in the base image; determining the processing method of each image element based on the size difference between the target image size and the base image size and on each element's deformation type, and processing the elements accordingly to obtain the processed image elements; and splicing the processed image elements and, when a background extension area exists in the spliced image, obtaining the image edge or background edge adjacent to the extension area and deriving the corresponding background from that edge's color data, thereby obtaining the target image.
  • When the base image includes an image background, the above solution performs fission processing on the background of the spliced image so that it adapts to the size of the display position, improving the coordination of the visual effect during display.
  • An image processing device provided by the embodiments of this application can execute the image processing method provided by any embodiment of this application and has the functional modules corresponding to the method.
  • the image processing device and the image processing method in the above-mentioned embodiments belong to the same inventive concept. For details that are not described in detail in the embodiments of the image processing device, please refer to the embodiments of the above-mentioned image processing method.
  • FIG 4 is a structural diagram of an image processing device provided by an embodiment of the present application.
  • The image processing device includes: a basic image determination module 410, an image element extraction module 420, an image element processing module 430, and a target image generation module 440; wherein,
  • the base image determination module 410 is configured to determine, based on the target image size, a base image that matches the target image size;
  • the image element extraction module 420 is configured to extract the image elements in the base image;
  • the image element processing module 430 is configured to determine the processing method of each image element based on the size difference between the target image size and the basic image size and on each element's deformation type, and to process the image elements accordingly to obtain the processed image elements;
  • the target image generation module 440 is configured to splice the processed image elements to obtain the target image.
  • the image element extraction module 420 includes:
  • the image element recognition result acquisition submodule is configured to input the basic image into multiple element recognition models and to obtain the image element recognition results output by each of the element recognition models, wherein the image element recognition results include the positions of the image elements;
  • the image element extraction submodule is configured to extract each image element from the basic image based on the image element recognition result.
  • the device further includes: a model training module for each element recognition model; the model training module for any element recognition model includes:
  • a training sample image acquisition unit is configured to acquire a background image and element data, perform enhancement processing on the element data to obtain multiple enhanced element data, set the enhanced element data in the background image to obtain a training sample image, and record the element type of the enhanced element data and its set position in the background image;
  • the element recognition model training unit is configured to iteratively train the element recognition model to be trained based on the training sample image, its corresponding element type, and the set position in the background image, to obtain the trained element recognition model.
  • Optionally, the image element extraction submodule includes:
  • a layout type determination unit configured to update the image elements to be extracted based on the image element recognition results, and determine the layout type of the basic image based on the updated positional relationship of the image elements to be extracted;
  • An image element extraction unit configured to extract the image elements to be extracted based on the layout type.
  • the layout type determination unit includes:
  • the image element determination subunit is configured to determine, based on the position of each image element, multiple image elements with positional overlapping relationships;
  • the image element update subunit is configured to determine the affiliation relationship of the multiple image elements with positional overlapping relationships based on the priority of each image element, and update the multiple image elements with the affiliation relationship into an independent image element;
  • the layout type determination subunit is configured to determine the layout type of the basic image based on the positional relationship between independent image elements in the basic image.
  • the size difference between the target image size and the base image size includes a scaling difference and/or an aspect ratio difference;
  • the deformation types of image elements include non-deformable, slightly deformable, and deformable;
  • the processing method of non-deformable image elements includes proportional scaling; that of slightly deformable image elements includes proportional scaling and stretching within a preset deformation range; and that of deformable image elements includes proportional scaling and stretching at any ratio.
  • the target image generation module 440 includes:
  • the first target image generating unit is configured to determine at least one layout distribution of the target image, and splice the processed image elements based on the at least one layout distribution to obtain a target image corresponding to each layout distribution.
  • the target image generation module 440 includes:
  • a target background generation unit configured to obtain the background of the base image and perform clipping processing on the background based on the size of the target image to obtain a target background that conforms to the size of the target image;
  • the second target image generating unit is configured to splice the processed image elements on the target background to obtain a target image.
  • the device further includes:
  • a derived background generation module configured to obtain, when a background extension area exists in the spliced image, the image edge or background edge adjacent to the background extension area, and to obtain the derived background corresponding to the background extension area based on the color data of the image edge or background edge.
  • FIG. 5 shows a schematic structural diagram of an electronic device 10 that can be used to implement embodiments of the present application.
  • Electronic devices are intended to refer to various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (such as helmets, glasses, and watches), and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementation of the present application as described and/or claimed herein.
  • The electronic device 10 includes at least one processor 11 and a memory communicatively connected to the at least one processor 11, such as a read-only memory (ROM) 12 and a random access memory (RAM) 13; the memory stores a computer program executable by the at least one processor, and the processor 11 can perform various appropriate actions and processes according to the computer program stored in the ROM 12 or loaded from the storage unit 18 into the RAM 13.
  • The RAM 13 can also store various programs and data required for the operation of the electronic device 10.
  • the processor 11, the ROM 12 and the RAM 13 are connected to each other via the bus 14.
  • An input/output (I/O) interface 15 is also connected to the bus 14 .
  • Multiple components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16, such as a keyboard or mouse; an output unit 17, such as various types of displays and speakers; a storage unit 18, such as a magnetic disk or optical disk; and a communication unit 19, such as a network card, modem, or wireless communication transceiver.
  • the communication unit 19 allows the electronic device 10 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks.
  • Processor 11 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various processors running machine learning model algorithms, digital signal processors (DSP), and any appropriate processor, controller, or microcontroller.
  • the processor 11 performs various methods and processes described above, such as image processing methods.
  • the image processing method may be implemented as a computer program, which is tangibly embodied in a computer-readable storage medium, such as the storage unit 18.
  • part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19 .
  • When the computer program is loaded into the RAM 13 and executed by the processor 11, at least one step of the image processing method described above can be performed.
  • the processor 11 may be configured to perform the image processing method in any other suitable manner (eg, by means of firmware).
  • Various implementations of the systems and techniques described above may be realized in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard parts (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof.
  • These various embodiments may include implementation in at least one computer program executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • Computer programs for implementing the methods of the present application may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing device, such that the computer program, when executed by the processor, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented.
  • A computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on a remote machine or server.
  • a computer-readable storage medium may be a tangible medium that may contain or store a computer program for use by or in connection with an instruction execution system, apparatus, or device.
  • Computer-readable storage media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • the computer-readable storage medium may be a machine-readable signal medium.
  • More specific examples of machine-readable storage media would include an electrical connection based on at least one wire, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM) or flash memory, optical fiber, compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • To provide interaction with a user, the systems and techniques described herein may be implemented on an electronic device having: a display device for displaying information to the user, e.g., a cathode ray tube (CRT) or liquid crystal display (LCD) monitor; and a keyboard and pointing device, e.g., a mouse or trackball, through which the user can provide input to the electronic device.
  • Other kinds of devices may also be used to provide interaction with the user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual, auditory, or tactile feedback), and input from the user may be received in any form, including acoustic, voice, or tactile input.
  • The systems and techniques described herein may be implemented in a computing system that includes back-end components (e.g., a data server), or middleware components (e.g., an application server), or front-end components (e.g., a user's computer having a graphical user interface or web browser through which the user can interact with implementations of the systems and techniques described herein), or any combination of such back-end, middleware, or front-end components.
  • The components of the system may be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), a blockchain network, and the Internet.
  • Computing systems may include clients and servers.
  • Clients and servers are generally remote from each other and typically interact over a communications network.
  • the relationship of client and server is created by computer programs running on corresponding computers and having a client-server relationship with each other.
  • The server can be a cloud server, also known as a cloud computing server or cloud host, a host product in the cloud computing service system that overcomes the drawbacks of traditional physical host and virtual private server (VPS) services, namely difficult management and weak business scalability.


Abstract

An image processing method, apparatus, medium, and device. The method includes: determining, based on a target image size, a basic image that matches the target image size (S110); extracting image elements from the basic image (S120); determining, based on the size difference between the target image size and the basic image size and on the deformation type of each image element, a processing method for each image element, and processing the image elements according to the determined processing methods to obtain processed image elements (S130); and splicing the processed image elements to obtain the target image (S140).

Description

Image processing method, apparatus, medium, and device
This application claims priority to Chinese patent application No. 202211104066.6, filed with the Chinese Patent Office on September 9, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of this application relate to the technical field of image processing, for example, to an image processing method, apparatus, medium, and device.
Background
Current methods for extending an image's size mainly crop the original image: two scaling ratios are first calculated from the current width and height and the target width and height, and the image is scaled proportionally by the larger of the two, so that the new image has redundant space in width or height relative to the target size; the unimportant parts of the top, bottom, left, and right borders are then cropped away to obtain the cropped target image.
In practice, when the original image's size differs greatly from the target size, the cropping-based approach must remove a large portion of the image and may cut away key regions such as text or products, losing key data; stretching the original image also causes severe deformation, reducing the coordination of the visual effect.
Summary
This application provides an image processing method, apparatus, medium, and device, so that during image fission processing, the coordination of the visual effect during display is improved without losing key image information.
According to one aspect of this application, an image processing method is provided, the method including:
determining, based on a target image size, a basic image that matches the target image size;
extracting image elements from the basic image;
determining, based on the size difference between the target image size and the basic image size and on the deformation type of each image element, a processing method for each image element, and processing the image elements according to the determined processing methods to obtain processed image elements;
splicing the processed image elements to obtain the target image.
According to another aspect of this application, an image processing apparatus is provided, the apparatus including:
a basic image determination module configured to determine, based on a target image size, a basic image that matches the target image size;
an image element extraction module configured to extract image elements from the basic image;
an image element processing module configured to determine, based on the size difference between the target image size and the basic image size and on the deformation type of each image element, a processing method for each image element, and to process the image elements according to the determined processing methods to obtain processed image elements;
a target image generation module configured to splice the processed image elements to obtain the target image.
According to another aspect of this application, an electronic device is provided, the electronic device including:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor can perform the image processing method of any embodiment of this application.
According to another aspect of this application, a computer-readable storage medium is provided, the computer-readable storage medium storing computer instructions that, when executed by a processor, implement the image processing method of any embodiment of this application.
Brief Description of the Drawings
FIG. 1 is a flowchart of an image processing method provided by an embodiment of this application;
FIG. 2 is a flowchart of another image processing method provided by an embodiment of this application;
FIG. 3 is a flowchart of another image processing method provided by an embodiment of this application;
FIG. 4 is a structural diagram of an image processing apparatus provided by an embodiment of this application;
FIG. 5 is a structural diagram of an electronic device provided by an embodiment of this application.
Detailed Description
To help those skilled in the art better understand the solutions of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings.
It should be noted that the terms "first", "second", and the like in the description, claims, and drawings of this application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described herein can be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units not expressly listed or inherent to such process, method, product, or device.
FIG. 1 is a flowchart of an image processing method provided by an embodiment of this application. This embodiment is applicable to modifying the size of an image. The method can be executed by an image processing apparatus, which can be implemented by software and/or hardware and configured on an electronic computing device. The method includes the following steps:
Step 110: Based on a target image size, determine a basic image that matches the target image size.
Step 120: Extract image elements from the basic image.
Step 130: Based on the size difference between the target image size and the basic image size, and on the deformation type of each image element, determine a processing method for each image element, and process the image elements according to the determined processing methods to obtain processed image elements.
Step 140: Splice the processed image elements to obtain the target image.
In the embodiments of this application, the target image is an image to be displayed at a preset display position. For example, the target image may be an advertising image displayed in a preset promotion slot on the current interface, or an announcement image displayed at a preset announcement position; this embodiment does not limit the image type or display position of the target image. Because different display positions have different display sizes, the same basic image corresponds to different target image sizes at different positions; for example, the same poster has different target image sizes on a shopping-mall exterior wall and on a bus display screen. To adapt the basic image to the target image size at different display positions, the basic image undergoes image fission processing, that is, its size is extended, so that the resulting target image matches its display slot, improving the coordination of the visual effect during display.
It should be noted that the target image size can be determined from the display size of the target image at the display position; to improve visual coordination, this embodiment presets the target image size to be consistent with the display size of the slot. The basic image is the image before fission; in other words, different image ratios are preset for the same basic image to match different display positions. For any basic image, if its image ratio is consistent with the size ratio of the display position, its image size is determined, and if that size equals the display size, the basic image is displayed directly as the target image at the display position; conversely, if its image ratio is inconsistent with the size ratio of the display position, the image ratio closest to that of the display position is selected, and the basic image is fission-processed according to the size difference between the two ratios to obtain the corresponding target image for display. In this embodiment, the image ratios of the basic image may include, but are not limited to, 3:1, 2:1, 1:1, 1.2:1, and 0.5:1. Fission processing includes proportional scaling and stretching in width and/or height.
Exemplarily, the display size of the display position is obtained, the target image size of the image to be displayed is determined from it, and the basic image matching the target image size is determined from the image ratio corresponding to the target image size. Here, "matching" means that the image ratio corresponding to the target image size equals the image ratio of the basic image, or that the difference between the two ratios is within a preset range.
For example, if the target image size is determined to be 1000 mm x 1300 mm, the corresponding image ratio is 1:1.3. It is determined whether any of the preset image ratios of the basic images is consistent with the ratio corresponding to the target image size; when none is, the preset ratio closest to it is selected. In this example the closest image ratio is 1:1.2, and the basic image with that ratio is the basic image matching the target image size.
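The ratio-matching step above can be sketched as a nearest-ratio lookup. This is an illustrative sketch only: the function name and the representation of a ratio as a single width/height float are assumptions of the sketch, not details from the application.

```python
def closest_ratio(target_w, target_h, preset_ratios):
    """Return the preset width/height ratio nearest to target_w / target_h."""
    target = target_w / target_h
    return min(preset_ratios, key=lambda r: abs(r - target))
```

The basic image stored under the returned ratio would then be the one selected for fission processing.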
On the basis of the above embodiment, the target image is obtained by fission-processing the basic image. To improve the visual coordination of the image elements of the target image during display, the image elements in the basic image are extracted and re-laid-out according to the target image size.
In this embodiment, an image element is image content in the basic image; different image contents belong to different element types. For example, the image elements of an advertising image may include logo elements, product elements, copywriting elements, face elements, and human-body elements; the image elements of an announcement image may include official-seal elements and text elements.
Optionally, the image elements in the basic image can be determined from selection instructions triggered by a user for each element; alternatively, the basic image can be input into each image element extraction model to obtain the extraction results output by each model. During recognition, both the element type and the element position of each image element need to be identified.
Exemplarily, while the user selects image elements in the current basic image, the selection instructions triggered by the user are received to determine the corresponding selection results. For example, if the user draws a box around a product element and enters the corresponding element type "product element", the boxed element is determined to be a product element and the boxed region to be its position.
Exemplarily, based on the preset element types, the element recognition model corresponding to each type is obtained; for example, the models include, but are not limited to, a logo element recognition model, a product element recognition model, a copywriting element recognition model, a face element recognition model, and a human-body element recognition model. The basic image is input into each element recognition model to obtain each model's recognition result. Taking the product element recognition model as an example, the basic image is input into it and its element result is obtained. Optionally, the element recognition result may be a classification result: an output of 1 indicates that the basic image contains a product element, and the element's position is output at the same time, for example the image pixels the product element covers in the basic image. A heat map of the basic image may also be output directly, in which the pixel values of the product element differ from those of other elements, so that whether a product element exists in the basic image, and its position when it does, can be identified.
In some embodiments, before element recognition is performed with the element recognition models, each model is first trained. Optionally, the training method for any element recognition model includes: obtaining a background image and element data; augmenting the element data to obtain multiple augmented element data; placing the augmented element data in the background image to obtain a training sample image, and recording the element type of the augmented element data and its set position in the background image; and iteratively training the element recognition model to be trained based on the training sample image, its corresponding element type, and the set position in the background image, to obtain the trained model.
Exemplarily, for constructing a product sample dataset, an image-synthesis program can compose advertising images containing products to serve as training samples for the product element recognition model. For example, a batch of images is selected as backgrounds, and a batch of products is pasted onto them at random; at paste time the coordinates and category are known and serve as annotation information. During pasting, Gaussian blur, salt-and-pepper noise, image flipping, random cropping, scaling, and color-channel swapping are applied at random as data-augmentation methods to increase the diversity of the dataset, which helps improve the model's recognition accuracy. Using the element types and set positions recorded in the product sample dataset as sample labels, the product element recognition model to be trained is iteratively trained to obtain the trained model. It should be noted that the logo, face, human-body, and other image-recognition-like element recognition models can all be trained in the same way.
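The sample-synthesis idea above can be sketched as follows. This is a simplified, hypothetical stand-in for the synthesis program: it only places elements at random valid positions on a background and records the (type, bounding box) annotations; the augmentation operations (Gaussian blur, salt-and-pepper noise, flipping, cropping, scaling, channel swapping) are omitted, and all names and the label format are assumptions of the sketch.

```python
import random

def synthesize_labels(bg_size, elements, seed=0):
    """elements: list of (element_type, (width, height)); returns label dicts."""
    rng = random.Random(seed)
    bg_w, bg_h = bg_size
    labels = []
    for el_type, (el_w, el_h) in elements:
        # choose a top-left corner so the pasted element stays inside the background
        x = rng.randint(0, bg_w - el_w)
        y = rng.randint(0, bg_h - el_h)
        labels.append({"type": el_type, "bbox": (x, y, x + el_w, y + el_h)})
    return labels
```

In a full pipeline the same loop would also composite the element pixels onto the background and write the labels out in the detector's annotation format.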
On the basis of the above embodiments, an image-recognition-like element recognition model may be a network structure such as a convolutional neural network or multilayer perceptron, which is not limited here. For example, the element recognition model may be a YOLOv5 model.
Exemplarily, for constructing a text sample dataset, a Chinese corpus can be downloaded from the Internet and combined with an existing copywriting library. For example, a batch of images is selected as backgrounds, copy is randomly picked from the library and randomly written onto the background images with Python, and rotation, skew, projective transformation, and Gaussian blur are added to enrich the samples and thereby improve the model's recognition accuracy.
On the basis of the above embodiments, a text-recognition-like element recognition model may likewise be a network structure such as a convolutional neural network or multilayer perceptron, which is not limited here. For example, the element recognition model may be an EAST model.
On the basis of the above embodiment, when the image elements in the basic image have been recognized, they are extracted and fission-processed. Optionally, the processing method of each image element is determined based on the size difference between the target image size and the basic image size and on each element's deformation type, and each element is processed according to the determined method to obtain the processed elements.
It should be explained that in this embodiment the size difference between the target image size and the basic image size includes a proportional-scaling difference and/or an aspect-ratio difference; the deformation types of image elements include non-deformable, slightly deformable, and deformable; the processing of non-deformable elements includes proportional scaling, that of slightly deformable elements includes proportional scaling and stretching within a preset deformation range, and that of deformable elements includes proportional scaling and stretching at any ratio.
In this embodiment, a proportional-scaling difference is one that can be eliminated by enlarging or reducing the image's length and width by the same ratio; in other words, when the target image size is 1000 mm x 1000 mm and the basic image size is 500 mm x 500 mm, the difference between the two sizes can be eliminated by proportional enlargement. An aspect-ratio difference is one that can be eliminated by stretching the image's length or width; in other words, when the target image size is 1000 mm x 1200 mm and the basic image size is 1000 mm x 1000 mm, the difference can be eliminated by stretching. There is also the case where the target image size is 1000 mm x 1200 mm and the basic image size is 500 mm x 500 mm; correspondingly, the difference between the two sizes is eliminated by first scaling proportionally and then stretching.
In this embodiment, a non-deformable image element is one whose shape ratio must remain fixed during fission processing, such as the logo, face, and human-body elements of an advertising image; for example, a logo must keep its preset shape. A slightly deformable element is one whose shape ratio must stay within a preset range and may not change beyond it, such as the product element of an advertising image: the product's shape can be widened or heightened within the preset range. A deformable element is one whose shape ratio is unrestricted and may change arbitrarily, such as the copywriting element of an advertising image: the text size and font of the copy can change freely.
Optionally, when the target image size is 1000 mm x 1000 mm and the basic image size is 500 mm x 500 mm, the basic image is proportionally enlarged to eliminate the size difference; since elements of all deformation types can be scaled proportionally, each element in the basic image is directly enlarged twofold to obtain the processed elements.
Optionally, when the target image size is 1000 mm x 1200 mm and the basic image size is 1000 mm x 1000 mm, the basic image is stretched in width to eliminate the size difference. On this basis, if the basic image contains non-deformable and slightly deformable elements, the non-deformable elements are not stretched; for slightly deformable elements, if the size difference is within the preset deformation range they are stretched directly according to it, and otherwise they are stretched only within the preset range; arbitrarily deformable elements are stretched directly according to the size difference, yielding the processed elements.
Optionally, when the target image size is 1000 mm x 1200 mm and the basic image size is 500 mm x 500 mm, each element can first be enlarged twofold to obtain a 1000 mm x 1000 mm basic image; on this basis, according to each element's deformation type, non-deformable elements are not stretched, slightly deformable elements are stretched in width within the preset stretch range, and deformable elements are stretched in width according to the size difference, yielding the processed elements.
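The per-element rule in the three examples above can be sketched as a small dispatch function. This is a hedged sketch: the deformation-type names and the `max_stretch` bound standing in for the preset deformation range are assumptions, not values from the application.

```python
def process_element(size, scale, stretch_w, deform_type, max_stretch=1.25):
    w, h = size[0] * scale, size[1] * scale   # uniform scaling, allowed for all types
    if deform_type == "non_deformable":       # e.g. logo, face, body: never stretched
        return (w, h)
    if deform_type == "slightly_deformable":  # e.g. product: clamp to preset range
        stretch_w = min(stretch_w, max_stretch)
    return (w * stretch_w, h)                 # deformable, e.g. copy: stretch freely
```

For the 500 mm to 1000 mm x 1200 mm example, `scale` would be 2.0 and `stretch_w` the ratio of the extra width the layout demands.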
On the basis of the above embodiment, the processed image elements are spliced to obtain the target image. Optionally, at least one layout distribution of the target image can be determined, and the processed elements spliced according to each layout distribution to obtain a target image corresponding to each.
In this embodiment, a layout distribution describes the arrangement of the image elements in the image. Exemplarily, layout distributions may include, but are not limited to: text above with picture below, picture above with text below, text left with picture right, picture left with text right, and centered text with products on both sides.
Optionally, if the basic image contains no image background, the processed elements can be spliced directly according to each layout distribution to obtain a spliced image for each; the at least one spliced image is shown to the user, and the target image corresponding to the basic image is generated based on the user's selection instruction among the spliced images.
In some embodiments, after generating the spliced image for each of the at least one layout distribution, a plausibility probability is computed for each spliced image, the spliced images meeting a preset plausibility threshold are shown to the user, and the target image corresponding to the basic image is generated from the user's selection instruction among them; alternatively, after the plausibility probabilities are computed, the spliced image with the highest probability can be taken directly as the target image. The spliced target image can of course also be determined in other ways, which this embodiment does not limit.
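The candidate-selection rule above can be sketched as follows. This is an illustrative sketch only: the plausibility scores are assumed to come from an upstream scoring model, and the names are hypothetical.

```python
def select_candidates(candidates, threshold):
    """candidates: list of (layout_name, plausibility); returns the kept ones."""
    passing = [c for c in candidates if c[1] >= threshold]
    # fall back to the single most plausible layout when none passes the threshold
    return passing if passing else [max(candidates, key=lambda c: c[1])]
```

The passing candidates would then be shown to the user for final selection, or the single fallback taken directly as the target image.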
In the technical solution of this embodiment, a basic image matching the target image size is determined based on the target image size; the image elements in the basic image are extracted; the processing method of each element is determined from the size difference between the target and basic image sizes and from its deformation type, and the elements are processed accordingly to obtain the processed elements; the processed elements are then spliced to obtain the target image. This solution solves the related-art problems of losing image elements during image fission and of deformation caused by stretching, which reduce visual coordination, and achieves improved coordination of the visual effect during image fission processing without losing key image information.
FIG. 2 is a flowchart of another image processing method provided by an embodiment of this application; this embodiment can be combined with the options in the above embodiments. Optionally, in this embodiment, extracting the image elements in the basic image includes:
inputting the basic image into multiple element recognition models to obtain the image element recognition results output by each model, where the results include the position of each image element;
updating the image elements to be extracted based on the recognition results, determining the layout type of the basic image from the updated positional relationships of the elements to be extracted, and extracting the elements according to the layout type.
As shown in FIG. 2, the method of this embodiment includes the following steps:
Step 210: Based on a target image size, determine a basic image matching the target image size.
Step 220: Input the basic image into multiple element recognition models to obtain the recognition results output by each model, where the results include the positions of the image elements.
Step 230: Update the image elements to be extracted based on the recognition results, determine the layout type of the basic image from the updated positional relationships, and extract the elements according to the layout type.
Step 240: Based on the size difference between the target and basic image sizes and on each element's deformation type, determine each element's processing method and process the elements accordingly to obtain the processed elements.
Step 250: Splice the processed elements to obtain the target image.
On the basis of the above embodiments, because the basic image is recognized by element recognition models of different types, one image element may be recognized as elements of different types. Exemplarily, a text-style logo may be recognized both as text and as a logo, and a product image containing text may be recognized both as a product image and as text; the positions of the elements recognized from the same image element then overlap.
In the technical solution of this embodiment, after the element recognition models recognize the elements in the basic image, if the recognized element positions overlap, the elements to be extracted are updated, and the layout type of the basic image is determined from the updated positional relationships of the elements to be extracted; the elements are then extracted according to the layout type, improving the recognition accuracy of each image element.
Optionally, determining the layout type of the basic image may include: determining, from the positions of the elements, multiple elements with positional overlapping relationships; determining, from the priorities of the elements, the affiliation among the overlapping elements and updating the affiliated elements into one independent image element; and determining the layout type from the positional relationships among the independent elements in the basic image.
Exemplarily, based on the positions of the elements output by the recognition models, multiple elements with overlapping positions are determined; positional overlap means that at least two element types occupy the same pixel of the basic image. From the preset priorities of the elements, the priority of each overlapping element is determined, the affiliation among the overlapping elements is derived from those priorities, the affiliated elements are updated into one independent element, and the layout type of the basic image is determined from the positional relationships among the independent elements. Layout types include text above with picture below, picture above with text below, text left with picture right, picture left with text right, and centered text with products on both sides.
Exemplarily, suppose the recognized elements in the basic image include a first copywriting element, a logo element, a second copywriting element, a product element, and a human-body element, and the positions show that the product, human-body, and second copywriting elements overlap. Given the preset priorities of the elements, for example product above human body above second copywriting, the human-body and second copywriting elements are determined to be subordinate to the product element, and the three are updated into one independent product element comprising the product, the human body, and the second copywriting. Since the independent product element, the first copywriting element, and the logo element do not overlap and are each independent image elements, the positional relationships among them are determined, and the layout type of the basic image is determined from those relationships; the layout type is determined based on the relative positions of copywriting elements and image elements such as face, product, human-body, and logo elements.
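The overlap-merge rule illustrated above can be sketched as follows. This is a hedged sketch: the priority table and type names are illustrative (smaller number = higher priority), boxes are (x1, y1, x2, y2) tuples, and the grouping only checks overlap against the seed element, which is enough for this illustration.

```python
PRIORITY = {"product": 0, "body": 1, "copy": 2}

def boxes_overlap(a, b):
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def merge_overlapping(elements):
    """elements: list of (type, (x1, y1, x2, y2)); returns independent elements."""
    merged, used = [], [False] * len(elements)
    for i, (el_type, box) in enumerate(elements):
        if used[i]:
            continue
        group = [(el_type, box)]
        for j in range(i + 1, len(elements)):
            if not used[j] and boxes_overlap(box, elements[j][1]):
                used[j] = True
                group.append(elements[j])
        # the group keeps its highest-priority type; its box covers all members
        top_type = min(group, key=lambda g: PRIORITY[g[0]])[0]
        xs1, ys1, xs2, ys2 = zip(*(g[1] for g in group))
        merged.append((top_type, (min(xs1), min(ys1), max(xs2), max(ys2))))
    return merged
```

In the example above, an overlapping body and copywriting element would be absorbed into an independent product element whose box covers all three.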
By determining the layout type of the basic image, the types of the elements in it can be verified against the layout, improving the accuracy of the types and positions of the elements to be extracted. Exemplarily, the basic image can be divided into regions according to the layout type; taking picture-above-text-below as an example, the basic image is divided into two parts, an upper picture region and a lower copywriting region, and correspondingly, graphic elements are extracted from the upper region and copywriting elements from the lower region. When a graphic element contains text, no copywriting extraction is performed in the picture region, avoiding repeated extraction of image elements.
In the technical solution of this embodiment, a basic image matching the target image size is determined based on the target image size; the basic image is input into multiple element recognition models to obtain each model's recognition results, including the position of each image element; the elements to be extracted are updated from the recognition results, the layout type of the basic image is determined from their updated positional relationships, and the elements are extracted accordingly; each element's processing method is determined from the size difference between the target and basic image sizes and from its deformation type, and the elements are processed accordingly to obtain the processed elements; the processed elements are then spliced to obtain the target image. By updating the elements to be extracted when the recognized elements overlap in position and determining the layout type from the updated relationships, this solution improves the accuracy of image element recognition and thereby the coordination of the visual effect of the spliced target image during display.
FIG. 3 is a flowchart of another image processing method provided by an embodiment of this application; this embodiment can be combined with the options in the above embodiments. Optionally, in this embodiment, after the processed image elements are spliced, the method further includes:
when a background extension area exists in the spliced image, obtaining the image edge or background edge adjacent to the background extension area, and obtaining the derived background corresponding to the extension area based on the color data of the image edge or background edge.
As shown in FIG. 3, the method of this embodiment includes the following steps:
Step 310: Based on a target image size, determine a basic image matching the target image size.
Step 320: Extract the image elements in the basic image.
Step 330: Based on the size difference between the target and basic image sizes and on each element's deformation type, determine each element's processing method and process the elements accordingly to obtain the processed elements.
Step 340: Splice the processed elements; when a background extension area exists in the spliced image, obtain the image edge or background edge adjacent to the background extension area and obtain the derived background corresponding to the extension area based on that edge's color data, thereby obtaining the target image.
在上述各发明实施例的技术方案中,介绍了在基础图像中不包括图像背景的情况下,直接将处理后的元素进行拼接并得到目标图像的技术方案,本实施例在此不再赘述。
本申请实施例中,在基础图像中包括图像背景的情况下,对图像背景的裂变处理,以使裂变后的图像背景适应目标图像的图像背景,以提高展示过程中视觉效果的协调性。
可选的,若拼接后的图像与基础图像的图像比例不一致,图像背景的背景元素可拉伸,则可以直接基于拼接后的图像与基础图像的尺寸差异对背景区域 进行等比例缩放和/或拉伸处理,得到符合拼接后的图像尺寸的目标背景;将处理后的图像元素在目标背景上进行拼接,得到目标图像。
Optionally, if the aspect ratio of the stitched image differs from that of the base image and the image background contains unstretchable background elements, a background extension region may exist in the stitched image. Optionally, the background extension region may lie in any direction relative to the base image (above, below, left, or right); this embodiment does not limit it. In this case, the image edge or background edge adjacent to the background extension region in the stitched image is acquired, and a derived background for the extension region is obtained from the color data of that edge.
Exemplarily, any pixel of the image edge or background edge may be selected and its color data used as the derived background of the extension region; optionally, the average color data of the pixels within a preset range of the adjacent edge may be used; optionally, the color data of each pixel within a preset range of the adjacent edge may be determined, and the color midway between the largest and smallest color values used as the derived background. Of course, these are only optional methods of determining the derived background; the actual generation method may be chosen according to the data of the image edge or background edge adjacent to the background extension region, which this embodiment does not limit.
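The three edge-color strategies above (single sample, average, midpoint between minimum and maximum) can be sketched as follows; the RGB-tuple representation and strategy names are assumptions for illustration:

```python
def derive_background_color(edge_pixels, strategy="average"):
    """Derive a fill color for a background extension region from the
    pixels of the adjacent image or background edge.

    edge_pixels: list of (r, g, b) tuples sampled from the edge.
    strategy: "sample" (any one pixel), "average" (mean per channel),
    or "midpoint" (midway between per-channel min and max).
    """
    if not edge_pixels:
        raise ValueError("no edge pixels")
    if strategy == "sample":
        return edge_pixels[0]
    if strategy == "average":
        n = len(edge_pixels)
        return tuple(sum(p[c] for p in edge_pixels) // n for c in range(3))
    if strategy == "midpoint":
        return tuple(
            (min(p[c] for p in edge_pixels) + max(p[c] for p in edge_pixels)) // 2
            for c in range(3)
        )
    raise ValueError(strategy)
```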
Optionally, if the aspect ratio of the stitched image differs from that of the base image and the image background contains unstretchable background elements, but those elements form a repeating, freely arranged pattern, the background of the base image may be acquired and cropped according to the target image size to obtain a target background matching the target image size; the processed image elements are then stitched onto the target background to obtain the target image.
Exemplarily, the size of the background extension region in the stitched image is determined, a background region of the same size is cropped from the base image and stitched into the position of the background extension region to obtain a target background matching the target image size, and the processed image elements are then stitched onto the target background to obtain the target image.
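The crop-and-stitch idea for repeating backgrounds can be illustrated on a toy 2-D pixel grid. This sketch assumes a right-side extension only; a real implementation would crop and paste regions of the base background with an image library:

```python
def fill_extension_by_tiling(base, ext_w):
    """Fill a right-side background extension of width ext_w by cropping
    a same-sized region from the base background and stitching it on.

    base: 2-D list of pixel values (rows x columns) standing in for the
    repeating base background. Returns the widened background.
    """
    if ext_w > len(base[0]):
        raise ValueError("extension wider than base background")
    # Crop ext_w columns from the base and stitch them onto the extension side.
    return [row + row[:ext_w] for row in base]
```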
The technical solution of this embodiment includes: determining, based on a target image size, a base image matching that size; extracting the image elements in the base image; determining the processing mode of each image element from the size difference between the target image size and the base image size and from each element's deformation type, and processing the elements accordingly to obtain processed image elements; and stitching the processed image elements and, when a background extension region exists in the stitched image, acquiring the image edge or background edge adjacent to the background extension region and deriving a background for the extension region from its color data, thereby obtaining the target image. When the base image contains an image background, adapting the background of the stitched image fits it to the size of the display slot and improves the visual coherence during display.
The following is an embodiment of the image processing apparatus provided by the embodiments of the present application. The image processing apparatus can execute the image processing method provided by any embodiment of the present application and has the functional modules corresponding to the method. The apparatus belongs to the same inventive concept as the image processing methods of the embodiments above; for details not fully described in the apparatus embodiment, refer to the method embodiments above.
Fig. 4 is a structural diagram of an image processing apparatus provided by an embodiment of the present application. Referring to Fig. 4, the apparatus includes: a base image determination module 410, an image element extraction module 420, an image element processing module 430, and a target image generation module 440, wherein:
the base image determination module 410 is configured to determine, based on a target image size, a base image matching the target image size;
the image element extraction module 420 is configured to extract the image elements in the base image;
the image element processing module 430 is configured to determine the processing mode of each image element based on the size difference between the target image size and the base image size and on each element's deformation type, and to process the image elements based on the determined processing modes to obtain processed image elements; and
the target image generation module 440 is configured to stitch the processed image elements to obtain the target image.
On the basis of the embodiments above, optionally, the image element extraction module 420 includes:
an image element recognition result acquisition submodule, configured to input the base image into multiple element recognition models and obtain the image element recognition result output by each element recognition model, wherein the recognition result includes the positions of the image elements; and
an image element extraction submodule, configured to extract the image elements from the base image based on the image element recognition results.
On the basis of the embodiments above, optionally, the apparatus further includes a model training module for each element recognition model; the model training module for any element recognition model includes:
a training sample acquisition unit, configured to acquire a background image and element data, perform augmentation on the element data to obtain multiple augmented element data, place the augmented element data in the background image to obtain a training sample image, and record the element type of the augmented element data and its placement position in the background image; and
an element recognition model training unit, configured to iteratively train the element recognition model to be trained based on the training sample images and their corresponding element types and placement positions in the background image, to obtain a trained element recognition model.
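The sample-synthesis step performed by the training sample acquisition unit can be sketched as below. The random rescale standing in for "augmentation", the placement policy, and all names are illustrative assumptions, not the patented training procedure:

```python
import random

def synthesize_sample(bg_w, bg_h, element, n_aug=3):
    """Paste augmented copies of one element onto a background of size
    bg_w x bg_h, recording the element type and placement position.

    element: (type, w, h). Augmentation here is a simple random rescale.
    Returns annotations [(type, (x, y, w, h)), ...] usable as training labels.
    """
    annotations = []
    for _ in range(n_aug):
        s = random.uniform(0.5, 1.5)               # random rescale as augmentation
        w = min(max(1, int(element[1] * s)), bg_w)
        h = min(max(1, int(element[2] * s)), bg_h)
        x = random.randint(0, bg_w - w)            # random placement on the background
        y = random.randint(0, bg_h - h)
        annotations.append((element[0], (x, y, w, h)))
    return annotations
```

Each synthesized sample thus carries exactly the supervision the unit records: the element type and its position in the background image.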
On the basis of the embodiments above, optionally, the image element extraction submodule includes:
a layout type determination unit, configured to update the image elements to be extracted based on the image element recognition results and determine the layout type of the base image based on the positional relationships of the updated image elements to be extracted; and
an image element extraction unit, configured to extract the image elements to be extracted based on the layout type.
On the basis of the embodiments above, optionally, the layout type determination unit includes:
an image element determination subunit, configured to determine, based on the positions of the image elements, multiple image elements having a positional overlap relationship;
an image element update subunit, configured to determine, based on the priorities of the image elements, the subordination relationship among the multiple overlapping image elements, and to merge the image elements bound by a subordination relationship into one independent image element; and
a layout type determination subunit, configured to determine the layout type of the base image based on the positional relationships among the independent image elements in the base image.
On the basis of the embodiments above, optionally, the size difference between the target image size and the base image size includes a proportional scaling difference and/or an aspect ratio difference;
the deformation types of image elements include non-deformable, slightly deformable, and deformable; and
the processing mode for non-deformable image elements includes proportional scaling; the processing mode for slightly deformable image elements includes proportional scaling and stretching within a preset deformation range; and the processing mode for deformable image elements includes proportional scaling and stretching by an arbitrary ratio.
On the basis of the embodiments above, optionally, the target image generation module 440 includes:
a first target image generation unit, configured to determine at least one layout distribution of the target image, and to stitch the processed image elements based on the at least one layout distribution to obtain a target image corresponding to each layout distribution.
On the basis of the embodiments above, optionally, the target image generation module 440 includes:
a target background generation unit, configured to acquire the background of the base image and crop the background based on the target image size to obtain a target background matching the target image size; and
a second target image generation unit, configured to stitch the processed image elements onto the target background to obtain the target image.
On the basis of the embodiments above, optionally, the apparatus further includes:
a derived background generation module, configured to, when a background extension region exists in the stitched image, acquire the image edge or background edge adjacent to the background extension region, and obtain a derived background corresponding to the background extension region based on the color data of that image edge or background edge.
Fig. 5 shows a schematic structural diagram of an electronic device 10 that can be used to implement embodiments of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic devices may also represent various forms of mobile apparatuses, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches), and other similar computing apparatuses. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the application described and/or claimed herein.
As shown in Fig. 5, the electronic device 10 includes at least one processor 11 and a memory communicatively connected to the at least one processor, such as a read-only memory (ROM) 12 and a random access memory (RAM) 13, the memory storing a computer program executable by the at least one processor. The processor 11 can perform various appropriate actions and processes according to the computer program stored in the ROM 12 or loaded from a storage unit 18 into the RAM 13. The RAM 13 may also store various programs and data required for the operation of the electronic device 10. The processor 11, the ROM 12, and the RAM 13 are connected to one another via a bus 14; an input/output (I/O) interface 15 is also connected to the bus 14.
Multiple components of the electronic device 10 are connected to the I/O interface 15, including: an input unit 16, such as a keyboard or mouse; an output unit 17, such as various types of displays and speakers; the storage unit 18, such as a magnetic disk or optical disc; and a communication unit 19, such as a network card, modem, or wireless communication transceiver. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices over computer networks such as the Internet and/or various telecommunication networks.
The processor 11 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various processors running machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, or microcontroller. The processor 11 executes the methods and processes described above, such as the image processing method.
In some embodiments, the image processing method may be implemented as a computer program tangibly contained in a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, at least one step of the image processing method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to execute the image processing method in any other appropriate manner (for example, by means of firmware).
Various implementations of the systems and techniques described herein above may be realized in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in at least one computer program executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor capable of receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Computer programs for implementing the methods of the present application may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, such that the computer programs, when executed by the processor, implement the functions/operations specified in the flowcharts and/or block diagrams. A computer program may execute entirely on a machine, partly on a machine, partly on a machine and partly on a remote machine as a standalone software package, or entirely on a remote machine or server.
In the context of the present application, a computer-readable storage medium may be a tangible medium that can contain or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer-readable storage medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. Alternatively, the computer-readable storage medium may be a machine-readable signal medium. More specific examples of a machine-readable storage medium include an electrical connection based on at least one wire, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide interaction with a user, the systems and techniques described herein may be implemented on an electronic device having: a display apparatus (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and a pointing device (e.g., a mouse or trackball) through which the user can provide input to the electronic device. Other kinds of apparatuses may also be used to provide interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual, auditory, or tactile feedback), and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described herein may be implemented in a computing system that includes a back-end component (e.g., as a data server), or a middleware component (e.g., an application server), or a front-end component (e.g., a user computer with a graphical user interface or web browser through which the user can interact with implementations of the systems and techniques described herein), or any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by digital data communication of any form or medium (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), blockchain networks, and the Internet.
A computing system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network; the client-server relationship arises from computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also known as a cloud computing server or cloud host, which is a host product in a cloud computing service system that remedies the drawbacks of difficult management and weak business scalability found in traditional physical hosts and virtual private server (VPS) services.
It should be understood that steps may be reordered, added, or deleted using the various flows shown above. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order, as long as the results expected by the technical solution of the present application can be achieved; no limitation is imposed herein.
The specific implementations above do not limit the protection scope of the present application. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made according to design requirements and other factors. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present application shall fall within the protection scope of the present application.
Note that the above are only optional embodiments of the present application and the technical principles applied. Those skilled in the art will understand that the present application is not limited to the specific embodiments described herein, and that various obvious changes, readjustments, and substitutions can be made without departing from the protection scope of the present application. Therefore, although the present application has been described in some detail through the above embodiments, it is not limited to the above embodiments and may include more other equivalent embodiments without departing from the concept of the present application; the scope of the present application is determined by the scope of the appended claims.

Claims (12)

  1. An image processing method, comprising:
    determining, based on a target image size, a base image matching the target image size;
    extracting image elements in the base image;
    determining a processing mode of each image element based on a size difference between the target image size and a base image size and on a deformation type of each image element, and processing the image elements based on the determined processing modes to obtain processed image elements; and
    stitching the processed image elements to obtain a target image.
  2. The method according to claim 1, wherein extracting the image elements in the base image comprises:
    inputting the base image into multiple element recognition models, and obtaining an image element recognition result output by each element recognition model, wherein the image element recognition result comprises positions of the image elements; and
    extracting the image elements from the base image based on the image element recognition results.
  3. The method according to claim 2, wherein a training method of any of the element recognition models comprises:
    acquiring a background image and element data, performing augmentation on the element data to obtain multiple augmented element data, placing the augmented element data in the background image to obtain a training sample image, and recording an element type of the augmented element data and a placement position in the background image; and
    iteratively training the element recognition model to be trained based on the training sample image and the corresponding element type and placement position in the background image, to obtain a trained element recognition model.
  4. The method according to claim 2, wherein extracting the image elements from the base image based on the image element recognition results comprises:
    updating the image elements to be extracted based on the image element recognition results, and determining a layout type of the base image based on positional relationships of the updated image elements to be extracted; and
    extracting the image elements to be extracted based on the layout type.
  5. The method according to claim 4, wherein updating the image elements to be extracted based on the image element recognition results and determining the layout type of the base image based on the positional relationships of the updated image elements to be extracted comprises:
    determining, based on positions of the image elements, multiple image elements having a positional overlap relationship;
    determining, based on priorities of the image elements, a subordination relationship among the multiple image elements having the positional overlap relationship, and merging the image elements bound by the subordination relationship into one independent image element; and
    determining the layout type of the base image based on positional relationships among the independent image elements in the base image.
  6. The method according to claim 1, wherein the size difference between the target image size and the base image size comprises at least one of a proportional scaling difference and an aspect ratio difference;
    the deformation types of the image elements comprise non-deformable, slightly deformable, and deformable; and
    the processing mode for non-deformable image elements comprises proportional scaling; the processing mode for slightly deformable image elements comprises proportional scaling and stretching within a preset deformation range; and the processing mode for deformable image elements comprises proportional scaling and stretching by an arbitrary ratio.
  7. The method according to claim 1, wherein stitching the processed image elements to obtain the target image comprises:
    determining at least one layout distribution of the target image, and stitching the processed image elements based on the at least one layout distribution to obtain a target image corresponding to each layout distribution.
  8. The method according to claim 1, wherein stitching the processed image elements to obtain the target image comprises:
    acquiring a background of the base image, and cropping the background based on the target image size to obtain a target background matching the target image size; and
    stitching the processed image elements onto the target background to obtain the target image.
  9. The method according to claim 1, further comprising, after stitching the processed image elements:
    when a background extension region exists in the stitched image, acquiring an image edge or background edge adjacent to the background extension region, and obtaining a derived background corresponding to the background extension region based on color data of the image edge or background edge.
  10. An image processing apparatus, comprising:
    a base image determination module, configured to determine, based on a target image size, a base image matching the target image size;
    an image element extraction module, configured to extract image elements in the base image;
    an image element processing module, configured to determine a processing mode of each image element based on a size difference between the target image size and a base image size and on a deformation type of each image element, and to process the image elements based on the determined processing modes to obtain processed image elements; and
    a target image generation module, configured to stitch the processed image elements to obtain a target image.
  11. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor, wherein
    the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor to enable the at least one processor to perform the image processing method according to any one of claims 1-9.
  12. A computer-readable storage medium storing computer instructions for causing a processor to implement, when executed, the image processing method according to any one of claims 1-9.
PCT/CN2023/116675 2022-09-09 2023-09-04 Image processing method and apparatus, medium, and device WO2024051632A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211104066.6A CN115564976A (zh) 2022-09-09 2022-09-09 Image processing method and apparatus, medium, and device
CN202211104066.6 2022-09-09

Publications (1)

Publication Number Publication Date
WO2024051632A1 true WO2024051632A1 (zh) 2024-03-14

Family

ID=84741713

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/116675 WO2024051632A1 (zh) 2022-09-09 2023-09-04 图像处理方法、装置、介质及设备

Country Status (2)

Country Link
CN (1) CN115564976A (zh)
WO (1) WO2024051632A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115564976A (zh) 2022-09-09 2023-01-03 北京沃东天骏信息技术有限公司 Image processing method and apparatus, medium, and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110293180A1 (en) * 2010-05-28 2011-12-01 Microsoft Corporation Foreground and Background Image Segmentation
CN111062871A (zh) * 2019-12-17 2020-04-24 腾讯科技(深圳)有限公司 Image processing method and apparatus, computer device, and readable storage medium
CN111540033A (zh) * 2019-01-18 2020-08-14 北京京东尚科信息技术有限公司 Image production method and apparatus, browser, computer device, and storage medium
CN112164127A (zh) * 2020-09-25 2021-01-01 大方众智创意广告(珠海)有限公司 Picture generation method and apparatus, electronic device, and readable storage medium
US20210097344A1 (en) * 2019-09-27 2021-04-01 Raytheon Company Target identification in large image data
CN114677432A (zh) * 2022-03-23 2022-06-28 稿定(厦门)科技有限公司 Image processing method and apparatus, and storage medium
CN115564976A (zh) * 2022-09-09 2023-01-03 北京沃东天骏信息技术有限公司 Image processing method and apparatus, medium, and device


Also Published As

Publication number Publication date
CN115564976A (zh) 2023-01-03

Similar Documents

Publication Publication Date Title
US20210201445A1 (en) Image cropping method
CN108446698B (zh) Method, apparatus, medium, and electronic device for detecting text in an image
US20210350541A1 (en) Portrait extracting method and apparatus, and storage medium
JP7425147B2 (ja) Image processing method, text recognition method, and apparatus
WO2024051632A1 (zh) Image processing method and apparatus, medium, and device
JP6811796B2 (ja) 拡張現実アプリケーションのためのビデオにおけるリアルタイムオーバーレイ配置
US20220189189A1 (en) Method of training cycle generative networks model, and method of building character library
WO2023035531A1 (zh) Text image super-resolution reconstruction method and related device
EP4080469A2 (en) Method and apparatus of recognizing text, device, storage medium and smart dictionary pen
WO2019080702A1 (zh) Image processing method and apparatus
US20230087489A1 (en) Image processing method and apparatus, device, and storage medium
JP7418370B2 (ja) Method, apparatus, device, and storage medium for converting a hairstyle
KR20200036098A (ko) 글자 검출 장치, 방법 및 시스템
US20230005171A1 (en) Visual positioning method, related apparatus and computer program product
EP4120181A2 (en) Method and apparatus of fusing image, and method of training image fusion model
WO2023019995A1 (zh) Training method, translation display method, apparatus, electronic device, and storage medium
US20220319141A1 (en) Method for processing image, device and storage medium
WO2015074405A1 (en) Methods and devices for obtaining card information
US20230052979A1 (en) Image display method and apparatus, and medium
WO2022095318A1 (zh) Character detection method and apparatus, electronic device, storage medium, and program
US20230186599A1 (en) Image processing method and apparatus, device, medium and program product
CN114998897B (zh) Method for generating sample images and method for training a character recognition model
WO2022237460A1 (zh) Image processing method, device, storage medium, and program product
WO2023134143A1 (zh) Image sample generation method, text recognition method, apparatus, device, and medium
CN112528707A (zh) Image processing method, apparatus, device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23862326

Country of ref document: EP

Kind code of ref document: A1