CN114463216A - Image adjusting method, storage medium and electronic device - Google Patents


Info

Publication number
CN114463216A
CN114463216A (application CN202210118988.6A)
Authority
CN
China
Prior art keywords: image, target, foreground, sub, background
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210118988.6A
Other languages
Chinese (zh)
Inventor
张树业
周祥全
蔡海军
Current Assignee
Guangzhou Fanxing Huyu IT Co Ltd
Original Assignee
Guangzhou Fanxing Huyu IT Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Fanxing Huyu IT Co Ltd filed Critical Guangzhou Fanxing Huyu IT Co Ltd
Priority to CN202210118988.6A
Publication of CN114463216A
Legal status: Pending

Classifications

    • G06T 5/77 Retouching; Inpainting; Scratch removal
    • G06T 3/04 Context-preserving transformations, e.g. by using an importance map
    • G06T 5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 7/11 Region-based segmentation
    • G06T 2207/20221 Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image adjusting method, a storage medium, and an electronic device. The method comprises the following steps: acquiring a first target image to be adjusted; separating a foreground sub-image and a background sub-image from the first target image in response to an image adjustment request for the first target image; adjusting a target object in the foreground sub-image to obtain a target foreground image; repairing a missing image area in the background sub-image to obtain a target background image, wherein the missing image area is the area left vacant in the first target image after the foreground sub-image is separated out; and fusing the target foreground image and the target background image to obtain a second target image. The invention solves the technical problem of low accuracy of image adjustment.

Description

Image adjusting method, storage medium and electronic device
Technical Field
The present invention relates to the field of computers, and in particular, to an image adjusting method, a storage medium, and an electronic device.
Background
Image adjustment is increasingly widely applied; in the field of live streaming, for example, it can be used to beautify images of the anchor. In the related art, contour points in the area where the image content is located are usually identified, and the contour points are then stretched or shrunk through an image deformation technique. However, this adjustment method is poorly targeted: the background around the adjusted area is easily distorted, so the accuracy of the image adjustment is low.
In view of the above problem, no effective solution has been proposed so far.
Disclosure of Invention
The embodiments of the invention provide an image adjusting method, a storage medium, and an electronic device, which at least solve the technical problem of low accuracy of image adjustment.
According to an aspect of an embodiment of the present invention, there is provided an image adjusting method including: acquiring a first target image to be adjusted; separating a foreground sub-image and a background sub-image from the first target image in response to an image adjustment request for the first target image, wherein the image adjustment request is used for requesting adjustment of a target object in the first target image, the foreground sub-image is the foreground image of the first target image that contains the target object, and the background sub-image is the background image of the first target image that does not contain the target object; adjusting the target object in the foreground sub-image to obtain a target foreground image; repairing a missing image area in the background sub-image to obtain a target background image, wherein the missing image area is the area left vacant in the first target image after the foreground sub-image is separated out; and fusing the target foreground image and the target background image to obtain a second target image.
According to another aspect of the embodiments of the present invention, there is also provided an image adjusting apparatus including: an acquisition unit, configured to acquire a first target image to be adjusted; a separating unit, configured to separate a foreground sub-image and a background sub-image from the first target image in response to an image adjustment request for the first target image, where the image adjustment request is used to request adjustment of a target object in the first target image, the foreground sub-image is the foreground image of the first target image that contains the target object, and the background sub-image is the background image of the first target image that does not contain the target object; an adjusting unit, configured to adjust the target object in the foreground sub-image to obtain a target foreground image; a restoring unit, configured to repair a missing image area in the background sub-image to obtain a target background image, where the missing image area is the area left vacant in the first target image after the foreground sub-image is separated out; and a fusion unit, configured to fuse the target foreground image and the target background image to obtain a second target image.
As an optional solution, the adjusting unit includes: a first obtaining module, configured to obtain a target adjustment identifier carried in the image adjustment request, where the target adjustment identifier is the identifier of the object element in the target object whose adjustment is requested by the image adjustment request; a first determining module, configured to determine a candidate area image in the foreground sub-image according to the target adjustment identifier, where the candidate area image includes the target element in the target object that matches the target adjustment identifier; an adjusting module, configured to adjust the candidate area image to obtain a target area image; and a second acquisition module, configured to obtain the target foreground image from the target area image.
As an alternative, the apparatus includes: a third obtaining module, configured to, before a candidate area image is determined in the foreground sub-image according to the target adjustment identifier, obtain a plurality of image region points in the foreground sub-image and construct a plurality of image sub-regions based on the plurality of image region points, where the image region points are contour points of the target object in the foreground sub-image. The first determining module includes: a determining submodule, configured to determine a target image sub-region from the plurality of image sub-regions according to the target adjustment identifier, where the candidate area image comprises the target image sub-region.
As an optional solution, the repair unit includes: a fourth obtaining module, configured to obtain target edge information according to the foreground sub-image and the background sub-image, where the target edge information is edge information of an image area associated with the target object in the first target image; and the restoration module is used for restoring the missing image area in the background sub-image according to the target edge information to obtain the target background image.
As an optional solution, the fourth obtaining module includes: a first input submodule, configured to input the foreground mask of the foreground sub-image, the background edge information of the background sub-image, and the image content of the background sub-image into an edge generation structure to obtain the target edge information output by the edge generation structure, where the edge generation structure is a structure, obtained by training on a plurality of image samples, for estimating the edge information of the image area occluded by the foreground image.
As an optional solution, the repair module includes: and a second input sub-module, configured to input the target edge information and the image content of the background sub-image into an entire image generation structure, so as to obtain the target background image output by the entire image generation structure, where the entire image generation structure is a structure obtained by training a plurality of image samples and used for estimating image content in an image area blocked by a foreground image.
As an alternative, the separation unit includes: the input module is used for inputting the first target image into a semantic segmentation model, wherein the semantic segmentation model is a structure which is obtained by training a plurality of image samples and is used for segmenting a foreground image and a background image; a fifth obtaining module, configured to obtain a plurality of mask images of the target object output by the semantic segmentation model, where each of the plurality of mask images is associated with a corresponding foreground value, and the foreground value is used to indicate a probability that the mask image is a foreground image of the target object; a second determining module, configured to determine a target mask image from the multiple mask images according to the foreground value, where the foreground value of the target mask image is a maximum value of multiple foreground values corresponding to the multiple mask images; and a third determining module, configured to determine the target mask image as the foreground sub-image, and determine an image other than the target mask image in the first target image as the background sub-image.
According to yet another aspect of embodiments herein, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the image adjustment method as above.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the image adjusting method through the computer program.
In the embodiment of the invention, a first target image to be adjusted is obtained; a foreground sub-image and a background sub-image are separated from the first target image in response to an image adjustment request for the first target image, wherein the image adjustment request is used for requesting adjustment of a target object in the first target image, the foreground sub-image is the foreground image of the first target image that contains the target object, and the background sub-image is the background image of the first target image that does not contain the target object; the target object in the foreground sub-image is adjusted to obtain a target foreground image; a missing image area in the background sub-image is repaired to obtain a target background image, wherein the missing image area is the area left vacant in the first target image after the foreground sub-image is separated out; and the target foreground image and the target background image are fused to obtain a second target image. By first segmenting the image into a foreground image and a background image and then adjusting the foreground image in a targeted manner, the adjustment becomes more specific and the accuracy of image adjustment is improved, thereby solving the technical problem of low image-adjustment accuracy.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a schematic diagram of an application environment of an alternative image adjustment method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a flow of an alternative image adjustment method according to an embodiment of the invention;
FIG. 3 is a diagram illustrating an alternative image adjustment method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an alternative image adjustment method according to an embodiment of the invention;
FIG. 5 is a schematic diagram of an alternative image adjustment method according to an embodiment of the invention;
FIG. 6 is a schematic diagram of an alternative image adjustment method according to an embodiment of the invention;
FIG. 7 is a schematic diagram of an alternative image adjustment method according to an embodiment of the invention;
FIG. 8 is a schematic diagram of an alternative image adjustment apparatus according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an alternative electronic device according to an embodiment of the invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an aspect of the embodiments of the present invention, an image adjusting method is provided, and optionally, as an optional implementation manner, the image adjusting method may be applied to, but is not limited to, an environment as shown in fig. 1. The system may include, but is not limited to, a user equipment 102, a network 110, and a server 112, wherein the user equipment 102 may include, but is not limited to, a display 108, a processor 106, and a memory 104.
The specific process comprises the following steps:
step S102, the user equipment 102 obtains an image adjustment request triggered by the first target image 1022;
steps S104-S106, the user device 102 sends an image adjustment request to the server 112 via the network 110;
step S108, the server 112 divides the first target image 1022 into a foreground sub-image and a background sub-image through the processing engine 116, adjusts the foreground sub-image, and generates a second target image 1024 based on the adjusted foreground sub-image and background sub-image;
steps S110-S112, the server 112 sends the second target image 1024 to the user device 102 via the network 110, and the processor 106 in the user device 102 displays the second target image 1024 on the display 108 and stores the second target image 1024 in the memory 104.
In addition to the example shown in fig. 1, the above steps may be performed by the user device 102 independently, that is, the user device 102 performs the steps of obtaining the foreground sub-image and the background sub-image, adjusting the foreground sub-image, or generating the second target image 1024, so as to relieve the processing pressure of the server. The user equipment 102 includes, but is not limited to, a handheld device (e.g., a mobile phone), a notebook computer, a desktop computer, a vehicle-mounted device, and the like, and the specific implementation manner of the user equipment 102 is not limited in the present invention.
Optionally, as an optional implementation manner, as shown in fig. 2, the image adjusting method includes:
s202, acquiring a first target image to be adjusted;
s204, in response to an image adjustment request for the first target image, separating a foreground sub-image and a background sub-image from the first target image, wherein the image adjustment request is used for requesting to adjust a target object in the first target image, the foreground sub-image is a foreground image containing the target object in the first target image, and the background sub-image is a background image not containing the target object in the first target image;
s206, adjusting the target object in the foreground sub-image to obtain a target foreground image;
s208, repairing a missing image area in the background sub-image to obtain a target background image, wherein the missing image content is the missing image area in the first target image after the foreground sub-image is separated;
and S210, fusing the target foreground image and the target background image to obtain a second target image.
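As a minimal sketch (not the patent's concrete implementation), steps S202 to S210 can be read as a four-stage pipeline; the stage functions below are placeholders supplied by the caller, and their names are illustrative:

```python
import numpy as np

def adjust_image(first_target, separate, adjust_foreground, repair_background, fuse):
    """Sketch of steps S202-S210; each stage is an injected callable."""
    foreground, background, mask = separate(first_target)   # S204: split fg/bg
    target_fg = adjust_foreground(foreground)               # S206: adjust target object
    target_bg = repair_background(background, mask)         # S208: repair missing area
    return fuse(target_fg, target_bg, mask)                 # S210: fuse into result
```

With trivial stand-ins for each stage the data flow is easy to verify before plugging in real segmentation, deformation, inpainting, and fusion components.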
Alternatively, in the present embodiment, the image adjusting method may be applied, but is not limited, to portrait reshaping scenes, where portrait reshaping refers to a technique for beautifying and modifying the shape (including the face, limbs, etc.) of a person in an image or video, for example, making the person's legs slimmer or the face more oval. Different from the image adjustment methods in the related art, this embodiment first determines the foreground image where the person is located in the image or video and then processes that foreground image in a targeted manner, thereby reducing the influence of the adjustment on image regions outside the area where the person is located.
Optionally, in this embodiment, the first target image may be segmented by an image segmentation technique to obtain the foreground sub-image and the background sub-image, where the foreground image may be, but is not limited to being, understood as the image of most interest to the user in the first target image, and the background image as everything in the first target image other than the foreground image; for example, if the first target image is an image of a target virtual character performing on a virtual stage, the image corresponding to the target virtual character is the foreground image, and the image corresponding to the virtual stage is the background image.
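Given a binary object mask (produced in the patent by a semantic segmentation model; here simply assumed as input), the split can be sketched with NumPy:

```python
import numpy as np

def separate(image, mask):
    """Split an HxWx3 image into a foreground sub-image and a background
    sub-image using a binary mask (1 = target object). The background keeps
    a zero-filled hole where the foreground was; that hole is the missing
    image area repaired later."""
    m = mask[..., None].astype(image.dtype)
    foreground = image * m
    background = image * (1 - m)
    return foreground, background
```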
Optionally, in this embodiment, the target object in the foreground sub-image may be adjusted by, but not limited to, adjusting its image pixel points either individually or in batches. The individual adjustment method has low efficiency but high accuracy; the batch adjustment method is efficient, but its accuracy is lower than that of individual adjustment. Specifically, the batch method may, but is not limited to, divide the target object into at least two regions according to elements such as structure and position, and then adjust every image pixel point within each region during the adjustment.
Optionally, in this embodiment, background restoration may be, but is not limited to, a way of processing the background sub-image that restores the part of the first target image that was covered by the foreground image. The background sub-image is restored to obtain a target background image, and the target foreground image and the target background image are then combined to obtain the second target image, improving the accuracy of the image adjustment.
Optionally, in this embodiment, obtaining the second target image from the target foreground image and the background sub-image may be, but is not limited to, fusing the adjusted foreground sub-image (the target foreground image) with the repaired background. The fusion method may be, but is not limited to, Poisson fusion, which makes the fused image transition realistically and naturally at the edges.
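True Poisson fusion solves a Poisson equation over the pasted region (e.g. OpenCV's `cv2.seamlessClone`); as a much simpler stand-in under that caveat, a feathered alpha blend also smooths the transition at the foreground edge:

```python
import numpy as np

def feather_blend(foreground, background, mask, radius=1):
    """Soften the binary mask with a small box blur, then alpha-blend.
    This only approximates the smooth edge transition that Poisson fusion
    produces; it is not gradient-domain blending."""
    alpha = mask.astype(float)
    for _ in range(radius):
        padded = np.pad(alpha, 1, mode="edge")
        # 3x3 box filter: average each pixel with its 8 neighbours
        alpha = sum(padded[i:i + alpha.shape[0], j:j + alpha.shape[1]]
                    for i in range(3) for j in range(3)) / 9.0
    a = alpha[..., None]
    return a * foreground + (1 - a) * background
```

Inside the object and deep in the background the composite equals the respective source image; only a thin band around the mask boundary is mixed.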
It should be noted that the method proceeds as follows: a first target image to be adjusted is acquired; a foreground sub-image and a background sub-image are separated from the first target image in response to an image adjustment request for the first target image, wherein the image adjustment request is used for requesting adjustment of a target object in the first target image, the foreground sub-image is the foreground image of the first target image that contains the target object, and the background sub-image is the background image of the first target image that does not contain the target object; the target object in the foreground sub-image is adjusted to obtain a target foreground image; a missing image area in the background sub-image is repaired to obtain a target background image, wherein the missing image area is the area left vacant in the first target image after the foreground sub-image is separated out; and the target foreground image and the target background image are fused to obtain a second target image.
To further illustrate, optionally, for example, as shown in fig. 3, a first target image 302 to be adjusted is acquired; in response to an image adjustment request for the first target image 302, obtaining a foreground sub-image 304 and a background sub-image 306 of the first target image 302, where the image adjustment request is used to request adjustment of a target object 308 in the first target image 302, the foreground sub-image 304 is a foreground image of the first target image 302 that includes the target object 308, and the background sub-image 306 is a background image of the first target image 302 that does not include the target object 308; adjusting the target object 308 in the foreground sub-image 304 to obtain a target foreground image 310; a second target image 312 is derived from the target foreground image 310 and the background sub-image 306.
According to the embodiment provided by the application, a first target image to be adjusted is obtained; a foreground sub-image and a background sub-image are separated from the first target image in response to an image adjustment request for the first target image, wherein the image adjustment request is used for requesting adjustment of a target object in the first target image, the foreground sub-image is the foreground image of the first target image that contains the target object, and the background sub-image is the background image of the first target image that does not contain the target object; the target object in the foreground sub-image is adjusted to obtain a target foreground image; a missing image area in the background sub-image is repaired to obtain a target background image, wherein the missing image area is the area left vacant in the first target image after the foreground sub-image is separated out; and the target foreground image and the target background image are fused to obtain a second target image. By first segmenting the image into a foreground image and a background image and then adjusting only the foreground image, the adjustment is made more targeted and the accuracy of the image adjustment is improved.
As an optional scheme, adjusting a target object in a foreground sub-image to obtain a target foreground image includes:
s1, acquiring a target adjustment identifier carried in the image adjustment request, wherein the target adjustment identifier is an identifier of an object element which is requested to be adjusted in response to the image adjustment request in the target object;
s2, determining a candidate area image in the foreground sub-image according to the target adjustment identifier, wherein the candidate area image contains target elements matched with the target adjustment identifier in the target object;
s3, adjusting the candidate area image to obtain a target area image;
and S4, obtaining a target foreground image according to the target area image.
Optionally, in this embodiment, the object element may be, but is not limited to, one of the elements included in the target object. For example, the target object may be the face shown as target object 402 in fig. 4, which contains a plurality of object elements, such as a first object element 404 (mouth) and a second object element 406 (eyes);
the target adjustment identifier carried in the image adjustment request is then acquired, where the target adjustment identifier is the identifier of the first object element 404 in the target object 402 whose adjustment is requested; a candidate area image is determined in the foreground sub-image according to the target adjustment identifier, where the candidate area image includes the first object element 404 that matches the target adjustment identifier; the candidate area image is adjusted to obtain a target area image; and the target foreground image is obtained from the target area image.
According to the embodiment provided by the application, the target adjustment identifier carried in the image adjustment request is obtained, wherein the target adjustment identifier is the identifier of the object element in the target object whose adjustment is requested by the image adjustment request; a candidate area image is determined in the foreground sub-image according to the target adjustment identifier, wherein the candidate area image contains the target element in the target object that matches the target adjustment identifier; the candidate area image is adjusted to obtain a target area image; and the target foreground image is obtained from the target area image. In this way the identifier is used to adjust the image efficiently, improving the efficiency of the image adjustment.
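The identifier lookup can be sketched as a table from adjustment identifiers to element regions; the identifier names and box coordinates below are purely illustrative assumptions, not values from the patent:

```python
# Hypothetical mapping from target adjustment identifiers to the regions
# ((top, left, bottom, right) boxes) of the matching object elements.
ELEMENT_REGIONS = {
    "mouth": [(120, 180, 200, 240)],
    "eyes":  [(80, 140, 100, 170), (80, 200, 100, 230)],
}

def candidate_regions(adjust_id):
    """Resolve the identifier carried in the image adjustment request to the
    sub-regions of the foreground sub-image containing the matching element."""
    return ELEMENT_REGIONS.get(adjust_id, [])
```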
As an optional solution, before determining the candidate area image in the foreground sub-image according to the target adjustment identifier, the method includes: acquiring a plurality of image area points in the foreground sub-image, wherein the image area points are contour points of a target object in the foreground sub-image; constructing a plurality of image sub-regions based on the plurality of image region points;
as an optional solution, determining a candidate area image in the foreground sub-image according to the target adjustment identifier includes: and determining a target image sub-region from the plurality of image sub-regions according to the target adjustment identifier, wherein the candidate region image comprises the target image sub-region.
Optionally, in this embodiment, the target object in the foreground sub-image is adjusted to obtain the target foreground image by way of, but not limited to, foreground deformation. The foreground deformation may consist of, but is not limited to, three stages: identifying the key points (image area points) of the foreground area, constructing triangular patches (image sub-areas), and stretching or shrinking the patches to deform the image.
For further example, if the target object is a human face object, existing face or body keypoint techniques can identify the positions of the contour points of the face or body. Because the geometric shapes of the human face and body are relatively complex, a plurality of triangular patches can be constructed and then refined separately in order to adjust different areas accurately. Based on the identified contour point locations, Delaunay triangulation can be used to divide the foreground region into a plurality of smaller triangular patches. The triangular patches do not overlap pairwise and together cover the foreground area. Finally, a specific area is adjusted as required: for example, if leg contour points are identified and the corresponding triangular patches are found, deformation operations such as affine transformations can be performed on those patches.
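Warping one Delaunay patch reduces to solving for the 2x3 affine matrix that maps its three vertices to their adjusted positions (in practice OpenCV's `cv2.getAffineTransform` and `cv2.warpAffine` do this per patch); the underlying solve can be sketched as:

```python
import numpy as np

def triangle_affine(src_tri, dst_tri):
    """Solve the 2x3 affine matrix A with A @ [x, y, 1]^T = [x', y']^T that
    maps the three source triangle vertices onto the destination vertices."""
    src = np.hstack([np.asarray(src_tri, float), np.ones((3, 1))])  # 3x3
    dst = np.asarray(dst_tri, float)                                # 3x2
    # One three-unknown linear system per output coordinate
    return np.linalg.solve(src, dst).T                              # 2x3

def apply_affine(A, points):
    """Apply the affine matrix to an iterable of (x, y) points."""
    pts = np.hstack([np.asarray(points, float), np.ones((len(points), 1))])
    return pts @ A.T
```

Every pixel inside the patch moves with the same matrix, so stretching or shrinking a leg contour amounts to moving the destination vertices and re-solving.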
According to the embodiment provided by the application, a plurality of image area points in the foreground sub-image are obtained, wherein the image area points are the contour points of the target object in the foreground sub-image; a plurality of image sub-regions are constructed based on the plurality of image region points; and a target image sub-region is determined from the plurality of image sub-regions according to the target adjustment identifier, wherein the candidate region image comprises the target image sub-region. This improves the accuracy of the image adjustment.
As an optional scheme, repairing the missing image area in the background sub-image to obtain the target background image includes:
s1, obtaining target edge information according to the foreground sub-image and the background sub-image, wherein the target edge information is edge information of an image area associated with the target object in the first target image;
and S2, repairing the missing image area in the background sub-image according to the target edge information to obtain a target background image.
Optionally, in this embodiment, it is observed that a large number of images exhibit strong semantic correlation, so the target edge information may be obtained, but is not limited to being obtained, by exploiting this strong semantic correlation between the foreground sub-image and the background sub-image. Strong semantic correlation means that information such as the edges, structure, texture, and color of the target area is highly correlated with the edges, structure, texture, and color of the surrounding area. For example, if the target is near the edge of one part of a table, there is a high probability that another part of the table lies behind the target, and the table edges at different locations are likely to be similar.
As an optional scheme, obtaining the target edge information according to the foreground sub-image and the background sub-image includes:
inputting a foreground mask of the foreground sub-image, background edge information of the background sub-image, and image content of the background sub-image into an edge generating structure to obtain the target edge information output by the edge generating structure, wherein the edge generating structure is a structure that is obtained by training on a plurality of image samples and is used for estimating the edge information of the image area occluded by the foreground image.
Alternatively, in this embodiment, a model may be, but is not limited to being, designed to learn the edge information behind the foreground image.
Optionally, in this embodiment, the foreground mask may be obtained through, but is not limited to, a semantic segmentation model; the background edge information may be, but is not limited to being, extracted from the background by an edge operator (such as Canny edge detection); and the image content may be, but is not limited to, the background content obtained by separating the foreground and the background.
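As a hedged illustration of the background-edge input, the sketch below uses a plain Sobel gradient magnitude as a simple stand-in for a full Canny detector; the two-tone image is a synthetic, hypothetical background.

```python
import numpy as np

def sobel_edges(gray, thresh=0.25):
    """Minimal gradient-magnitude edge map (a stand-in for full Canny)."""
    gray = gray.astype(float)
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])  # horizontal Sobel kernel
    ky = kx.T                                            # vertical Sobel kernel
    pad = np.pad(gray, 1, mode="edge")
    h, w = gray.shape
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    for i in range(h):
        for j in range(w):
            win = pad[i:i + 3, j:j + 3]
            gx[i, j] = (win * kx).sum()
            gy[i, j] = (win * ky).sum()
    mag = np.hypot(gx, gy)
    return (mag / (mag.max() + 1e-8)) > thresh  # boolean edge map

# Synthetic background: dark left half, bright right half -> one vertical edge.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = sobel_edges(img)
```

The resulting boolean map marks only the columns straddling the intensity step, which is the kind of sparse edge channel the edge generating structure would consume.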
Alternatively, in the present embodiment, the edge generating structure can be, but is not limited to being, designed as two parts: an encoder and a decoder. First, the foreground mask, the background edge information, and the image content are input into the encoder for encoding; the encoded features are then input into the decoder and decoded into the repaired edge information. The encoder is composed of a plurality of convolutions, downsampling operators, residual blocks, and the like; the decoder is composed of a plurality of convolutions, upsampling operators, residual blocks, and the like.
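The encoder/decoder layout described above might be sketched in PyTorch roughly as follows. The channel counts, layer depths, and the 5-channel input (1-channel mask + 1-channel edges + 3-channel background) are illustrative assumptions, not the patent's actual configuration.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Simple residual block: x + conv(relu(conv(x)))."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )
    def forward(self, x):
        return x + self.body(x)

class EdgeGenerator(nn.Module):
    """Hypothetical minimal encoder/decoder: conv + downsampling encoder,
    residual blocks, and conv + upsampling decoder, per the description above."""
    def __init__(self, in_ch=5, out_ch=1, base=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, base, 3, padding=1), nn.ReLU(),
            nn.Conv2d(base, base * 2, 3, stride=2, padding=1), nn.ReLU(),  # downsample
            ResBlock(base * 2),
        )
        self.decoder = nn.Sequential(
            ResBlock(base * 2),
            nn.Upsample(scale_factor=2, mode="nearest"),                   # upsample
            nn.Conv2d(base * 2, base, 3, padding=1), nn.ReLU(),
            nn.Conv2d(base, out_ch, 3, padding=1), nn.Sigmoid(),           # edge map in [0, 1]
        )
    def forward(self, x):
        return self.decoder(self.encoder(x))

# 1-channel foreground mask + 1-channel background edges + 3-channel background image.
gen = EdgeGenerator(in_ch=5, out_ch=1)
x = torch.randn(1, 5, 64, 64)
with torch.no_grad():
    pred_edges = gen(x)
```

The whole image generation structure described later has the same encoder/decoder shape; only the input channels (edges + background image) and the output (RGB content rather than an edge map) differ.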
It should be noted that the foreground mask of the foreground sub-image, the background edge information of the background sub-image, and the image content of the background sub-image are input into the edge generating structure to obtain the target edge information output by the edge generating structure, where the edge generating structure is a structure obtained by training on a plurality of image samples and used for estimating the edge information of the image area occluded by the foreground image.
By way of further example, suppose the area near an object's edge is part of a table; then there is a high probability that another part of the table lies behind the object, and the table edges at different locations are likely to be similar, so a model can be, but is not limited to being, designed to learn the edges behind the object. Optionally, as shown in fig. 5, the inputs of the edge generating structure 508 are the foreground mask 502 of the foreground sub-image, the background edge information 504 of the background sub-image, and the image content 506 of the background sub-image; the output of the edge generating structure 508 is the target edge information 510. In other words, the edge generating structure 508 is used to estimate the edge information of the region behind the foreground.
The parameters of the edge generator are solved by adversarial generative learning. Specifically, a discriminator is added during training to judge whether a generated sample comes from the true distribution or the generated distribution. The training process alternately updates the edge generator and the discriminator until a given condition is met. Eventually, the generator becomes good enough that the discriminator cannot distinguish real samples from generated samples.
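A toy version of the alternating generator/discriminator updates might look like the following PyTorch sketch. The small 1-D networks and the shifted-Gaussian "true distribution" are hypothetical stand-ins for the edge generator and its discriminator, kept tiny so the alternation itself is the focus.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Hypothetical stand-ins: a generator mapping noise to 2-D samples and a
# discriminator scoring samples as real (1) or generated (0).
gen = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
disc = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(50):
    real = torch.randn(16, 2) + 3.0          # samples from the "true distribution"
    fake = gen(torch.randn(16, 4))           # samples from the generated distribution

    # Discriminator step: push real toward 1, generated toward 0.
    opt_d.zero_grad()
    d_loss = (bce(disc(real), torch.ones(16, 1))
              + bce(disc(fake.detach()), torch.zeros(16, 1)))
    d_loss.backward()
    opt_d.step()

    # Generator step: fool the discriminator into predicting 1.
    opt_g.zero_grad()
    g_loss = bce(disc(gen(torch.randn(16, 4))), torch.ones(16, 1))
    g_loss.backward()
    opt_g.step()
```

The stopping condition in the text ("until a certain condition is met") is abstracted here as a fixed step count.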
As an optional scheme, repairing the missing image area in the background sub-image according to the target edge information to obtain the target background image includes:
and inputting the target edge information and the image content of the background sub-image into a whole image generation structure to obtain the target background image output by the whole image generation structure, wherein the whole image generation structure is a structure that is obtained by training on a plurality of image samples and is used for estimating the image content in the image area occluded by the foreground image.
Optionally, in this embodiment, the whole image generation structure may be, but is not limited to being, composed of an encoder and a decoder. First, the target edge information and the image content of the background sub-image are input into the encoder for encoding; the encoded features are then input into the decoder and decoded into the repaired image content. The encoder is composed of a plurality of convolutions, downsampling operators, residual blocks, and the like; the decoder is composed of a plurality of convolutions, upsampling operators, residual blocks, and the like.
Optionally, in this embodiment, when the whole image generation structure is in the training stage, a variety of objective functions such as an adversarial loss and a perceptual loss may be constructed. The adversarial loss is similar to that of the edge generator. The perceptual loss constrains the distance between the generated features and the real features to be as small as possible, where the features of the generated samples and the real samples are extracted by a large pre-trained model. The training process updates the parameters of the whole image generation structure so that the objective function satisfies the established conditions.
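The perceptual-loss term could be sketched as below. A small, randomly initialized, frozen conv net stands in for the large pre-trained feature extractor (commonly a VGG in practice); that substitution is an assumption for illustration, not the patent's stated model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Frozen stand-in for a pre-trained feature extractor: only the generator
# would be updated by gradients flowing through this loss.
extractor = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
)
for p in extractor.parameters():
    p.requires_grad_(False)

def perceptual_loss(generated, real):
    """Mean squared distance between feature maps of generated and real images."""
    return torch.mean((extractor(generated) - extractor(real)) ** 2)

real = torch.rand(1, 3, 32, 32)
loss_same = perceptual_loss(real, real)                     # identical inputs -> 0
loss_diff = perceptual_loss(torch.rand(1, 3, 32, 32), real)  # different inputs -> > 0
```

During training this term would be summed with the adversarial loss before the backward pass on the generator's parameters.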
It should be noted that the target edge information and the image content of the background sub-image are input into the whole image generation structure to obtain the target background image output by the whole image generation structure, where the whole image generation structure is a structure obtained by training on a plurality of image samples and used for estimating the image content in the image area occluded by the foreground image.
To further illustrate, optionally based on the scenario shown in fig. 5, and continuing as shown in fig. 6, the input of the whole graph generating structure 602 includes the output target edge information 510 of the edge generating structure 508 and the image content 506, so as to obtain the target background image 604 output by the whole graph generating structure 602.
As an alternative, the separating the foreground sub-image and the background sub-image from the first target image includes:
s1, inputting the first target image into a semantic segmentation model, wherein the semantic segmentation model is a structure obtained by training a plurality of image samples and used for segmenting a foreground image and a background image;
s2, acquiring a plurality of mask images of the target object output by the semantic segmentation model, wherein each mask image in the plurality of mask images is associated with a corresponding foreground numerical value, and the foreground numerical value is used for representing the probability that the mask image is the foreground image of the target object;
s3, determining a target mask image from the plurality of mask images according to the foreground value, wherein the foreground value of the target mask image is the maximum value of the plurality of foreground values corresponding to the plurality of mask images;
s4, determining the target mask image as a foreground sub-image, and determining the images other than the target mask image in the first target image as background sub-images.
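Steps S3 and S4 above amount to an argmax over the per-mask foreground values; a minimal sketch with hypothetical masks and scores follows.

```python
import numpy as np

# Hypothetical candidate masks output by the segmentation model, each paired
# with a foreground value (probability of being the target's foreground).
masks = [np.zeros((4, 4)), np.ones((4, 4)), np.full((4, 4), 0.5)]
foreground_values = [0.2, 0.9, 0.6]

# S3: pick the mask with the maximum foreground value; S4: treat it as the
# foreground sub-image (the rest of the first target image is background).
target_idx = int(np.argmax(foreground_values))
target_mask = masks[target_idx]
```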
Optionally, in this embodiment, a semantic segmentation model is used to separate the foreground image and the background image. When the semantic segmentation model is used for inference, its input may be, but is not limited to, a single-frame image, and its output may be, but is not limited to, a mask image of the target, where the higher the value in the mask image, the higher the probability of being foreground.
Furthermore, in this embodiment, the parameters of the semantic segmentation model may be, but are not limited to being, solved by a stochastic gradient descent optimization method. Specifically, an error function of the model is defined, which may take the form of cross entropy. Cross entropy measures the similarity between the true distribution and the predicted distribution: the greater the similarity, the smaller the error value, and the smaller the similarity, the larger the error value. The optimization objective is to reduce the error value by updating the parameters of the model. The whole optimization process is divided into a forward part and a backward part: the forward pass computes the error value, and the backward pass computes the gradient of the error with respect to the parameters, which reflects the direction in which the error decreases fastest. Moving the original parameters a number of step lengths along this descent direction yields new parameters. The forward and backward steps are executed alternately a plurality of times until a predetermined condition is satisfied.
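The forward/backward alternation described above can be illustrated on a toy convex problem. The sketch below runs gradient descent on a binary cross-entropy objective; the 2-D logistic model is a hypothetical stand-in for the full segmentation network.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # linearly separable toy labels
w = np.zeros(2)
b = 0.0
lr = 0.5

def forward(w, b):
    """Forward pass: predicted distribution and cross-entropy error value."""
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    eps = 1e-12
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    return p, loss

_, loss_before = forward(w, b)
for _ in range(200):
    p, _ = forward(w, b)                 # forward: compute the error value
    grad_w = X.T @ (p - y) / len(y)      # backward: gradient of error w.r.t. w
    grad_b = np.mean(p - y)              # backward: gradient of error w.r.t. b
    w -= lr * grad_w                     # step along the descent direction
    b -= lr * grad_b
_, loss_after = forward(w, b)
```

With zero-initialized parameters the initial cross entropy is exactly ln 2, and the alternating forward/backward steps drive it down, mirroring the optimization loop described in the text.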
As an optional scheme, to facilitate understanding, the image adjustment method is further illustrated as applied to a portrait reshaping scene, for example, as shown in fig. 7; the specific steps are as follows:
step S702, separating a foreground image and a background image;
optionally, in this embodiment, a semantic segmentation model is used to separate the foreground from the background. When the model is used for inference, the input is a single-frame image and the output is a target mask image, where the higher the value in the mask image, the higher the probability of being foreground.
The parameters of the semantic segmentation model are solved by a stochastic gradient descent optimization method. Specifically, an error function of the model is defined, which may take the form of cross entropy. Cross entropy measures the similarity between the true distribution and the predicted distribution: the greater the similarity, the smaller the error value, and the smaller the similarity, the larger the error value. The optimization objective is to reduce the error value by updating the parameters of the model. The whole optimization process is divided into a forward part and a backward part: the forward pass computes the error value, and the backward pass computes the gradient of the error with respect to the parameters, which reflects the direction in which the error decreases fastest. Moving the original parameters a number of step lengths along this descent direction yields new parameters. The forward and backward steps are executed alternately a plurality of times until a predetermined condition is satisfied.
Step S704, deforming the foreground image;
optionally, in this embodiment, the foreground deformation mainly includes three parts: identifying key points of the foreground area, constructing triangular patches, and applying stretching and shrinking deformation to the patches.
Existing face or human-body key point techniques can identify the positions of the contour points of the face or body. Because the geometric shapes of the human face and body are relatively complex, a plurality of triangular patches can be constructed and then refined separately so that different areas can be adjusted accurately. Based on the identified contour point positions, Delaunay triangulation may be used to divide the foreground region into a plurality of smaller triangular patches. These triangular patches do not overlap one another and jointly cover the foreground area. Finally, specific areas are adjusted as required. For example, if leg contour points are identified and the corresponding triangular patches are located, deformation operations such as affine transformation can be applied to those patches.
Step S706, repairing the background image;
optionally, in this embodiment, the background repair is composed of two parts: an edge generator and a whole image generator. The foreground mask, the background edges, and the background image are taken as the input of the edge generator, whose output is the whole-image edges. The whole-image edges are then used as a condition to guide the whole image generator to generate the whole image.
A large number of images are observed to have strong semantic correlation, which means that information such as the edges, structure, texture, and color of the target area is highly correlated with the edges, structure, texture, and color of the surrounding area. For example, if the target is near the edge of one part of a table, there is a high probability that another part of the table lies behind the target; and the table edges at different parts are likely to be similar, so a model can be designed to learn the edges behind the target.
The inputs to the edge generator are the foreground mask, the background edges, and the background image. The foreground mask is obtained by the semantic segmentation model (see subsection 2.1); the background edges are extracted by an edge operator (such as Canny edge detection); and the background image is the image obtained by separating the foreground from the background. The output is the whole-image edges; in other words, the generator estimates the edge information of the region behind the foreground. Typically the edge generator is designed as two parts, an encoder and a decoder. First, the foreground mask, background edges, and background image are input into the encoder for encoding; the encoded features are then input into the decoder and decoded into the repaired edges. The encoder is composed of a plurality of convolutions, downsampling operators, residual blocks, and the like; the decoder is composed of a plurality of convolutions, upsampling operators, residual blocks, and the like.
The parameters of the edge generator are solved by adversarial generative learning. Specifically, a discriminator is added during training to judge whether a generated sample comes from the true distribution or the generated distribution. The training process alternately updates the edge generator and the discriminator until a given condition is met. Eventually, the generator becomes good enough that the discriminator cannot distinguish real samples from generated samples.
The input of the whole image generator comprises the whole-image edges and the background image. The whole-image edges are provided by the edge generator; the background image is the background obtained by separating the foreground from the background. The output is the whole image, i.e., an estimate of the content occluded by the foreground. Similar to the edge generator, the whole image generator is also composed of an encoder and a decoder. First, the whole-image edges and the background image are input into the encoder for encoding; the encoded features are then input into the decoder and decoded into the repaired image. The encoder is composed of a plurality of convolutions, downsampling operators, residual blocks, and the like; the decoder is composed of a plurality of convolutions, upsampling operators, residual blocks, and the like.
In the training stage, a variety of objective functions such as an adversarial loss and a perceptual loss can be constructed. The adversarial loss is similar to that of the edge generator. The perceptual loss constrains the distance between the generated features and the real features to be as small as possible, where the features of the generated samples and the real samples are extracted by a large pre-trained model. The training process updates the parameters of the whole image generator so that the objective function satisfies the established conditions.
Step 708, fusing the foreground image and the background image;
optionally, in this embodiment, the deformed foreground and the restored background are fused; fusion methods include Poisson fusion and the like. Poisson fusion is adopted so that the fused image transitions vividly and naturally at the edges.
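A minimal gradient-domain blend in the spirit of Poisson fusion is sketched below on grayscale arrays. Jacobi iteration stands in for the sparse solver (or a library routine such as OpenCV's seamless cloning) that a production implementation would likely use, and the images are synthetic.

```python
import numpy as np

def poisson_blend(background, foreground, mask, iters=500):
    """Solve Laplace(out) = Laplace(foreground) inside the mask, with the
    background held fixed outside it, via plain Jacobi iteration."""
    out = background.astype(float).copy()
    src = foreground.astype(float)
    inside = mask.astype(bool)
    lap = np.zeros_like(src)
    lap[1:-1, 1:-1] = (4 * src[1:-1, 1:-1] - src[:-2, 1:-1] - src[2:, 1:-1]
                       - src[1:-1, :-2] - src[1:-1, 2:])  # discrete Laplacian of source
    for _ in range(iters):
        nbr = (np.roll(out, 1, 0) + np.roll(out, -1, 0)
               + np.roll(out, 1, 1) + np.roll(out, -1, 1))
        out[inside] = (nbr[inside] + lap[inside]) / 4.0   # Jacobi update inside mask
    return out

# Synthetic case: black background, a foreground with one bright pixel, and a
# 4x4 blend region; the foreground's gradients are reproduced inside the mask
# while the boundary stays continuous with the background.
bg = np.zeros((16, 16))
fg = np.zeros((16, 16))
fg[8, 8] = 10.0
mask = np.zeros((16, 16), dtype=bool)
mask[6:10, 6:10] = True
blended = poisson_blend(bg, fg, mask)
```

Because only gradients are copied, a constant offset in the pasted foreground is absorbed at the seam, which is what makes the transition at the edges look natural.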
According to this embodiment, the foreground image and the background image are fused; the body-beautification restoration model encodes the background image and the foreground position, and then decodes the encoded features into a restored image; and image spatial semantic constraints are used to guide the body-beautification model to learn strong correlations of structure, texture, and color.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
According to another aspect of the embodiment of the present invention, there is also provided an image adjusting apparatus for implementing the image adjusting method. As shown in fig. 8, the apparatus includes:
an obtaining unit 802, configured to obtain a first target image to be adjusted;
a separating unit 804, configured to separate a foreground sub-image and a background sub-image from a first target image in response to an image adjustment request for the first target image, where the image adjustment request is used to request to adjust a target object in the first target image, the foreground sub-image is a foreground image in the first target image that includes the target object, and the background sub-image is a background image in the first target image that does not include the target object;
an adjusting unit 806, configured to adjust a target object in the foreground sub-image to obtain a target foreground image;
the adjusting unit 808 is configured to repair the missing image area in the background sub-image to obtain a target background image, where the missing image content is the image area missing from the first target image after the foreground sub-image is separated;
and a fusion unit 810, configured to fuse the target foreground image and the background sub-image to obtain a second target image.
Alternatively, in the present embodiment, the image adjusting method may be applied to, but is not limited to, a portrait reshaping scene, where portrait reshaping refers to a technology for beautifying and modifying the shape of a person (including the face, limbs, and the like) in an image or video, for example, making the person's legs slimmer or making the face shape approach an oval ("melon-seed") face. Unlike the image adjustment methods in the related art, in this embodiment the foreground image in which the person is located is determined first, and the foreground image is then processed in a targeted manner, thereby reducing the influence of the image adjustment on regions outside the region where the person is located.
Optionally, in this embodiment, the first target image may be segmented by an image segmentation technique to obtain a foreground sub-image and a background sub-image, where the foreground image may be but is not limited to be understood as a target image that is most interested in a user in the first target image, and the background image may be but is not limited to be understood as other images except for the foreground image in the first target image; for example, if the first target image is an image of a target virtual character performing on a virtual stage, the image corresponding to the target virtual character is a foreground image, and the image corresponding to the virtual stage is a background image.
Optionally, in this embodiment, the method for adjusting the target object in the foreground sub-image may include, but is not limited to, adjusting the image pixel points of the target object individually or in batches. The individual adjustment method has low efficiency but high accuracy; the batch adjustment method has high efficiency, but its accuracy is inferior to that of individual adjustment. Specifically, the batch adjustment may, but is not limited to, divide the target object into at least two regions according to elements such as structure and position, and then adjust each image pixel point within each region during the adjustment process.
Optionally, in this embodiment, background restoration may be, but is not limited to, a manner of processing the background sub-image, used to restore the loss in the background sub-image caused by the coverage of the foreground image in the first target image. The background of the background sub-image is restored to obtain the target background image, and the target foreground image and the target background image are further integrated to obtain the second target image, thereby achieving the effect of improving the accuracy of image adjustment.
Optionally, in this embodiment, obtaining the second target image according to the target foreground image and the background sub-image may be performed by, but is not limited to, fusing the adjusted foreground sub-image (the target foreground image) and the background sub-image. The fusion method may be, but is not limited to, Poisson fusion and the like, where Poisson fusion is adopted so that the fused image transitions realistically and naturally at the edges.
It should be noted that, a first target image to be adjusted is acquired; separating a foreground sub-image and a background sub-image from a first target image in response to an image adjustment request for the first target image, wherein the image adjustment request is used for requesting adjustment of a target object in the first target image, the foreground sub-image is a foreground image containing the target object in the first target image, and the background sub-image is a background image not containing the target object in the first target image; adjusting a target object in the foreground sub-image to obtain a target foreground image; repairing a missing image area in the background subimage to obtain a target background image, wherein the missing image content is the missing image area in the first target image after the foreground subimage is separated; and fusing the target foreground image and the background sub-image to obtain a second target image.
For a specific embodiment, reference may be made to the example shown in the image adjustment method, and details in this example are not described herein again.
According to the embodiment provided by the application, a first target image to be adjusted is obtained; separating a foreground sub-image and a background sub-image from a first target image in response to an image adjustment request for the first target image, wherein the image adjustment request is used for requesting adjustment of a target object in the first target image, the foreground sub-image is a foreground image containing the target object in the first target image, and the background sub-image is a background image not containing the target object in the first target image; adjusting a target object in the foreground sub-image to obtain a target foreground image; repairing a missing image area in the background subimage to obtain a target background image, wherein the missing image content is the missing image area in the first target image after the foreground subimage is separated; the target foreground image and the background sub-image are fused to obtain a second target image, the image is firstly segmented into the foreground image and the background image, and then the foreground image is subjected to targeted adjustment, so that the aim of improving the adjustment pertinence of the image is fulfilled, and the effect of improving the accuracy of image adjustment is achieved.
As an optional solution, the adjusting unit 806 includes:
the first obtaining module is used for obtaining a target adjusting identifier carried in the image adjusting request, wherein the target adjusting identifier is an identifier of an object element which is requested to be adjusted in response to the image adjusting request in a target object;
the first determining module is used for determining a candidate area image in the foreground sub-image according to the target adjusting identifier, wherein the candidate area image contains target elements matched with the target adjusting identifier in the target object;
the adjusting module is used for adjusting the candidate area image to obtain a target area image;
and the second acquisition module is used for acquiring a target foreground image according to the target area image.
For a specific embodiment, reference may be made to the example shown in the image adjustment method, and details in this example are not described herein again.
As an alternative, the apparatus comprises: the third acquisition module is used for acquiring a plurality of image area points in the foreground sub-image before determining a candidate area image in the foreground sub-image according to the target adjustment identifier, wherein the image area points are contour points of the target object in the foreground sub-image; constructing a plurality of image sub-regions based on the plurality of image region points;
a first determination module comprising: and the determining submodule is used for determining a target image sub-area from the plurality of image sub-areas according to the target adjustment identification, wherein the candidate area image comprises the target image sub-area.
For a specific embodiment, reference may be made to the example shown in the foregoing image adjustment method, and details are not described herein in this example.
As an optional solution, the adjusting unit 808 includes:
the fourth acquisition module is used for acquiring target edge information according to the foreground sub-image and the background sub-image, wherein the target edge information is edge information of an image area associated with the target object in the first target image;
and the restoration module is used for restoring the missing image area in the background sub-image according to the target edge information to obtain a target background image.
For a specific embodiment, reference may be made to the example shown in the image adjustment method, and details in this example are not described herein again.
As an optional solution, the fourth obtaining module includes:
the first input submodule is used for inputting a foreground mask of a foreground sub-image, background edge information of a background sub-image and image content of the background sub-image into an edge generating structure to obtain target edge information output by the edge generating structure, wherein the edge generating structure is a structure which is obtained by training a plurality of image samples and is used for estimating edge information of an image area which is shielded by the foreground image.
For a specific embodiment, reference may be made to the example shown in the image adjustment method, and details in this example are not described herein again.
As an alternative, the repair module includes:
and the second input submodule is used for inputting the target edge information and the image content of the background sub-image into the whole image generation structure to obtain a target background image output by the whole image generation structure, wherein the whole image generation structure is a structure which is obtained by training a plurality of image samples and is used for estimating the image content in the image area blocked by the foreground image.
For a specific embodiment, reference may be made to the example shown in the image adjustment method, and details in this example are not described herein again.
As an alternative, the separation unit 804 includes:
the input module is used for inputting the first target image into a semantic segmentation model, wherein the semantic segmentation model is a structure obtained by training a plurality of image samples and used for segmenting a foreground image and a background image;
a fifth obtaining module, configured to obtain multiple mask images of the target object output by the semantic segmentation model, where each mask image in the multiple mask images is associated with a corresponding foreground value, and the foreground value is used to represent a probability that the mask image is a foreground image of the target object;
the second determining module is used for determining a target mask image from the plurality of mask images according to the foreground value, wherein the foreground value of the target mask image is the maximum value of the plurality of foreground values corresponding to the plurality of mask images;
and the third determining module is used for determining the target mask image as a foreground sub-image and determining the images except the target mask image in the first target image as background sub-images.
For a specific embodiment, reference may be made to the example shown in the image adjustment method, and details in this example are not described herein again.
According to a further aspect of the embodiment of the present invention, there is also provided an electronic device for implementing the image adjusting method, as shown in fig. 9, the electronic device includes a memory 902 and a processor 904, the memory 902 stores a computer program, and the processor 904 is configured to execute the steps in any one of the method embodiments by the computer program.
Optionally, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
S1, acquiring a first target image to be adjusted;
S2, in response to an image adjustment request for the first target image, separating a foreground sub-image and a background sub-image from the first target image, where the image adjustment request is used to request adjustment of a target object in the first target image, the foreground sub-image is a foreground image in the first target image that includes the target object, and the background sub-image is a background image in the first target image that does not include the target object;
S3, adjusting the target object in the foreground sub-image to obtain a target foreground image;
S4, repairing a missing image area in the background sub-image to obtain a target background image, where the missing image area is the area left missing in the first target image after the foreground sub-image is separated;
and S5, fusing the target foreground image and the target background image to obtain a second target image.
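Steps S1 to S5 form a separate-adjust-repair-fuse pipeline. The following sketch shows that control flow only, with the segmentation, adjustment, and repair models passed in as placeholder callables; all names are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def adjust_image(first_target_image, separate, adjust, repair):
    """Control flow of steps S1-S5 with the models stubbed out."""
    # S2: split into foreground sub-image (target object), background
    #     sub-image (everything else), and a binary foreground mask.
    foreground, background, mask = separate(first_target_image)
    # S3: adjust the target object in the foreground sub-image.
    target_foreground = adjust(foreground)
    # S4: repair the missing image area left where the foreground was.
    target_background = repair(background, mask)
    # S5: fuse - foreground pixels where the mask is set, repaired
    #     background pixels elsewhere.
    return np.where(mask[..., None] > 0, target_foreground, target_background)
```

Because the adjusted foreground and the repaired background are produced independently, the fusion in S5 reduces to a per-pixel selection driven by the foreground mask.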
Alternatively, those skilled in the art will understand that the structure shown in fig. 9 is only an illustration, and the electronic device may also be a terminal device such as a smart phone (e.g., an Android phone or an iOS phone), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 9 does not limit the structure of the electronic device. For example, the electronic device may include more or fewer components (e.g., network interfaces) than shown in fig. 9, or have a different configuration from that shown in fig. 9.
The memory 902 may be used to store software programs and modules, such as the program instructions/modules corresponding to the image adjustment method and apparatus in the embodiments of the present invention. The processor 904 executes various functional applications and data processing by running the software programs and modules stored in the memory 902, thereby implementing the image adjustment method. The memory 902 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 902 may further include memory located remotely from the processor 904, which may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 902 may be used, but is not limited to being used, to store information such as the first target image, the target foreground image, and the second target image. As an example, as shown in fig. 9, the memory 902 may include, but is not limited to, the obtaining unit 802, the separating unit 804, the adjusting unit 806, the repairing unit 808, and the fusing unit 810 of the image adjusting apparatus. It may further include, but is not limited to, other module units of the image adjusting apparatus, which are not described in detail in this example.
Optionally, the transmission device 906 is used for receiving or sending data via a network. Examples of the network may include wired networks and wireless networks. In one example, the transmission device 906 includes a network interface controller (NIC), which can be connected to a router via a network cable and other network devices so as to communicate with the internet or a local area network. In another example, the transmission device 906 is a radio frequency (RF) module, which is used for communicating with the internet in a wireless manner.
In addition, the electronic device further includes: a display 908 for displaying information such as the first target image, the target foreground image, and the second target image; and a connection bus 910 for connecting the respective module components in the above-described electronic apparatus.
In other embodiments, the terminal device or the server may be a node in a distributed system, where the distributed system may be a blockchain system, and the blockchain system may be a distributed system formed by connecting a plurality of nodes through a network communication. The nodes may form a Peer-To-Peer (P2P) network, and any type of computing device, such as a server, a terminal, and other electronic devices, may become a node in the blockchain system by joining the Peer-To-Peer network.
According to an aspect of the present application, there is provided a computer program product comprising a computer program/instructions containing program code for performing the method illustrated by the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section, and/or installed from a removable medium. When executed by the central processing unit, the computer program performs the various functions provided by the embodiments of the present application.
According to an aspect of the present application, there is provided a computer-readable storage medium. A processor of a computer device reads computer instructions from the computer-readable storage medium and executes the computer instructions, so that the computer device performs the image adjustment method described above.
Alternatively, in the present embodiment, the above-mentioned computer-readable storage medium may be configured to store a computer program for executing the steps of:
S1, acquiring a first target image to be adjusted;
S2, in response to an image adjustment request for the first target image, separating a foreground sub-image and a background sub-image from the first target image, where the image adjustment request is used to request adjustment of a target object in the first target image, the foreground sub-image is a foreground image in the first target image that includes the target object, and the background sub-image is a background image in the first target image that does not include the target object;
S3, adjusting the target object in the foreground sub-image to obtain a target foreground image;
S4, repairing a missing image area in the background sub-image to obtain a target background image, where the missing image area is the area left missing in the first target image after the foreground sub-image is separated;
and S5, fusing the target foreground image and the target background image to obtain a second target image.
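The "whole image generation structure" of step S4 is a trained generative model. As a crude, dependency-free stand-in that merely illustrates where repair slots into the pipeline (not the patent's learned repair), the missing image area can be filled with the mean color of the known background pixels:

```python
import numpy as np

def naive_repair(background, hole_mask):
    """Fill the missing image area (hole_mask == 1) with the mean
    color of the known background pixels. A real system would use
    the trained edge-generation and whole-image-generation
    structures described in the embodiments instead."""
    repaired = background.astype(float).copy()
    known = hole_mask == 0
    if known.any():
        fill = repaired[known].mean(axis=0)  # mean color of known pixels
        repaired[hole_mask == 1] = fill
    return repaired
```

Any such stand-in only needs to honor the same interface as the learned model: it consumes the background sub-image plus the hole mask and emits a complete target background image.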
Alternatively, in this embodiment, those skilled in the art will understand that all or part of the steps in the various methods of the foregoing embodiments may be implemented by a program instructing the relevant hardware of the terminal device. The program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The foregoing is only a preferred embodiment of the present invention. It should be noted that, for those of ordinary skill in the art, various improvements and refinements can be made without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. An image adjustment method, comprising:
acquiring a first target image to be adjusted;
separating a foreground sub-image and a background sub-image from the first target image in response to an image adjustment request for the first target image, wherein the image adjustment request is used for requesting adjustment of a target object in the first target image, the foreground sub-image is a foreground image containing the target object in the first target image, and the background sub-image is a background image not containing the target object in the first target image;
adjusting the target object in the foreground sub-image to obtain a target foreground image;
repairing a missing image area in the background sub-image to obtain a target background image, wherein the missing image area is the area left missing in the first target image after the foreground sub-image is separated;
and fusing the target foreground image and the target background image to obtain a second target image.
2. The method of claim 1, wherein the adjusting the target object in the foreground sub-image to obtain a target foreground image comprises:
acquiring a target adjustment identifier carried in the image adjustment request, wherein the target adjustment identifier is an identifier of an object element which is requested to be adjusted in response to the image adjustment request in the target object;
determining a candidate area image in the foreground sub-image according to the target adjustment identifier, wherein the candidate area image contains a target element matched with the target adjustment identifier in the target object;
adjusting the candidate area image to obtain a target area image;
and obtaining the target foreground image according to the target area image.
3. The method according to claim 2, wherein:
before the determining the candidate area image in the foreground sub-image according to the target adjustment identifier, the method includes: acquiring a plurality of image area points in the foreground sub-image, wherein the image area points are contour points of the target object in the foreground sub-image; constructing a plurality of image sub-regions based on the plurality of image region points;
determining a candidate area image in the foreground sub-image according to the target adjustment identifier includes: and determining a target image sub-region from the plurality of image sub-regions according to the target adjustment identifier, wherein the candidate region image comprises the target image sub-region.
4. The method according to claim 1, wherein the repairing the missing image area in the background sub-image to obtain the target background image comprises:
obtaining target edge information according to the foreground sub-image and the background sub-image, wherein the target edge information is edge information of an image area associated with the target object in the first target image;
and repairing the missing image area in the background sub-image according to the target edge information to obtain the target background image.
5. The method of claim 4, wherein obtaining target edge information from the foreground sub-image and the background sub-image comprises:
and inputting a foreground mask of the foreground sub-image, background edge information of the background sub-image, and the image content of the background sub-image into an edge generation structure to obtain the target edge information output by the edge generation structure, wherein the edge generation structure is a structure obtained by training with a plurality of image samples and used for estimating edge information of an image area occluded by a foreground image.
6. The method according to claim 4, wherein the repairing the missing image area in the background sub-image according to the target edge information to obtain the target background image comprises:
and inputting the target edge information and the image content of the background sub-image into a whole image generation structure to obtain the target background image output by the whole image generation structure, wherein the whole image generation structure is a structure obtained by training with a plurality of image samples and used for estimating the image content of an image area occluded by a foreground image.
7. The method according to any one of claims 1 to 6, wherein the separating of the foreground sub-image and the background sub-image from the first target image comprises:
inputting the first target image into a semantic segmentation model, wherein the semantic segmentation model is a structure obtained by training a plurality of image samples and used for segmenting a foreground image and a background image;
acquiring a plurality of mask images of the target object output by the semantic segmentation model, wherein each mask image in the plurality of mask images is associated with a corresponding foreground numerical value, and the foreground numerical value is used for representing the probability that the mask image is the foreground image of the target object;
determining a target mask image from the plurality of mask images according to the foreground numerical value, wherein the foreground numerical value of the target mask image is the maximum value of a plurality of foreground numerical values corresponding to the plurality of mask images;
and determining the target mask image as the foreground sub-image, and determining the images except the target mask image in the first target image as the background sub-images.
8. An image adjusting apparatus, comprising:
a first acquisition unit for acquiring a first target image to be adjusted;
a separating unit, configured to separate a foreground sub-image and a background sub-image from the first target image in response to an image adjustment request for the first target image, where the image adjustment request is used to request to adjust a target object in the first target image, the foreground sub-image is a foreground image in the first target image that includes the target object, and the background sub-image is a background image in the first target image that does not include the target object;
an adjusting unit, configured to adjust the target object in the foreground sub-image to obtain a target foreground image;
a repairing unit, configured to repair a missing image area in the background sub-image to obtain a target background image, wherein the missing image area is the area left missing in the first target image after the foreground sub-image is separated;
and a fusing unit, configured to fuse the target foreground image and the target background image to obtain a second target image.
9. A computer-readable storage medium, comprising a stored program, wherein the program when executed performs the method of any one of claims 1 to 7.
10. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 7 by means of the computer program.
CN202210118988.6A 2022-02-08 2022-02-08 Image adjusting method, storage medium and electronic device Pending CN114463216A (en)

Publications (1)

Publication Number Publication Date
CN114463216A (en) 2022-05-10



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination