WO2022258013A1

WO2022258013A1 - Image processing method and apparatus, electronic device and readable storage medium

Info

Publication number: WO2022258013A1
Application number: PCT/CN2022/097859
Authority: WO
Inventors: 李巧
Original assignee: 维沃移动通信有限公司
Priority date: 2021-06-11
Filing date: 2022-06-09
Publication date: 2022-12-15
Also published as: CN113469903A

Abstract

The present application belongs to the field of image processing, and discloses an image processing method and apparatus, an electronic device and a readable storage medium. The method comprises: acquiring a target mask image of a face region in a first image; on the basis of a binarized face mask image of the target mask image, acquiring a reference face image in a reference face image set that matches the target mask image; performing image processing on the reference face image and the target mask image to obtain N reference face sub-images and N target mask sub-images; and fusing images of a target region of the reference face sub-images with images of a corresponding region of the target mask sub-images to generate a second image corresponding to the first image, wherein the reference face image set comprises a plurality of face mask images which have undergone skin texture processing, and a face skin texture value of the target region is greater than a face skin texture value of a corresponding region of the target region in the target mask image.

Description

Image processing method, device, electronic device and readable storage medium

Cross References to Related Applications

This application claims priority to Chinese Patent Application No. 202110653986.2 filed in China on June 11, 2021, the entire contents of which are hereby incorporated by reference.

Technical Field

The embodiments of the present application relate to the field of image processing, and in particular, to an image processing method, device, electronic equipment, and readable storage medium.

Background

With the development of electronic device technology, users use electronic devices to shoot more and more frequently, and users have higher and higher requirements on the quality of images captured by electronic devices.

In the related art, when a camera photographs portraits under different lighting conditions, the result is affected by various degradation problems such as noise, motion blur, highlights, and post-beautification denoising, so the imaged face lacks good skin texture and detail. Blemishes (such as acne marks), wrinkles, and excessive noise greatly affect the skin feel and aesthetics of the imaged face.

Summary of the Invention

The purpose of the embodiment of the present application is to provide an image processing method, device, electronic device and readable storage medium, which can solve the problem of poor skin quality in human face imaging.

In order to solve the above-mentioned technical problems, the application is implemented as follows:

In the first aspect, the embodiment of the present application provides an image processing method, the method comprising: acquiring a target mask image of the face area in a first image; acquiring, based on a binarized face mask image of the target mask image, a reference face image in a reference face image set that matches the target mask image; performing image processing on the reference face image and the target mask image to obtain N reference face sub-images and N target mask sub-images; and fusing the image of the target area of the reference face sub-image with the image of the corresponding area of the target mask sub-image to generate a second image corresponding to the first image; wherein the reference face image set includes a plurality of face mask images that have undergone skin texture processing, and the face skin quality value of the target area is greater than the face skin quality value of the area corresponding to the target area in the target mask image.

In a second aspect, the embodiment of the present application also provides an image processing device, the device including an acquisition module and an image processing module. The acquisition module is configured to acquire a target mask image of a face area in a first image. The acquisition module is further configured to acquire, based on a binarized face mask image of the target mask image, a reference face image in a reference face image set that matches the target mask image. The acquisition module is further configured to perform image processing on the reference face image and the target mask image to obtain N reference face sub-images and N target mask sub-images. The image processing module is configured to fuse the image of the target area of the reference face sub-image obtained by the acquisition module with the image of the corresponding area of the target mask sub-image obtained by the acquisition module, to generate a second image corresponding to the first image. The reference face image set includes a plurality of face mask images that have undergone skin texture processing, and the face skin quality value of the target area is higher than the face skin quality value of the area corresponding to the target area in the target mask image.

In the third aspect, the embodiment of the present application provides an electronic device, including a processor, a memory, and a program or instruction stored on the memory and operable on the processor. When the program or instruction is executed by the processor, the steps of the image processing method described in the first aspect are implemented.

In a fourth aspect, an embodiment of the present application provides a readable storage medium, on which a program or an instruction is stored, and when the program or instruction is executed by a processor, the steps of the method described in the first aspect are implemented.

In the fifth aspect, the embodiment of the present application provides a chip, the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used to run programs or instructions, so as to implement the method described in the first aspect.

In the embodiment of the present application, after the first image containing the human face is acquired, the target mask image of the face area in the first image is acquired, and a reference face image matching the target mask image is obtained from the reference face image set based on the binarized face mask image of the target mask image. Afterwards, image processing is performed on the reference face image and the target mask image to obtain N reference face sub-images and N target mask sub-images, and the image of the target area of the reference face sub-image is fused with the image of the corresponding area of the target mask sub-image. This removes poor texture and uneven transitions from the face, restores delicate and clear skin texture, and yields a second image with better skin quality, which greatly improves the quality of the facial skin after imaging.

Brief Description of the Drawings

FIG. 1 is a schematic diagram of an interface to which an image processing method provided in an embodiment of the present application is applied;

FIG. 2 is a schematic structural diagram of an image pyramid provided by an embodiment of the present application;

FIG. 3 is a schematic structural diagram of an image processing device provided in an embodiment of the present application;

FIG. 4 is a first schematic structural diagram of an electronic device provided in an embodiment of the present application;

FIG. 5 is a second schematic structural diagram of an electronic device provided by an embodiment of the present application.

Detailed Description

The following will clearly describe the technical solutions in the embodiments of the present application with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, but not all of them. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments in this application belong to the protection scope of this application.

The terms "first", "second" and the like in the specification and claims of the present application are used to distinguish similar objects, and are not used to describe a specific order or sequence. It should be understood that the terms so used are interchangeable under appropriate circumstances, such that the embodiments of the application can be practiced in sequences other than those illustrated or described herein. Objects distinguished by "first", "second", etc. are generally of one type, and the number of objects is not limited. For example, there may be one or more first objects. In addition, "and/or" in the specification and claims means at least one of the connected objects, and the character "/" generally means that the related objects are in an "or" relationship.

The image processing method provided in the embodiment of the present application may be applied to a scene of beautifying an image including a human face.

Exemplarily, for the scene of beautifying an image containing a human face, in the related art, imaging on an electronic device is affected by various degradation problems such as noise, motion blur, highlights, and post-beautification denoising. As a result, the imaged face lacks good skin texture and detail, while blemishes, wrinkles, and noise on the face cause excessive unevenness, which greatly affects the skin feel and aesthetics of the imaged face.

To solve this problem, the technical solution provided by the embodiment of the present application uses a face skin quality transfer method based on multi-layer image pyramid fusion, in which the blemished facial skin in the captured image is fused with an image of better skin quality. This can effectively remove poor texture and transitions from the face in the portrait, making the imaged facial skin delicate and clear and greatly improving its skin quality.

The image processing method provided by the embodiment of the present application will be described in detail below through specific embodiments and application scenarios with reference to the accompanying drawings.

As shown in Figure 1, an image processing method provided by the embodiment of the present application may include the following steps 201 to 204:

Step 201. The image processing apparatus acquires a target mask image of a face area in a first image.

Exemplarily, the above-mentioned first image may be an image captured by the electronic device, or may be an image stored in the electronic device read by the electronic device.

Exemplarily, after acquiring the above-mentioned first image, the image processing apparatus acquires an image of the face area of the first image in the red-green-blue (RGB) color space. Then, within the obtained face area, a mask image of the face area, that is, the above-mentioned target mask image, is generated through a face parsing algorithm.

It should be noted that the above-mentioned target mask image can be understood as, after the contour of the human face contained in the first image is obtained, all images outside the range of the contour of the human face are covered, for example, set to the same color, so that the image processing device can only recognize the image of the face area.
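The masking idea described above can be sketched as follows. This is a minimal illustration only, assuming the face region is already available as a boolean array; the function name `apply_face_mask` and the fill color are hypothetical and do not come from the application, which obtains the region through a face parsing algorithm.

```python
import numpy as np

def apply_face_mask(image, face_mask, fill_value=0):
    """Keep pixels inside the face region; set every other pixel to one color."""
    image = np.asarray(image)
    keep = np.asarray(face_mask, dtype=bool)
    if keep.shape != image.shape[:2]:
        raise ValueError("mask and image must have the same height and width")
    masked = np.full_like(image, fill_value)
    masked[keep] = image[keep]
    return masked

# Toy 4x4 RGB image with a 2x2 "face" region in the centre (assumed data).
rgb = np.arange(4 * 4 * 3, dtype=np.uint8).reshape(4, 4, 3)
face = np.zeros((4, 4), dtype=bool)
face[1:3, 1:3] = True
target_mask_image = apply_face_mask(rgb, face)
```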

It can be understood that the purpose of acquiring only the image of the face area of the first image is to eliminate the interference of images of other areas, so as to facilitate the optimization of the image of the face area.

Step 202. The image processing device acquires a reference face image matching the target mask image in the reference face image set based on the binarized face mask image of the target mask image.

Exemplarily, image binarization is the process of setting the gray value of each pixel in the image to 0 or 255, so that the entire image presents an obvious black-and-white effect. The above binarized face mask image can be understood as a black-and-white image of a face image. That is, the facial-feature areas of the above target mask image are all set to black, and all other areas are set to white.

Exemplarily, the above binarized face image is an image including only facial features, that is, the binarized face image is an image including eyes, nose, eyebrows, mouth and other areas of the face area. The binarized face mask image is mainly used for matching with the face images in the reference face image collection.
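As a minimal sketch of the binarization described above, the following assumes a grayscale input and a fixed threshold of 128 (the threshold and both function names are illustrative assumptions). The second helper follows the stated convention that facial-feature pixels are black (0) and all other pixels are white (255).

```python
import numpy as np

def binarize(gray, threshold=128):
    """Map every pixel to 0 or 255 by comparing it with a threshold (assumed value)."""
    gray = np.asarray(gray)
    return np.where(gray >= threshold, 255, 0).astype(np.uint8)

def features_to_binary_mask(feature_region):
    """Facial-feature pixels become black (0); all other pixels become white (255)."""
    feature_region = np.asarray(feature_region, dtype=bool)
    return np.where(feature_region, 0, 255).astype(np.uint8)

gray_example = np.array([[0, 127], [128, 255]], dtype=np.uint8)
binary_example = binarize(gray_example)
feature_mask_example = features_to_binary_mask([[True, False], [False, True]])
```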

Exemplarily, the matching algorithm used in the above-mentioned step 202 is an E-log-Hua template matching algorithm.
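The matching algorithm named above is not reproduced here. Purely as a stand-in to show how a binarized mask could be compared against a reference set, the sketch below scores candidates by the intersection-over-union of their black (facial-feature) pixels. The scoring rule and the names `feature_iou` and `best_reference_index` are assumptions, not the method of the application.

```python
import numpy as np

def feature_iou(mask_a, mask_b):
    """Intersection-over-union of the black (0) facial-feature pixels of two masks."""
    a = np.asarray(mask_a) == 0
    b = np.asarray(mask_b) == 0
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)

def best_reference_index(target_mask, reference_masks):
    """Index of the reference mask with the highest overlap score."""
    scores = [feature_iou(target_mask, ref) for ref in reference_masks]
    return int(np.argmax(scores))

# Toy masks: 255 everywhere except a black block marking "facial features".
target = np.full((6, 6), 255, dtype=np.uint8)
target[2:4, 2:4] = 0
ref_far = np.full((6, 6), 255, dtype=np.uint8)
ref_far[0:2, 0:2] = 0
ref_near = np.full((6, 6), 255, dtype=np.uint8)
ref_near[2:4, 2:5] = 0
chosen = best_reference_index(target, [ref_far, ref_near])
```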

Step 203. The image processing device performs image processing on the reference face image and the target mask image to obtain N reference face sub-images and N target mask sub-images.

Exemplarily, the above-mentioned image processing on the reference face image and the target mask image may include formatting the reference face image, and then performing degradation processing and a scaling operation on the formatted reference face image to generate N reference face sub-images. The scaling ratio between every two adjacent reference face sub-images is the same, and the image with the smaller resolution is obtained by degrading the image with the larger resolution.

It should be noted that the method of processing the target mask image to obtain N target mask sub-images is similar to the above-mentioned method of processing the reference face image; the target mask image can be processed in the same way to obtain N target mask sub-images.

Exemplarily, there is a one-to-one correspondence between the N reference face sub-images and the N target mask sub-images. For example, taking the above-mentioned N as 5 as an example, there is a corresponding relationship between the five reference face sub-images numbered 0-4 and the five target mask sub-images numbered 0-4. Wherein, as the number increases, the resolution of the image gradually decreases.
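The numbering scheme above can be sketched as follows. The application does not specify the degradation or the scaling ratio, so this sketch assumes a 2x2 block average (a simple blur-and-shrink) and a ratio of 1/2 between adjacent sub-images; `downsample_by_two` and `build_sub_images` are hypothetical names.

```python
import numpy as np

def downsample_by_two(image):
    """Average each 2x2 block: an assumed degrade-and-scale step."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    img = image[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def build_sub_images(image, n_levels):
    """Return n_levels images; index 0 is full resolution, higher indices are smaller."""
    levels = [np.asarray(image, dtype=np.float64)]
    for _ in range(n_levels - 1):
        levels.append(downsample_by_two(levels[-1]))
    return levels

# Random stand-ins for the formatted reference face image and target mask image.
reference = np.random.default_rng(0).random((64, 64))
target = np.random.default_rng(1).random((64, 64))
reference_subs = build_sub_images(reference, 5)
target_subs = build_sub_images(target, 5)
sizes = [im.shape for im in reference_subs]
```

With N = 5 and a 64x64 input, sub-image 0 is 64x64 and sub-image 4 is 4x4, and each reference sub-image has the same size as the target sub-image with the same number.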

Step 204. The image processing device fuses the image of the target area of the reference face sub-image with the image of the corresponding area of the target mask sub-image to generate a second image corresponding to the first image.

Wherein, the aforementioned collection of reference face images includes a plurality of face mask images that have undergone skin texture processing. The face skin quality value of the above target area is higher than the face skin quality value of the area corresponding to the target area in the target mask image.

Exemplarily, the image processing device finds a matching reference face image from the collection of reference face images based on the above binarized face mask image. Afterwards, the image of the poor skin quality area in the target mask image may be fused with the image with better skin quality in the corresponding area in the reference face image, so as to obtain a second image with better skin quality.

In a possible implementation manner, the image processing device may generate the first image pyramid according to the reference face image and the second image pyramid according to the target mask image. Then, based on the first image pyramid and the second image pyramid, the target mask image is processed and the skin quality of the reference face image is transferred, to obtain a face image with better skin quality in the first image.

In this way, after the first image containing the face is obtained, the target mask image of the face area in the first image is obtained, and a reference face image matching the target mask image is obtained from the reference face image set based on the binarized face mask image of the target mask image. Afterwards, image processing is performed on the reference face image and the target mask image to obtain N reference face sub-images and N target mask sub-images, and the image of the target area of the reference face sub-image is fused with the image of the corresponding area of the target mask sub-image. This removes poor texture and uneven transitions from the face, restores delicate and clear skin texture, and yields a second image with better skin quality, which greatly improves the quality of the facial skin after imaging.

Optionally, in the embodiment of the present application, the image processing device may realize the migration of the image with better skin quality in the reference face image to the region with poorer skin quality in the first image based on the image pyramid.

Exemplarily, before the above step 202, the image processing method provided in the embodiment of the present application may further include the following steps 202a1 to 202a3:

Step 202a1, the image processing device acquires N face images that have undergone skin texture processing, where N is a positive integer.

Exemplarily, before the image processing apparatus acquires the reference face image matching the target mask image, it also needs to create the set of reference face images. N face images with better skin quality that have undergone professional image-level processing (including skin color adjustment, freckle and acne removal, skin smoothing, enhancement, etc.) can be obtained, and the set includes these N face images.

In step 202a2, the image processing device extracts facial features information of each of the above N facial images, and constructs a binarized mask according to the facial features information.

Wherein, each face image corresponds to one binarized mask.

Exemplarily, the facial features information includes various information about the facial features in the face image, for example, the area where the facial features are located, specific location coordinates, and the like.

Exemplarily, after the image processing device acquires the aforementioned N face images, it uses a face parsing model to obtain a pixel-by-pixel segmentation mask image of each part of the face area of each face image, retaining only an image of the facial features. Afterwards, the image processing device constructs a binarized mask image based on that mask image. Each face image has a corresponding binarized mask image.

In step 202a3, the image processing device generates the set of reference face images based on the above N face images that have undergone skin texture processing and a binarized mask corresponding to each face image.

Exemplarily, the aforementioned set of reference human face images includes N human face images that have undergone skin texture processing, and N binary mask images corresponding to the human face images.

Exemplarily, the above-mentioned face image that has undergone skin texture processing is mainly used for image fusion with the image of a region with poor skin quality in the target mask image. The above-mentioned binarized mask image is mainly used to adjust the positions of the facial features in the reference face image so that they are closer to the positions of the facial features in the target mask image. After adjustment, the appearance of the person in the reference face image is kept as consistent as possible with the appearance of the face in the target mask image, which facilitates the subsequent transfer of skin quality.

In this way, the image processing device can construct a set of reference face images based on the reference face images that have undergone skin texture processing and the corresponding binarized mask images, so that after the image processing device obtains an image that needs to be processed, it can process that image based on this set.

Optionally, in the embodiment of the present application, after the image processing device obtains the aforementioned set of reference face images, it can process the acquired first image based on the set, and the specific processing process needs to be completed by using an image pyramid.

Exemplarily, the above step 203 may also include the following steps 203a1 and 203a2:

In step 203a1, the image processing device constructs a first image pyramid based on the aforementioned reference face image.

In step 203a2, the image processing device constructs a second image pyramid based on the target mask image.

Wherein, the first image pyramid includes the N reference face sub-images, and the second image pyramid includes the N target mask sub-images.

It should be noted that an image pyramid is a multi-scale representation of an image: an effective but conceptually simple structure for interpreting images at multiple resolutions. The pyramid of an image is a series of images derived from the same original image, arranged in a pyramid shape with gradually decreasing resolution. It is obtained by step-by-step down-sampling, which continues until a certain termination condition is reached. The layer-by-layer images are likened to a pyramid: the higher the level, the smaller the image and the lower the resolution.

Exemplarily, the image processing device needs to construct an image pyramid of each face image in the reference face image set.

Exemplarily, the first image pyramid may be a Laplacian pyramid. The bottom layer (level 0) of the first image pyramid may be the above-mentioned reference face image, or may be a formatted image of the reference face image. The image size of the 0th layer of the pyramid constructed by each face image in the reference face image set is the same, and the scaling ratio between layers is also the same.
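A Laplacian pyramid of the kind mentioned above can be sketched as below. This is a simplified illustration, assuming a grayscale image whose side lengths are divisible by 2^(N-1), a 2x2 block average for shrinking, and nearest-neighbour enlargement; practical implementations usually use Gaussian filtering instead. All function names are hypothetical.

```python
import numpy as np

def shrink(image):
    """Halve both dimensions by averaging 2x2 blocks (assumes even side lengths)."""
    h, w = image.shape
    return image.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def grow(image):
    """Double both dimensions by repeating each pixel (nearest neighbour)."""
    return np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)

def laplacian_pyramid(image, n_levels):
    """Levels 0..n-2 hold detail (image minus enlarged coarser image); the last is the coarsest image."""
    current = np.asarray(image, dtype=np.float64)
    pyramid = []
    for _ in range(n_levels - 1):
        smaller = shrink(current)
        pyramid.append(current - grow(smaller))
        current = smaller
    pyramid.append(current)
    return pyramid

def reconstruct(pyramid):
    """Invert laplacian_pyramid by enlarging from the top and adding the detail back."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = grow(current) + detail
    return current

face_image = np.random.default_rng(0).random((32, 32))
levels = laplacian_pyramid(face_image, 5)
restored = reconstruct(levels)
```

Because each detail layer stores exactly what is lost when shrinking, collapsing the unmodified pyramid returns the original image up to floating-point rounding.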

For example, FIG. 2 is a schematic structural diagram of an image pyramid, which includes five layers (L0 to L4). Each layer contains one image, and the images are scaled down from layer to layer according to a preset ratio.

Optionally, in the embodiment of the present application, the image processing apparatus may process the target mask image based on the reference face image and feature points of the target mask image, so as to obtain the second image.

Exemplarily, the above step 204 may include the following steps 204a1 to 204a4:

In step 204a1, the image processing device extracts the first feature point of the reference face image and the second feature point of the target mask image.

Step 204a2, the image processing device obtains, based on the first feature point, the binarized face mask image of each of the N reference face sub-images, and the vertex coordinates of each of the M triangular blocks contained in each of those binarized face mask images.

Step 204a3: The image processing device obtains the vertex coordinates of each of the K triangle blocks contained in the binarized face mask image of each of the N target mask sub-images based on the second feature point.

In step 204a4, the image processing device fuses the image of the target area of the reference face sub-image with the image of the corresponding area of the target mask sub-image based on the vertex coordinates.

Exemplarily, taking the aforementioned N reference face sub-images as images in the first image pyramid, and the aforementioned N target mask sub-images as images in the aforementioned second image pyramid as an example, after the image processing device successfully constructs the first image pyramid and the second image pyramid, the first image pyramid and the second image pyramid can be triangulated.

Exemplarily, the above-mentioned specific processing steps from step 204a1 to step 204a4 may include the following steps from step 204b1 to step 204b3:

Step 204b1. The image processing device extracts the first feature point of the reference face image, and performs triangulation on the reference face image based on the first feature point to obtain M triangular blocks.

Wherein, one first feature point corresponds to one triangular block, and the circumscribed circle of each triangular block does not include other first feature points, and M is a positive integer.

Exemplarily, the image processing device may extract a plurality of first feature points from the above-mentioned reference face image, and then perform triangulation (also referred to as image triangulation) based on the feature points, so that the circumscribed circle of each generated triangular block does not include any other first feature point.

It can be understood that if a certain feature point extracted by the image processing device cannot meet the requirement that no other feature points are included within the circumscribed circle of each triangular block, this feature point cannot be used as the first feature point.

It should be noted that image triangulation can be understood as dividing the image into several triangular pieces, where any two triangles in the image either do not intersect or intersect exactly along one common side (two triangles cannot share two or more sides at the same time).
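The empty-circumscribed-circle condition stated above is the defining property of a Delaunay triangulation, and it can be checked with the standard determinant test. The sketch below is illustrative only; the function names and the four sample points are assumptions, and feature-point extraction itself is not shown.

```python
import numpy as np

def signed_area2(a, b, c):
    """Twice the signed area of triangle abc (positive if counterclockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_circumcircle(a, b, c, p):
    """True if point p lies strictly inside the circumscribed circle of triangle abc."""
    rows = []
    for v in (a, b, c):
        dx, dy = v[0] - p[0], v[1] - p[1]
        rows.append([dx, dy, dx * dx + dy * dy])
    det = np.linalg.det(np.array(rows, dtype=np.float64))
    # Multiplying by the orientation makes the test independent of vertex order.
    return bool(det * signed_area2(a, b, c) > 0)

def satisfies_empty_circle(points, triangles):
    """Check that no triangle's circumscribed circle contains another point."""
    for tri in triangles:
        a, b, c = (points[i] for i in tri)
        for idx, p in enumerate(points):
            if idx not in tri and in_circumcircle(a, b, c, p):
                return False
    return True

# Four sample points forming a kite; indices 0..3.
pts = [(0.0, 0.0), (4.0, 0.0), (2.0, 1.0), (2.0, -1.0)]
short_diagonal = [(0, 3, 2), (3, 1, 2)]   # split along the edge between points 2 and 3
long_diagonal = [(0, 1, 2), (0, 3, 1)]    # split along the edge between points 0 and 1
```

For these points, only the split along the short diagonal satisfies the condition, since the circle through points 0, 1 and 2 encloses point 3.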

Step 204b2, the image processing device acquires the binarized face mask image of each layer image of the first image pyramid.

Step 204b3: The image processing device determines, based on the binarized face mask image corresponding to each layer image of the first image pyramid and the scaling ratio between the first target layer and the second target layer, the vertex coordinates of each of the M triangular blocks contained in the binarized face mask image of each layer image.

Wherein, the first target layer and the second target layer are two adjacent layers of the above-mentioned first image pyramid.

Exemplarily, after the image processing device constructs the first image pyramid of the above-mentioned reference face image, it also needs to generate a binary face mask image corresponding to each layer image based on the images contained in each layer of the first image pyramid.

Exemplarily, after the image processing device acquires the M triangular blocks of the reference face image, it can determine, based on the vertex coordinates of each of the M triangular blocks, the vertex coordinates of each of the M triangular blocks in each layer image of the first image pyramid.

It should be noted that, since each layer image in the first image pyramid is obtained based on the above-mentioned reference face image, each triangular block of the reference face image can find a corresponding triangular block in each layer image. Moreover, since image scaling exists between layers of the first image pyramid, the vertex coordinates of each triangular block of each layer image may be recalculated based on the scaling ratio.
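The recalculation described above amounts to multiplying the level-0 vertex coordinates by the accumulated scaling ratio. A minimal sketch follows, assuming a constant ratio of 1/2 between adjacent layers and vertices stored as an (M, 3, 2) array of (x, y) pairs; both are assumptions, and the function name is hypothetical.

```python
import numpy as np

def triangle_vertices_per_level(vertices, n_levels, ratio=0.5):
    """Scale level-0 triangle vertices to every pyramid level.

    vertices: array-like of shape (M, 3, 2) holding (x, y) for each triangle vertex.
    Returns a list with one (M, 3, 2) array per level, level 0 first.
    """
    base = np.asarray(vertices, dtype=np.float64)
    return [base * ratio ** level for level in range(n_levels)]

# One triangle on a level-0 image; coordinates are illustrative.
level0_triangles = [[[0.0, 0.0], [8.0, 0.0], [0.0, 8.0]]]
per_level = triangle_vertices_per_level(level0_triangles, n_levels=3)
```

Here the triangle's vertex at x = 8 on level 0 moves to x = 4 on level 1 and x = 2 on level 2.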

In this way, after the image processing device obtains the vertex coordinates of each triangular block of each layer image of the first image pyramid, it can adjust each layer image based on the triangular blocks of that layer image, so that the appearance of the person it contains is closer to the appearance of the person contained in the target mask image.

Exemplarily, similar to the above method of constructing the first image pyramid of the reference face image, the image processing device may construct the second image pyramid of the target mask image according to this method.

Exemplarily, the above steps 204a1 to 204a4 may specifically include the following steps 204c1 to 204c4:

In step 204c1, the image processing device constructs a second image pyramid based on the target mask image.

Exemplarily, similar to the above-mentioned first image pyramid, the 0th layer of the second image pyramid is constructed from the target mask image or the image obtained after the target mask image is formatted.

It should be noted that the size of each layer image of the first image pyramid is the same as that of each layer image of the second image pyramid. And the size of the images used to construct the first image pyramid and the second image pyramid is also the same.

It can be understood that, in the embodiment of the present application, the size of the image may be represented by resolution or inches, which is not limited in the embodiment of the present application.

Step 204c2. The image processing device extracts the second feature points of the target mask image, and performs triangulation on the target mask image based on the second feature points to obtain K triangular blocks.

Wherein, one second feature point corresponds to one triangular block, and the circumscribed circle of each triangular block does not include other second feature points, and K is a positive integer.

In step 204c3, the image processing device acquires the binarized face mask image of each layer image of the second image pyramid.

Step 204c4, the image processing device determines, based on the binarized face mask image of each layer image of the second image pyramid and the scaling ratio between the third target layer and the fourth target layer, the vertex coordinates of each of the K triangular blocks contained in the binarized face mask image of each layer image of the second image pyramid.

Wherein, the third target layer and the fourth target layer are adjacent layers of the second image pyramid.

It should be noted that, since the above steps 204c1 to 204c4 are similar to steps 204b1 to 204b3, for the explanation of steps 204c1 to 204c4, reference may be made to the above explanation of steps 204b1 to 204b3. For the specific processing of the N reference face sub-images and the N target mask sub-images in the above steps 204a1 to 204a4, reference may be made to the description of the processing of the first image pyramid and the second image pyramid; to avoid repetition, details are not repeated here.

In this way, after the image processing device acquires the image pyramid of the reference face image and the image pyramid of the target mask image, it can, based on the image pyramids, transfer the image of the region with better skin quality in the reference face image to the region with poorer skin quality in the target mask image.

Further optionally, in the embodiment of the present application, the image processing apparatus may improve the face skin quality of the face area in the first image based on the above N reference face sub-images and N target mask sub-images.

Exemplarily, the above step 204a4 may include the following steps 204d1 to 204d3:

Step 204d1, the image processing device performs affine transformation on the vertex coordinates of the M triangular blocks based on the vertex coordinates of the K triangular blocks.

Step 204d2, the image processing device combines the first target area image of the first reference face sub-image in the N reference face sub-images after affine transformation with the first target mask sub-image in the above N target mask sub-images The image of the second target area is fused to obtain N processed target mask sub-images.

Step 204d3. The image processing device reconstructs the above N processed target mask sub-images to generate the above second image.

Wherein, the first reference face sub-image is any one of the N reference face sub-images; the first target mask sub-image is the target mask sub-image, among the N target mask sub-images, that corresponds to the first reference face sub-image; the first target area image is the image of the first target area of the first reference face sub-image; and the first target area corresponds to the second target area of the first target mask sub-image.
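Steps 204d2 and 204d3 can be sketched as a per-level blend followed by a pyramid collapse. The application does not state the fusion formula, so the weighted blend below is an assumption, as are the nearest-neighbour enlargement, the function names, and the toy data. The reference levels are assumed to have been aligned by the affine transformation already.

```python
import numpy as np

def grow(image):
    """Double both dimensions by repeating each pixel (nearest neighbour)."""
    return np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)

def fuse_levels(reference_levels, target_levels, region_masks):
    """Per level: take the reference where the mask is 1, the target where it is 0."""
    fused = []
    for ref, tgt, mask in zip(reference_levels, target_levels, region_masks):
        weight = np.asarray(mask, dtype=np.float64)
        fused.append(weight * ref + (1.0 - weight) * tgt)
    return fused

def collapse(levels):
    """Rebuild one image from detail layers plus the coarsest layer (last entry)."""
    current = levels[-1]
    for detail in reversed(levels[:-1]):
        current = grow(current) + detail
    return current

# Toy three-level pyramids: reference is all ones, target is all zeros.
side_lengths = [8, 4, 2]
reference_levels = [np.ones((s, s)) for s in side_lengths]
target_levels = [np.zeros((s, s)) for s in side_lengths]
region_masks = []
for s in side_lengths:
    mask = np.zeros((s, s))
    mask[:, : s // 2] = 1.0  # left half is the "target area" taken from the reference
    region_masks.append(mask)

second_image = collapse(fuse_levels(reference_levels, target_levels, region_masks))
```

In this toy case the left half of the result carries the reference content from all three levels, while the right half keeps the target content.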

Exemplarily, taking the aforementioned N reference face sub-images as images in the first image pyramid, and the aforementioned N target mask sub-images as images in the second image pyramid as an example, the aforementioned steps 204d1 to 204d3 may include the following steps 204e1 to step 204e3:

Step 204e1: The image processing device performs affine transformation on the vertex coordinates of the M triangular blocks in each layer of the image in the first image pyramid based on the vertex coordinates of the K triangular blocks in each layer of the image in the second image pyramid.

Exemplarily, in order to make the appearance of the person in the reference face image closer to the appearance of the person in the first image, the image processing device needs to process each layer of the image in the first image pyramid, that is, perform affine transformation on each layer of the image.

It should be noted that affine transformation, also known as affine mapping, means that, in geometry, one vector space is transformed into another vector space by performing a linear transformation followed by a translation. Geometrically, an affine transformation (or affine mapping) between two vector spaces consists of a non-singular linear transformation (a transformation using a linear function) followed by a translation, that is, a mapping of the form x ↦ Ax + b.

Exemplarily, before performing affine transformation on each layer image of the first image pyramid, it is necessary to calculate, according to the coordinates of the triangular blocks contained in the target mask image, the affine transformation matrix from each triangular block of the reference face image to the corresponding triangular block of the target mask image. The image processing device may then perform affine transformation on each layer image of the first image pyramid based on the obtained transformation matrices.
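Exemplarily, the per-triangle transformation matrix described above can be sketched as follows. This is a minimal sketch rather than the implementation of the present application: the function names affine_from_triangles and apply_affine are hypothetical, vertices are assumed to be (x, y) pairs, and the matrix is solved with Cramer's rule from the three vertex correspondences. Warping the pixels inside each triangular block is not shown.

```python
def affine_from_triangles(src, dst):
    """Return the 2x3 affine matrix that maps the three src vertices onto dst.

    src, dst: sequences of three (x, y) vertices (hypothetical layout).
    Each row (a, b, c) of the result satisfies a*x + b*y + c = u for every
    source vertex (x, y) and the matching destination coordinate u.
    """
    (x0, y0), (x1, y1), (x2, y2) = src
    det = x0 * (y1 - y2) + x1 * (y2 - y0) + x2 * (y0 - y1)
    if abs(det) < 1e-12:
        raise ValueError("source triangle is degenerate")
    rows = []
    for k in range(2):  # k = 0 solves the x row, k = 1 solves the y row
        u0, u1, u2 = dst[0][k], dst[1][k], dst[2][k]
        a = (u0 * (y1 - y2) + u1 * (y2 - y0) + u2 * (y0 - y1)) / det
        b = (x0 * (u1 - u2) + x1 * (u2 - u0) + x2 * (u0 - u1)) / det
        c = (u0 * (x1 * y2 - x2 * y1)
             + u1 * (x2 * y0 - x0 * y2)
             + u2 * (x0 * y1 - x1 * y0)) / det
        rows.append((a, b, c))
    return rows


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to one (x, y) point."""
    x, y = point
    return tuple(a * x + b * y + c for a, b, c in matrix)
```

In such a sketch, one matrix would be computed for each pair of corresponding triangular blocks, with the reference face triangle as src and the target mask triangle as dst.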

Step 204e2, the image processing device performs image fusion on the image of the first target area of the fifth target layer of the affine-transformed first image pyramid and the image of the second target area of the sixth target layer of the second image pyramid, to obtain the processed second image pyramid.

Exemplarily, each layer of the first image pyramid corresponds to a layer of the second image pyramid: for example, layer 0 of the first image pyramid corresponds to layer 0 of the second image pyramid, and the nth layer of the first image pyramid corresponds to the nth layer of the second image pyramid. Therefore, the image processing device may perform image fusion on the image of the first target area of the first image pyramid and the image of the corresponding area (i.e., the second target area) of the second image pyramid to obtain the processed second image pyramid.

In step 204e3, the image processing device reconstructs the processed second image pyramid to generate the second image.

Wherein, the above-mentioned fifth target layer is any layer of the above-mentioned first image pyramid; the above-mentioned sixth target layer is the layer corresponding to the above-mentioned fifth target layer in the above-mentioned second image pyramid; the image of the first target area is the image of the first target area of the fifth target layer, and the first target area is the image area corresponding to the second target area of the sixth target layer; the first image pyramid and the second image pyramid have the same number of layers, and the scaling ratio of each layer is also the same.

It can be understood that the sixth target layer being the layer corresponding to the fifth target layer means that the level of the sixth target layer in the second image pyramid is the same as the level of the fifth target layer in the first image pyramid, that is, they are the same layer.
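Exemplarily, the layer-by-layer fusion of step 204e2 might be sketched as follows, assuming that each pyramid layer is a list of rows of floating-point values and that the target area of each layer is given by a weight mask (1 inside the area, 0 outside). The mask pyramid and the names fuse_layer and fuse_pyramids are assumptions made for illustration; the present application does not specify the blending formula.

```python
def fuse_layer(ref_layer, tgt_layer, mask_layer):
    """Blend one pyramid layer.

    A mask weight of 1 takes the reference pixel, a weight of 0 keeps the
    target pixel, and intermediate weights mix the two linearly.
    """
    return [
        [m * r + (1 - m) * t for r, t, m in zip(ref_row, tgt_row, mask_row)]
        for ref_row, tgt_row, mask_row in zip(ref_layer, tgt_layer, mask_layer)
    ]


def fuse_pyramids(ref_pyramid, tgt_pyramid, mask_pyramid):
    """Fuse layer n of the reference pyramid into layer n of the target pyramid."""
    if not (len(ref_pyramid) == len(tgt_pyramid) == len(mask_pyramid)):
        raise ValueError("pyramids must have the same number of layers")
    return [
        fuse_layer(ref, tgt, mask)
        for ref, tgt, mask in zip(ref_pyramid, tgt_pyramid, mask_pyramid)
    ]
```

Because the two pyramids have the same number of layers and the same scaling ratio per layer, pairing layers by index is sufficient in this sketch.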

Exemplarily, the image processing device reconstructs the Laplacian pyramid after the texture migration of the reference image, that is, the above-mentioned second image pyramid, to obtain a final result.

It should be noted that the image processing device needs to construct a Gaussian pyramid before constructing the Laplacian pyramid of the target mask image or the reference face image. First, the original image is used as the bottom image G0 (layer 0 of the Gaussian pyramid) and is convolved with a 5×5 Gaussian kernel; the convolved image is then down-sampled (by removing the even rows and columns) to obtain the next-higher layer image G1. This image is then used as the input, and the convolution and down-sampling operations are repeated to obtain a still higher layer image. Iterating multiple times forms a pyramid-shaped image data structure, that is, a Gaussian pyramid.
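Exemplarily, the Gaussian pyramid construction described above can be sketched in plain Python as follows. This is only a sketch under stated assumptions: the image is a list of rows of floating-point values, the 5×5 kernel is taken to be the separable binomial kernel [1, 4, 6, 4, 1]/16, image borders are handled by edge replication, and the function names are hypothetical.

```python
KERNEL = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)  # assumed separable 5-tap kernel


def _blur_rows(img):
    """Convolve every row with KERNEL, replicating the edge pixels."""
    out = []
    for row in img:
        n = len(row)
        out.append([
            sum(w * row[min(max(x + k - 2, 0), n - 1)] for k, w in enumerate(KERNEL))
            for x in range(n)
        ])
    return out


def _transpose(img):
    return [list(col) for col in zip(*img)]


def gaussian_blur(img):
    """5x5 Gaussian blur implemented as a horizontal and a vertical 1-D pass."""
    return _transpose(_blur_rows(_transpose(_blur_rows(img))))


def pyr_down(img):
    """Blur the image, then keep every second row and every second column."""
    blurred = gaussian_blur(img)
    return [row[::2] for row in blurred[::2]]


def gaussian_pyramid(img, levels):
    """Return [G0, G1, ..., G_levels], where G0 is the input image."""
    pyramid = [img]
    for _ in range(levels):
        pyramid.append(pyr_down(pyramid[-1]))
    return pyramid
```

For an 8×8 input and levels = 2, this sketch yields layers of size 8×8, 4×4, and 2×2.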

During the construction of the Gaussian pyramid, some high-frequency detail information is lost when the image undergoes the convolution and down-sampling operations. To describe this high-frequency information, the Laplacian pyramid (LP) is defined. From each layer image of the Gaussian pyramid, the predicted image obtained by up-sampling the next-higher layer image and applying Gaussian convolution is subtracted, yielding a series of difference images, which are the LP decomposition images.

Exemplarily, the image processing device can restore the corresponding Gaussian pyramid layer by layer, starting from the top layer of the Laplacian pyramid after image fusion, and finally obtain the full-resolution image G0; that is, reconstruction proceeds by interpolation starting from the highest level.
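Exemplarily, the Laplacian decomposition and the top-down reconstruction can be sketched as follows. To keep the sketch short, the reduce and expand operators are passed in as parameters, and the defaults down2 (2×2 block averaging) and up2 (pixel replication) are simplified stand-ins for the Gaussian blur-based down-sampling and up-sampling described above; these stand-ins, the function names, and the assumption of even image dimensions are illustrative only.

```python
def down2(img):
    """Stand-in reduce step: average each 2x2 block (assumes even dimensions)."""
    return [
        [(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4
         for x in range(0, len(img[0]), 2)]
        for y in range(0, len(img), 2)
    ]


def up2(img):
    """Stand-in expand step: replicate each pixel into a 2x2 block."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def subtract(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def laplacian_pyramid(img, levels, down=down2, up=up2):
    """Return [L0, ..., L_{levels-1}, G_levels]: detail layers plus the top image."""
    layers, current = [], img
    for _ in range(levels):
        smaller = down(current)
        layers.append(subtract(current, up(smaller)))  # L_i = G_i - expand(G_{i+1})
        current = smaller
    layers.append(current)
    return layers


def reconstruct(layers, up=up2):
    """Rebuild G0 from the top layer down: G_i = L_i + expand(G_{i+1})."""
    current = layers[-1]
    for detail in reversed(layers[:-1]):
        current = add(detail, up(current))
    return current
```

Because each detail layer stores exactly what the expand step fails to predict, reconstruction with the same expand operator recovers the input up to floating-point rounding; after fusion, the same procedure yields the fused full-resolution image instead.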

In a possible implementation manner, before performing image fusion, the image processing device may first perform skin smoothing on each layer of the image in the second image pyramid.

Exemplarily, after the above step 203a2, the image processing method provided in the embodiment of the present application may also include the following step 203b:

In step 203b, the image processing device performs guided-filter skin smoothing on each layer of the image in the second image pyramid according to a preset radius and a floating-point relative precision.

Exemplarily, the image processing device may also perform guided filter skin smoothing on each of the above N target mask sub-images according to a preset radius and a floating-point relative precision.

Exemplarily, the image processing device can set a reasonable radius (radius) and floating-point relative precision (eps) for each layer of the Laplacian pyramid and perform guided-filter skin smoothing, to alleviate problems such as acne marks, wrinkles, and excessive unevenness on the face.

In this way, the image processing device optimizes the face skin quality in the first image according to the first image pyramid constructed based on the reference face image and the second image pyramid constructed based on the target mask image, and obtains the second image with better skin quality.
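Exemplarily, guided-filter smoothing of a single layer with the two parameters radius and eps can be sketched as follows. The sketch follows the standard guided filter formulation with the layer used as its own guide, in which eps acts as a regularization term: regions whose local variance is small relative to eps are smoothed, while stronger edges are preserved. Using the layer as its own guide, the box-mean border handling, and the function names are assumptions; the present application does not specify them.

```python
def box_mean(img, radius):
    """Mean over a (2*radius+1)^2 window, clipped at the image border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            out[y][x] = sum(img[j][i] for j in ys for i in xs) / (len(ys) * len(xs))
    return out


def _zip_map(fn, *imgs):
    """Apply fn pixel-wise across one or more images of equal size."""
    return [[fn(*px) for px in zip(*rows)] for rows in zip(*imgs)]


def guided_filter_self(img, radius, eps):
    """Self-guided filter: the image serves as both guide and input."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    mean_i = box_mean(img, radius)
    mean_ii = box_mean(_zip_map(lambda v: v * v, img), radius)
    var_i = _zip_map(lambda m2, m: m2 - m * m, mean_ii, mean_i)
    # Per-window linear model q = a * I + b, with a = var / (var + eps).
    a = _zip_map(lambda v: v / (v + eps), var_i)
    b = _zip_map(lambda coef, m: (1 - coef) * m, a, mean_i)
    mean_a, mean_b = box_mean(a, radius), box_mean(b, radius)
    return _zip_map(lambda ma, mb, v: ma * v + mb, mean_a, mean_b, img)
```

A larger radius or a larger eps gives stronger smoothing, which is one reason different values might be chosen for different pyramid layers.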

The image processing method provided in the embodiment of the present application uses a face skin quality migration method based on multi-layer image pyramid fusion to fuse the blemished face skin in the captured image with an image with better skin quality. This can effectively remove poor texture and transitions from the face in the portrait, making the facial skin in the resulting image delicate and clear, and greatly improving the facial skin quality in the resulting image.

It should be noted that the image processing method provided in the embodiment of the present application may be executed by an image processing device, or by a control module in the image processing device for executing the image processing method. In the embodiment of the present application, the image processing method being executed by the image processing device is taken as an example to describe the image processing device provided in the embodiment of the present application.

It should be noted that, in the embodiment of the present application, each image processing method shown in the method drawings above is described by way of example with reference to one drawing of the embodiment of the present application. During specific implementation, the image processing methods shown in those drawings can also be implemented in combination with any other combinable drawings shown in the above embodiments, and details are not repeated here.

FIG. 3 is a schematic diagram of a possible structure of an image processing device provided by an embodiment of the present application. As shown in FIG. 3, the image processing device 300 includes an acquisition module 301 and an image processing module 302. The acquisition module 301 is used to acquire the target mask image of the face area in a first image; the acquisition module 301 is also used to acquire, based on the binarized face mask image of the target mask image, a reference face image in the reference face image set that matches the target mask image; the acquisition module 301 is also used to perform image processing on the reference face image and the target mask image to obtain N reference face sub-images and N target mask sub-images; the image processing module 302 is used to fuse the image of the target area of the reference face sub-image acquired by the acquisition module 301 with the image of the corresponding area of the target mask sub-image acquired by the acquisition module 301, to generate a second image corresponding to the first image; wherein the reference face image set includes a plurality of face mask images that have undergone skin texture processing, and the face skin quality value of the target area is higher than the face skin quality value of the area corresponding to the target area in the target mask image.

Optionally, the device 300 further includes a generation module 303. The acquisition module 301 is also used to acquire N face images that have undergone skin texture processing, where N is a positive integer; the acquisition module 301 is also used to extract the facial features information of each of the N face images and to construct a binarized mask image according to the facial features information, one face image corresponding to one binarized mask image; the generation module 303 is used to generate the set of reference face images based on the N skin-texture-processed face images acquired by the acquisition module 301 and the binarized mask image corresponding to each face image acquired by the acquisition module 301.

Optionally, the device 300 further includes a construction module 304. The construction module 304 is used to construct a first image pyramid based on the reference face image, the first image pyramid including the N reference face sub-images; the construction module 304 is also used to construct a second image pyramid based on the target mask image, the second image pyramid including the N target mask sub-images.

Optionally, the acquisition module 301 is also used to extract the first feature point of the reference face image and the second feature point of the target mask image; the acquisition module 301 is also used to acquire, based on the first feature point, the binarized face mask image of each of the N reference face sub-images and the vertex coordinates of each of the M triangular blocks contained in the binarized face mask image of each image; the acquisition module 301 is also used to acquire, based on the second feature point, the vertex coordinates of each of the K triangular blocks contained in the binarized face mask image of each of the N target mask sub-images; the image processing module 302 is specifically used to fuse, based on the vertex coordinates, the image of the target area of the reference face sub-image with the image of the corresponding area of the target mask sub-image.

Optionally, the device 300 further includes a transformation module 305. The transformation module 305 is configured to perform affine transformation on the vertex coordinates of the M triangular blocks acquired by the acquisition module 301, based on the vertex coordinates of the K triangular blocks acquired by the acquisition module 301; the image processing module 302 is specifically used to fuse the first target area image of the first reference face sub-image in the N affine-transformed reference face sub-images with the second target area image of the first target mask sub-image in the N target mask sub-images, to obtain N processed target mask sub-images; the image processing module 302 is also specifically used to reconstruct the N processed target mask sub-images to generate the second image; wherein the first reference face sub-image is any one of the N reference face sub-images; the first target mask sub-image is the target mask sub-image, among the N target mask sub-images, that corresponds to the first reference face sub-image; the first target area image is the image of the first target area of the first reference face sub-image, and the first target area is the image area corresponding to the second target area of the first target mask sub-image.

Optionally, the image processing module 302 is further configured to perform guided filter skin smoothing on each layer of the image of the second image pyramid constructed by the construction module 304 according to the preset radius and relative precision of the floating point.

The image processing apparatus in the embodiment of the present application may be a device, or may be a component, an integrated circuit, or a chip in a terminal. The device may be a mobile electronic device or a non-mobile electronic device. Exemplarily, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a handheld computer, a vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA), etc.; the non-mobile electronic device may be a server, network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, or a self-service machine, etc., which are not specifically limited in the embodiments of the present application.

The image processing device in the embodiment of the present application may be a device with an operating system. The operating system may be an Android operating system, an iOS operating system, or other possible operating systems, which are not specifically limited in this embodiment of the present application.

The image processing device provided in the embodiment of the present application can implement various processes implemented by the image processing device in the method embodiments shown in FIG. 1 to FIG. 2 , and details are not repeated here to avoid repetition.

For the beneficial effects of the various implementations in this embodiment, refer to the beneficial effects of the corresponding implementations in the foregoing method embodiments. To avoid repetition, details are not repeated here.

The image processing device provided in the embodiment of the present application uses the image processing method to fuse the blemished face skin in the captured image with an image with better skin quality, which can effectively remove the bad texture and transition of the human face, making the facial skin in the resulting image delicate and clear, and greatly improving the facial skin quality in the resulting image.

Optionally, as shown in FIG. 4, the embodiment of the present application further provides an electronic device M00, including a processor M01, a memory M02, and programs or instructions stored in the memory M02 and operable on the processor M01. When the program or instruction is executed by the processor M01, each process of the above-mentioned image processing method embodiment is implemented and the same technical effect can be achieved. To avoid repetition, details are not repeated here.

It should be noted that the electronic devices in the embodiments of the present application include the above-mentioned mobile electronic devices and non-mobile electronic devices.

FIG. 5 is a schematic diagram of a hardware structure of an electronic device implementing various embodiments of the present application.

The electronic device 100 includes but is not limited to: a radio frequency unit 101, a network module 102, an audio output unit 103, an input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, a processor 110, and other components.

Those skilled in the art can understand that the electronic device 100 can also include a power supply (such as a battery) for supplying power to the various components, and the power supply can be logically connected to the processor 110 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system. The structure of the electronic device shown in FIG. 5 does not constitute a limitation on the electronic device. The electronic device may include more or fewer components than shown in the figure, or combine certain components, or arrange the components differently, and details are not repeated here.

Wherein, the input unit 104 is used to acquire the target mask image of the face area in the first image; the processor 110 is used to acquire, based on the binarized face mask image of the target mask image, a reference face image in the reference face image set that matches the target mask image; the processor 110 is also used to perform image processing on the reference face image and the target mask image to obtain N reference face sub-images and N target mask sub-images; the processor 110 is also used to fuse the image of the target area of the acquired reference face sub-image with the image of the corresponding area of the acquired target mask sub-image, to generate a second image corresponding to the first image; wherein the reference face image set includes a plurality of face mask images that have undergone skin texture processing, and the face skin quality value of the target area is higher than the face skin quality value of the area corresponding to the target area in the target mask image.

In this way, after the first image containing the face is obtained, the target mask image of the face area in the first image is acquired, and a reference face image matching the target mask image is acquired from the set of reference face images based on the binarized face mask image of the target mask image. After that, the image of the target area of the reference face image is fused with the image of the corresponding area of the target mask image to remove the bad texture and excess of the face and restore a delicate and clear skin texture, thereby obtaining a second image with better skin quality and greatly improving the facial skin quality in the resulting image.

Optionally, the input unit 104 is also used to acquire N face images that have undergone skin texture processing, where N is a positive integer; the processor 110 is also used to extract the facial features information of each of the N face images and to construct a binarized mask image according to the facial features information, one face image corresponding to one binarized mask image; the processor 110 is used to generate the set of reference face images based on the N skin-texture-processed face images acquired by the input unit 104 and the binarized mask image corresponding to each of the acquired face images.

Optionally, the processor 110 is configured to construct a first image pyramid based on the reference face image, the first image pyramid including the N reference face sub-images; the processor 110 is also configured to construct a second image pyramid based on the target mask image, the second image pyramid including the N target mask sub-images.

Optionally, the processor 110 is also used to extract the first feature point of the reference face image and the second feature point of the target mask image; the processor 110 is also used to acquire, based on the first feature point, the binarized face mask image of each of the N reference face sub-images and the vertex coordinates of each of the M triangular blocks contained in the binarized face mask image of each image; the processor 110 is also used to acquire, based on the second feature point, the vertex coordinates of each of the K triangular blocks contained in the binarized face mask image of each of the N target mask sub-images; the processor 110 is specifically used to fuse, based on the vertex coordinates, the image of the target area of the reference face sub-image with the image of the corresponding area of the target mask sub-image.

Optionally, the processor 110 is configured to perform affine transformation on the acquired vertex coordinates of the M triangular blocks based on the acquired vertex coordinates of the K triangular blocks; the processor 110 is specifically configured to fuse the first target area image of the first reference face sub-image in the N affine-transformed reference face sub-images with the second target area image of the first target mask sub-image in the N target mask sub-images, to obtain N processed target mask sub-images; the processor 110 is also specifically configured to reconstruct the N processed target mask sub-images to generate the second image; wherein the first reference face sub-image is any one of the N reference face sub-images; the first target mask sub-image is the target mask sub-image, among the N target mask sub-images, that corresponds to the first reference face sub-image; the first target area image is the image of the first target area of the first reference face sub-image, and the first target area is the image area corresponding to the second target area of the first target mask sub-image.

In this way, the image processing device can perform affine transformation on the N triangulated reference face images based on the image feature points, so as to obtain an image that resembles the person in the target reference face image, and then perform skin texture transfer to obtain the second image with better skin quality.

Optionally, the processor 110 is further configured to perform guided-filter skin smoothing on each layer image of the constructed second image pyramid according to a preset radius and a floating-point relative precision.

The electronic device provided in the embodiment of the present application uses the image processing method to fuse the blemished face skin in the captured image with an image with better skin quality, which can effectively remove the bad texture and transition of the face in the portrait, making the facial skin in the resulting image delicate and clear, and greatly improving the facial skin quality in the resulting image.

It should be understood that, in the embodiment of the present application, the input unit 104 may include a graphics processing unit (GPU) 1041 and a microphone 1042, and the graphics processing unit 1041 processes image data of still pictures or video obtained by an image capture device (such as a camera). The display unit 106 may include a display panel 1061, and the display panel 1061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 107 includes a touch panel 1071 and other input devices 1072. The touch panel 1071 is also called a touch screen. The touch panel 1071 may include two parts, a touch detection device and a touch controller. Other input devices 1072 may include, but are not limited to, physical keyboards, function keys (such as volume control keys, switch keys, etc.), trackballs, mice, and joysticks, which will not be repeated here. The memory 109 may be used to store software programs as well as various data, including but not limited to application programs and operating systems. The processor 110 may integrate an application processor and a modem processor, wherein the application processor mainly processes operating systems, user interfaces, and application programs, and the modem processor mainly processes wireless communications. It can be understood that the foregoing modem processor may not be integrated into the processor 110.

The embodiment of the present application also provides a readable storage medium. The readable storage medium stores a program or an instruction, and when the program or instruction is executed by a processor, each process of the above-mentioned image processing method embodiment is implemented and the same technical effect can be achieved. To avoid repetition, details are not repeated here.

Wherein, the processor is the processor in the electronic device described in the above embodiments. The readable storage medium includes computer readable storage medium, such as computer read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disk, etc.

The embodiment of the present application further provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement each process of the above image processing method embodiment, and the same technical effect can be achieved. To avoid repetition, details are not repeated here.

It should be understood that the chip mentioned in the embodiments of the present application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.

It should be noted that, in this document, the terms "comprising", "including", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus comprising a set of elements includes not only those elements but also other elements not expressly listed, or elements inherent in the process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising a ..." does not preclude the presence of additional identical elements in the process, method, article, or apparatus comprising that element. In addition, it should be pointed out that the scope of the methods and devices in the embodiments of the present application is not limited to performing functions in the order shown or discussed, and may also include performing functions in a substantially simultaneous manner or in the reverse order, depending on the functions involved. For example, the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on such an understanding, the technical solution of the present application can be embodied in the form of a computer software product, which is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions to make an electronic device (which may be a mobile phone, a computer, a server, or a network device, etc.) execute the method described in each embodiment of the present application.

The embodiments of the present application have been described above in conjunction with the accompanying drawings, but the present application is not limited to the above-mentioned specific implementations. The above-mentioned specific implementations are only illustrative and not restrictive. Those of ordinary skill in the art, under the inspiration of this application and without departing from the purpose of this application and the scope of protection of the claims, can also make many other forms, all of which fall within the protection of this application.

Claims

An image processing method, the method comprising:

Obtain the target mask image of the face area in the first image;

Based on the binarized face mask image of the target mask image, obtain a reference face image matching the target mask image in the set of reference face images;

Perform image processing on the reference face image and the target mask image to obtain N reference face sub-images and N target mask sub-images;

Fusing the image of the target area of the reference face sub-image with the image of the corresponding area of the target mask sub-image to generate a second image corresponding to the first image;

Wherein, the reference face image set includes a plurality of face mask images that have undergone skin texture processing; the face skin quality value of the target area is greater than the face skin quality value of the area in the target mask image corresponding to the target area.

The method according to claim 1, wherein, before the acquiring, based on the binarized face mask image of the target mask image, of a reference face image matching the target mask image in the set of reference face images, the method further comprises:

Obtain N face images processed by skin quality, where N is a positive integer;

extracting facial features information of each facial image in the N facial images, and constructing a binarized mask image according to the facial features information; one facial image corresponds to a binarized mask image;

The set of reference face images is generated based on the N face images that have undergone skin texture processing and the binarized mask image corresponding to each face image.

The method according to claim 1, wherein the performing image processing on the reference face image and the target mask image to obtain N reference face sub-images and N target mask sub-images comprises:

Based on the reference face image, construct the first image pyramid, the first image pyramid includes N reference face sub-images;

Based on the target mask image, construct a second image pyramid, where the second image pyramid includes N target mask sub-images.

The method according to claim 1, wherein the fusing the image of the target area of the reference face sub-image with the image of the corresponding area of the target mask sub-image to generate a second image corresponding to the first image comprises:

extracting the first feature point of the reference face image and the second feature point of the target mask image;

Obtain, based on the first feature point, the binarized face mask image of each of the N reference face sub-images and the vertex coordinates of each of the M triangular blocks contained in the binarized face mask image of each image;

Obtain the vertex coordinates of each triangle block in the K triangle blocks contained in the binarized face mask image of each image of the N target mask sub-images based on the second feature point;

Based on the vertex coordinates, the image of the target area of the reference face sub-image is fused with the image of the corresponding area of the target mask sub-image.

The method according to claim 4, wherein the fusing, based on the vertex coordinates, the image of the target area of the reference face sub-image with the image of the corresponding area of the target mask sub-image comprises:

Based on the vertex coordinates of the K triangular blocks, affine transformation is performed on the vertex coordinates of the M triangular blocks;

performing image fusion on the first target area image of a first reference face sub-image among the N affine-transformed reference face sub-images and the second target area image of a first target mask sub-image among the N target mask sub-images, to obtain N processed target mask sub-images;

Reconstructing the N processed target mask sub-images to generate the second image;

Wherein, the first reference face sub-image is any one of the N reference face sub-images; the first target mask sub-image is the target mask sub-image, among the N target mask sub-images, that corresponds to the first reference face sub-image; the first target area image is an image of a first target area of the first reference face sub-image, and the first target area is the image area corresponding to the second target area of the first target mask sub-image.
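As a non-limiting illustration, the affine transformation between triangle blocks may be sketched as follows. The sketch assumes that each of the M triangle blocks of a reference face sub-image is paired with one of the K triangle blocks of the corresponding target mask sub-image; the claim does not spell out this pairing. The three vertex pairs of one triangle pair determine a unique affine map, which is solved here with Cramer's rule. All function names and coordinates are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 linear system a @ x = b with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("degenerate triangle: vertices are collinear")
    solution = []
    for col in range(3):
        replaced = [row[:col] + [b[i]] + row[col + 1:] for i, row in enumerate(a)]
        solution.append(det3(replaced) / d)
    return solution


def triangle_affine(src, dst):
    """Return (row_x, row_y) with x' = row_x . (x, y, 1) and y' = row_y . (x, y, 1)."""
    a = [[x, y, 1.0] for x, y in src]
    row_x = solve3(a, [p[0] for p in dst])
    row_y = solve3(a, [p[1] for p in dst])
    return row_x, row_y


def apply_affine(transform, point):
    """Map one (x, y) point through the affine transform."""
    row_x, row_y = transform
    x, y = point
    return (row_x[0] * x + row_x[1] * y + row_x[2],
            row_y[0] * x + row_y[1] * y + row_y[2])


# Assumed vertex coordinates of one paired triangle block (illustration only).
reference_triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
target_triangle = [(2.0, 3.0), (4.0, 3.0), (2.0, 6.0)]

transform = triangle_affine(reference_triangle, target_triangle)
moved_vertices = [apply_affine(transform, v) for v in reference_triangle]
```

After the transformation, the vertices of the reference triangle coincide with those of the target triangle, so pixels inside the reference triangle can be resampled into the target triangle before fusion.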
The method according to claim 3, wherein, after the second image pyramid is constructed based on the target mask image, the method further comprises:

performing guided-filter skin smoothing on each layer image of the second image pyramid according to a preset radius and a floating-point relative precision.
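As a non-limiting illustration, guided filtering of one pyramid layer may be sketched as follows. The sketch assumes that the preset radius is the half-width of the square filter window, that the floating-point relative precision plays the role of the regularization term usually written `eps` in guided filtering, and that each layer serves as its own guidance image. None of these choices is stated in the claim. The helper names `box_mean` and `combine` are hypothetical.

```python
def box_mean(image, radius):
    """Mean over a (2*radius+1) square window, clipped at the image border."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            rows = range(max(0, r - radius), min(height, r + radius + 1))
            cols = range(max(0, c - radius), min(width, c + radius + 1))
            total = sum(image[i][j] for i in rows for j in cols)
            out[r][c] = total / (len(rows) * len(cols))
    return out


def combine(f, *images):
    """Apply f element-wise across same-sized images."""
    return [[f(*vals) for vals in zip(*rows)] for rows in zip(*images)]


def guided_filter(guide, src, radius, eps):
    """Edge-preserving smoothing of src, steered by guide."""
    mean_i = box_mean(guide, radius)
    mean_p = box_mean(src, radius)
    corr_ii = box_mean(combine(lambda i, j: i * j, guide, guide), radius)
    corr_ip = box_mean(combine(lambda i, p: i * p, guide, src), radius)
    # Per-window linear model q = a * I + b; eps damps a in low-variance windows.
    a = combine(lambda cip, mi, mp, cii: (cip - mi * mp) / (cii - mi * mi + eps),
                corr_ip, mean_i, mean_p, corr_ii)
    b = combine(lambda mp, av, mi: mp - av * mi, mean_p, a, mean_i)
    return combine(lambda ma, mb, i: ma * i + mb,
                   box_mean(a, radius), box_mean(b, radius), guide)


# Assumed 6x6 layer with fine checkerboard texture (illustration only).
layer = [[0.4 if (r + c) % 2 == 0 else 0.6 for c in range(6)] for r in range(6)]
smoothed = guided_filter(layer, layer, radius=1, eps=0.1)
```

A larger `eps` relative to the local variance pushes each window toward its mean and so smooths more strongly, while a small `eps` leaves high-variance structures such as edges largely intact.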
An image processing device, the device comprising: an acquisition module and an image processing module;

The acquisition module is configured to acquire the target mask image of the face area in the first image;

The acquisition module is further configured to acquire, based on the binarized face mask image of the target mask image, a reference face image matching the target mask image in the set of reference face images;

The acquisition module is further configured to perform image processing on the reference face image and the target mask image to obtain N reference face sub-images and N target mask sub-images;

The image processing module is configured to fuse the image of the target area of the reference face sub-image acquired by the acquisition module with the image of the corresponding area of the target mask sub-image acquired by the acquisition module, to generate a second image corresponding to the first image;

Wherein, the reference face image set includes a plurality of face mask images that have undergone skin texture processing; the face skin texture value of the target area is greater than the face skin texture value of the area corresponding to the target area in the target mask image.
The device according to claim 7, wherein the device further comprises: a generating module;

The acquisition module is further configured to acquire N face images that have undergone skin texture processing, where N is a positive integer;

The acquisition module is further configured to extract facial feature information of each of the N face images, and construct a binarized mask image according to the facial feature information, wherein one face image corresponds to one binarized mask image;

The generating module is configured to generate the reference face image set based on the N skin-texture-processed face images acquired by the acquisition module and the binarized mask image corresponding to each face image acquired by the acquisition module.
The device according to claim 7, wherein the device further comprises: a construction module;

The construction module is configured to construct a first image pyramid based on the reference face image, and the first image pyramid includes N reference face sub-images;

The construction module is further configured to construct a second image pyramid based on the target mask image, and the second image pyramid includes N target mask sub-images.
The device according to claim 7, wherein,

The acquisition module is also used to extract the first feature point of the reference face image and the second feature point of the target mask image;

The acquisition module is further configured to acquire, based on the first feature point, the binarized face mask image of each of the N reference face sub-images, and the vertex coordinates of each of the M triangle blocks contained in the binarized face mask image of each image;

The acquisition module is further configured to acquire, based on the second feature point, the vertex coordinates of each of the K triangle blocks contained in the binarized face mask image of each of the N target mask sub-images;

The image processing module is specifically configured to fuse the image of the target area of the reference face sub-image with the image of the corresponding area of the target mask sub-image based on the vertex coordinates.
The device according to claim 10, wherein the device further comprises: a transformation module;

The transformation module is used to perform affine transformation on the vertex coordinates of the M triangle blocks obtained by the acquisition module based on the vertex coordinates of the K triangle blocks obtained by the acquisition module;

The image processing module is specifically configured to perform image fusion on the first target area image of a first reference face sub-image among the N affine-transformed reference face sub-images and the second target area image of a first target mask sub-image among the N target mask sub-images, to obtain N processed target mask sub-images;

The image processing module is further configured to reconstruct the N processed target mask sub-images to generate the second image;

Wherein, the first reference face sub-image is any one of the N reference face sub-images; the first target mask sub-image is the target mask sub-image, among the N target mask sub-images, that corresponds to the first reference face sub-image; the first target area image is an image of a first target area of the first reference face sub-image, and the first target area is the image area corresponding to the second target area of the first target mask sub-image.
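As a non-limiting illustration, the per-level fusion and the subsequent reconstruction may be sketched as follows. The sketch assumes Laplacian-style pyramids (detail layers plus one coarse layer), upsampling by pixel replication, and a hard region mask that selects reference pixels inside the target area and target pixels elsewhere. The claims do not fix any of these choices, and the affine alignment step is omitted for brevity. All names and pixel values are hypothetical.

```python
def downsample(image):
    """Halve an image (list of rows) by averaging 2x2 blocks."""
    return [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1]
              + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(len(image[0]) // 2)]
            for r in range(len(image) // 2)]


def upsample(image):
    """Double the size of an image by pixel replication."""
    out = []
    for row in image:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def laplacian_pyramid(image, n):
    """Return n levels: n-1 detail layers followed by the coarsest image."""
    levels, current = [], image
    for _ in range(n - 1):
        smaller = downsample(current)
        enlarged = upsample(smaller)
        levels.append([[a - b for a, b in zip(ra, rb)]
                       for ra, rb in zip(current, enlarged)])
        current = smaller
    levels.append(current)
    return levels


def region_pyramid(region, n):
    """Binary region mask at every pyramid level (majority vote per 2x2 block)."""
    levels = [region]
    for _ in range(n - 1):
        levels.append([[1 if v >= 0.5 else 0 for v in row]
                       for row in downsample(levels[-1])])
    return levels


def fuse_level(reference, target, region):
    """Take reference pixels where region is 1 and target pixels elsewhere."""
    return [[rf if m else tg for rf, tg, m in zip(r_row, t_row, m_row)]
            for r_row, t_row, m_row in zip(reference, target, region)]


def reconstruct(levels):
    """Collapse a Laplacian-style pyramid back into a single image."""
    image = levels[-1]
    for detail in reversed(levels[:-1]):
        enlarged = upsample(image)
        image = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(detail, enlarged)]
    return image


# Assumed 4x4 data: a textured reference, a flat target, a 2x2 target area.
reference = [[float((r * 4 + c) % 5) for c in range(4)] for r in range(4)]
target = [[0.5] * 4 for _ in range(4)]
region = [[1 if r < 2 and c < 2 else 0 for c in range(4)] for r in range(4)]

n = 2
processed = [fuse_level(rf, tg, rg)
             for rf, tg, rg in zip(laplacian_pyramid(reference, n),
                                   laplacian_pyramid(target, n),
                                   region_pyramid(region, n))]
second_image = reconstruct(processed)
```

In this toy case the reconstructed image carries the reference texture inside the target area and leaves the target unchanged elsewhere.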
The device according to claim 9, wherein,

The image processing module is further configured to perform guided-filter skin smoothing on each layer image of the second image pyramid constructed by the construction module according to a preset radius and a floating-point relative precision.
An electronic device, comprising a processor, a memory, and a program or instruction stored on the memory and executable on the processor, wherein when the program or instruction is executed by the processor, the steps of the image processing method according to any one of claims 1 to 6 are implemented.
A readable storage medium, wherein a program or instruction is stored on the readable storage medium, and the steps of the image processing method according to any one of claims 1 to 6 are implemented when the program or instruction is executed by a processor.
A computer program product, the program product is executed by at least one processor to implement the image processing method according to any one of claims 1 to 6.
An electronic device, characterized in that the electronic device is configured to execute the image processing method according to any one of claims 1 to 6.
A chip, wherein the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or instruction to implement the image processing method according to any one of claims 1 to 6.