CN113052783A - Face image fusion method based on face key points - Google Patents

Face image fusion method based on face key points

Info

Publication number
CN113052783A
Authority
CN
China
Prior art keywords
face
key points
points
steps
image
Prior art date
Legal status
Pending
Application number
CN201911382548.6A
Other languages
Chinese (zh)
Inventor
张赟
肖敬
张王晟
Current Assignee
Hangzhou Shenhui Intelligent Technology Co ltd
Original Assignee
Hangzhou Shenhui Intelligent Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Shenhui Intelligent Technology Co ltd filed Critical Hangzhou Shenhui Intelligent Technology Co ltd
Priority to CN201911382548.6A priority Critical patent/CN113052783A/en
Publication of CN113052783A publication Critical patent/CN113052783A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06T5/90
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Abstract

The invention discloses a face image fusion method based on face key points, which comprises the following steps: (a) acquiring the key points of a face; (b) drawing a contour map of non-empty pixel points from the eye, nose and lip coordinate points given by the acquired face key points, and applying an OR operation between the contour map and the original image to obtain a mask map, the mask map being the face region without any other background; (c) once the mask of the input reference face has been obtained, aligning the target face to the original reference face; (d) when the two images input by the user differ strongly in color, calibrating the color of the target face to be close to that of the reference face; and (e) after color correction and weighted averaging, fusing the result into the face of the reference image. The method addresses the determination of face key points, the alignment of face orientation, the elimination of color difference between the reference face and the target face, and the feathering of the face blending edge.

Description

Face image fusion method based on face key points
Technical Field
The invention relates to the field of image fusion, in particular to a face image fusion method based on face key points.
Background
With the development of various technologies, face processing techniques for images have become increasingly widespread. Mainstream camera applications offer functions such as beautification, stickers, hairstyle changes and face fusion. The face image fusion technique used for fusion blends the appearance features of a reference face into a target face, thereby generating new content and styles in an image or video. The technique has wide application in the cultural and creative industries, such as film and television production, digital entertainment, social media and personal image editing. A common implementation uses Photoshop, but that approach requires the operator to have some grounding in image processing, and each image takes considerable time to process, making it difficult to meet the demands of efficient, batch processing in the big-data era.
Therefore, some solutions have been proposed which detect the face key points of a reference face and a target face; extract the reference face and the target face using a mask and face alignment; correct the color according to the color difference between the two faces; and finally, after weighted summation of the faces, Poisson-clone the result back onto the original image to obtain the final fused face image.
However, face fusion in conventional image processing is mainly implemented with masks and boundary feathering, which achieves good results only when the appearance difference between the target face and the reference face is small. When the two faces differ strongly in illumination, the method introduces visual flaws in the fusion area. Deep learning schemes have their own drawbacks: the models generalize poorly to new backgrounds, training data are usually insufficient and of limited precision, training takes a long time, and the output resolution is low, so they cannot meet practical requirements.
Disclosure of Invention
The invention aims to solve the technical problem of providing a face image fusion method based on face key points, which addresses the determination of face key points, the alignment of face orientation, the elimination of color difference between the reference face and the target face, and the feathering of the face blending edge.
In order to solve the technical problems, the technical scheme of the invention is as follows:
a face image fusion method based on face key points is characterized by comprising the following steps:
(a) acquiring key points of a human face;
(b) drawing a contour map of non-empty pixel points from the eye, nose and lip coordinate points given by the acquired face key points, and applying an OR operation between the contour map and the original image to obtain a mask map;
(c) once the mask of the input reference face has been obtained, aligning the target face to the original reference face;
(d) when two pictures input by a user have a large color difference, the color of a target face needs to be calibrated to be close to that of a reference face;
(e) after color correction and weighted averaging, fusing the result into the face of the reference picture.
Preferably, step (a) further comprises the steps of:
(a1) firstly, detecting the face of an input picture by using a convolutional neural network;
(a2) estimating the positions of the face key points from a sparse subset of pixel intensities by using a cascaded regression tree algorithm.
Preferably, the step (a2) further comprises the following steps:
(a21) firstly, randomly selecting two points in a face frame identified by a convolutional neural network;
(a22) calculating pixel values of two points of each picture in the training set to obtain pixel differences of the two points of each picture so as to obtain a splitting threshold of the tree;
(a23) splitting the tree left and right respectively according to a threshold value, and finally dividing all pictures into a left part and a right part;
(a24) after repeating the steps (a21) - (a23) for a plurality of times, storing the result with the best split;
(a25) repeating the steps (a21) - (a24) for each point of the face frame, so as to obtain 68 key points.
Preferably, step (c) further comprises the steps of:
(c1) aligning all samples to the origin;
(c2) for the multiple samples in the training set, selecting one sample as the reference sample and rotating, scaling and translating the other samples to align with it; assuming the selected sample is X1, this yields the set (X1, X2, X3, …);
(c3) calculating the average shape value X0 of the aligned shapes;
(c4) rotating, scaling and translating the average shape value X0 to align with sample X1;
(c5) rotating, scaling and translating the samples (X2, X3, …) other than X1 to align with the adjusted average shape value;
(c6) if the average shape value X0 converges, the shape alignment is complete and the algorithm stops, otherwise loop to step (c 2).
Preferably, step (c4) can alternatively be configured as: normalizing the average shape value X0 so that |X0| = 1.
Preferably, step (e) further comprises the steps of:
(e1) firstly, solving a gradient field of a target face to obtain v;
(e2) then solving a gradient field for the background of the reference face to obtain s;
(e3) covering v on s to obtain a gradient field of an image to be reconstructed;
(e4) solving the divergence of the fused image;
(e5) solving the Poisson reconstruction equation Ax = b, where b is the divergence obtained in step (e4) and A is the coefficient matrix obtained from the Poisson equation of the image.
By adopting the technical scheme, the face masks of the respective images are obtained by taking the face characteristic points as the reference, and then the positions of the reference face and the target face are aligned, so that the fusion of the two faces can be better realized; when the target face and the reference face have large illumination difference, the scheme can solve the problem of visual inconsistency caused by face fusion by performing color correction on the target face; finally, the target face is better attached to the background image of the reference face by using the Poisson fusion technology, so that the natural face fusion effect is realized.
Drawings
FIG. 1 is a flow chart of a face image fusion method based on face key points in the invention;
FIG. 2 is a diagram of the recognition effect of key points of a human face in the present invention;
FIG. 3 is a diagram of a face recognition model training process in the present invention;
FIG. 4 is a diagram of a face recognition model prediction process in the present invention;
FIG. 5 is a mask diagram obtained according to key points of a human face in the present invention;
FIG. 6 is a flow chart of Poisson fusion in the present invention.
Detailed Description
The following further describes embodiments of the present invention with reference to the drawings. It should be noted that the description of the embodiments is provided to help understanding of the present invention, but the present invention is not limited thereto. In addition, the technical features related to the respective embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The invention provides the following embodiments, and relates to a face image fusion method based on face key points, which comprises the following steps:
step (a): obtaining face key points
In this embodiment, a face keypoint detection technique is used to obtain face keypoints. The face key point detection technology comprises the following steps:
in step (a1), a face of an input picture is first detected using a Convolutional Neural Network (CNN).
As shown in fig. 1, this embodiment provides a flowchart of the scheme: two images are input and analyzed by the same deep-learning face key point detection algorithm to obtain their respective face key point coordinates, and a mask is generated from the key points. Face alignment, color correction and weighted summation are performed on the two masks, and finally the resulting fusion mask is Poisson-cloned back onto the reference face.
Step (a2): estimating the positions of the face key points from a sparse subset of pixel intensities (pixel gray values) using a cascaded regression tree (ERT) algorithm.
As shown in fig. 2, the algorithm of the present embodiment provides 68 coordinate points as key points, which represent the eyebrow, eye, nose, mouth, etc.
The face key point detection technique further comprises a face recognition model training stage, shown in fig. 3, and a prediction stage, shown in fig. 4. The cascaded regression tree detects the positions of the face key points through the following steps (a runnable sketch follows the list):
step (a21), firstly, randomly selecting two points in a face frame identified by a convolutional neural network;
step (a22), calculating pixel values of two points of each picture in the training set to obtain pixel differences of the two points of each picture so as to obtain a splitting threshold of the tree;
step (a23), splitting the tree left and right according to a threshold value, and finally dividing all pictures into left and right parts;
step (a24), after repeating steps (a21) - (a23) for a plurality of times, storing the result with the best split;
step (a25), repeating steps (a21)-(a24) for each point of the face frame, thereby obtaining the 68 key points.
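The patent does not name an implementation for step (a), but the pipeline maps onto widely available tooling. Below is a minimal Python sketch assuming the dlib library and its public 68-point landmark model (the model file name is an assumption); dlib's shape_predictor implements the cascaded-regression-tree (ERT) approach described above, while its default face detector is HOG-based and merely stands in for the convolutional network of step (a1).

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def face_keypoints(image_path):
    """Return the 68 (x, y) landmarks of the first detected face, or None."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = detector(gray, 1)            # step (a1): locate the face box
    if not faces:
        return None
    shape = predictor(gray, faces[0])    # step (a2): ERT landmark regression
    return [(shape.part(i).x, shape.part(i).y) for i in range(68)]
```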
As shown in fig. 5, step (a) is followed by step (b): drawing a contour map of non-empty pixel points from the eye, nose and lip coordinate points given by the acquired face key points, and applying an OR operation between the contour map and the original image to obtain a mask map. The mask map refers to the face region without any other background.
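As an illustration of step (b), the sketch below uses OpenCV. The patent draws contours from the eye, nose and lip points and combines them with the original image; approximating this with the convex hull of all 68 landmarks and a bitwise operation is an assumption, not the patent's exact construction.

```python
import cv2
import numpy as np

def face_mask(img, keypoints):
    """Fill the convex hull of the landmarks to get a binary mask, then
    keep only the masked pixels (the face region without background)."""
    mask = np.zeros(img.shape[:2], dtype=np.uint8)
    hull = cv2.convexHull(np.array(keypoints, dtype=np.int32))
    cv2.fillConvexPoly(mask, hull, 255)
    face_only = cv2.bitwise_and(img, img, mask=mask)
    return mask, face_only
```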
Step (b) is followed by step (c): once the mask of the input reference face has been obtained, the target face needs to be aligned to the original reference face.
The alignment method of this scheme uses Procrustes analysis, whose basic idea is to minimize the sum of distances from every shape to the average shape, i.e. the minimization formula:
$$\min \sum_{i=1}^{N} \left\| X_i - \bar{X} \right\|^2$$
the specific alignment steps in step (c) are as follows:
(c1) aligning all samples to the origin (subtracting the mean of its x, y coordinates from each sample);
(c2) for the multiple samples in the training set, selecting one sample as the reference sample and rotating, scaling and translating the other samples to align with it; assuming the selected sample is X1, this yields the set (X1, X2, X3, …);
(c3) calculating the average shape value X0 of the aligned shapes;
(c4) rotating, scaling and translating the average shape value X0 to align with sample X1 (or normalizing the average shape value X0 such that |X0| = 1);
(c5) rotating, scaling and translating the samples (X2, X3, …) other than X1 to align with the adjusted average shape value;
(c6) if the average shape value X0 converges, the shape alignment is complete and the algorithm stops, otherwise loop to step (c 2).
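A minimal numpy sketch of the core operation of this loop: a similarity fit (rotation, scale, translation) of one landmark set onto another via the standard orthogonal-Procrustes solution, which is an assumption about the fitting method; the outer loop over samples and the convergence test of steps (c2)-(c6) are omitted.

```python
import numpy as np

def procrustes_align(target, reference):
    """Rotate, scale and translate `target` (n x 2) onto `reference` (n x 2),
    approximately minimizing the sum of squared point distances."""
    t_mean, r_mean = target.mean(axis=0), reference.mean(axis=0)
    t0, r0 = target - t_mean, reference - r_mean       # (c1): move to origin
    t_norm, r_norm = np.linalg.norm(t0), np.linalg.norm(r0)
    u, _, vt = np.linalg.svd((t0 / t_norm).T @ (r0 / r_norm))
    rotation = u @ vt                                  # optimal rotation (SVD)
    scale = r_norm / t_norm                            # match overall size
    return scale * (target - t_mean) @ rotation + r_mean
```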
Step (c) is followed by step (d): when there is a large color difference between the two pictures input by the user, the color of the target face needs to be calibrated to be close to that of the reference face. For color correction, this scheme uses RGB-scaling color correction.
Specifically, RGB-scaling color correction is implemented by constructing a 3 × 3 diagonal matrix, where R', G' and B' are the color-balanced red, green and blue components of a pixel in the image, and R, G and B are the red, green and blue components of that pixel before color balancing.
According to the above, the color correction method is as follows:
$$\begin{bmatrix} R' \\ G' \\ B' \end{bmatrix} = \begin{bmatrix} k_R & 0 & 0 \\ 0 & k_G & 0 \\ 0 & 0 & k_B \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$
However, rather than using one constant scaling factor for the whole image, each pixel can have its own local scaling factor. Therefore, only a suitable region centered on the two pupils of the target face is selected for color correction, so as to exclude interference from outside the facial area.
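A sketch of the diagonal-matrix correction above, assuming the per-channel gains k_R, k_G, k_B are estimated from channel means over the pupil-centered face regions; the gain estimator is an assumption, since the patent only specifies the diagonal scaling form.

```python
import numpy as np

def rgb_scale_correct(target_region, reference_region):
    """Apply the 3x3 diagonal scaling: per-channel gains chosen so the
    target region's mean RGB matches the reference region's mean RGB."""
    t = target_region.astype(np.float64)
    r = reference_region.astype(np.float64)
    gains = r.reshape(-1, 3).mean(axis=0) / (t.reshape(-1, 3).mean(axis=0) + 1e-8)
    return np.clip(t * gains, 0, 255).astype(np.uint8)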
Step (d) is followed by step (e): after color correction and weighted averaging, the result can be fused into the face of the reference picture.
As shown in fig. 6, a specific fusion embodiment proceeds as follows (an OpenCV sketch follows the list):
(e1) firstly, solving a gradient field of a target face to obtain v;
(e2) then solving a gradient field for the background of the reference face to obtain s;
(e3) covering v on s to obtain a gradient field of an image to be reconstructed;
(e4) solving the divergence of the fused image;
(e5) solving the Poisson reconstruction equation Ax = b, where b is the divergence obtained in step (e4) and A is the coefficient matrix obtained from the Poisson equation of the image.
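Steps (e1)-(e5) are the standard Poisson image-editing pipeline. OpenCV's seamlessClone performs the same gradient-field construction and Ax = b solve internally, so a hedged sketch (the input names are assumptions) is:

```python
import cv2
import numpy as np

def poisson_fuse(corrected_face, reference_img, mask):
    """Poisson-clone the color-corrected target face into the reference image;
    cv2.seamlessClone builds the gradient field and solves Ax = b internally."""
    ys, xs = np.nonzero(mask)
    center = (int(xs.mean()), int(ys.mean()))   # paste at the mask centroid
    return cv2.seamlessClone(corrected_face, reference_img, mask,
                             center, cv2.NORMAL_CLONE)
```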
According to the scheme, the face characteristic points of a target face and a reference face are obtained according to a deep neural network face key point detection algorithm; the face masks of the respective images are obtained by taking the face characteristic points as the reference, and then the positions of the reference face and the target face are aligned, so that the fusion of the two faces can be better realized; when the target face and the reference face have large illumination difference, the scheme can solve the problem of visual inconsistency caused by face fusion by performing color correction on the target face; finally, the Poisson fusion technology is used for better fitting the target face to the background image of the reference face, so that the natural face fusion effect is achieved.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the described embodiments. It will be apparent to those skilled in the art that various changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, and the scope of protection is still within the scope of the invention.

Claims (7)

1. A face image fusion method based on face key points is characterized by comprising the following steps:
(a) acquiring key points of a human face;
(b) drawing a contour map of non-empty pixel points from the eye, nose and lip coordinate points given by the acquired face key points, and then applying an OR operation between the contour map and the original image to obtain a mask map;
(c) once the mask of the input reference face has been obtained, aligning the target face to the original reference face;
(d) when two images input by a user have a large color difference, the color of a target face needs to be calibrated to be close to that of a reference face;
(e) after color correction and weighted averaging, fusing the result into the face of the reference picture.
2. The method for fusing human face images according to claim 1, wherein the step (a) further comprises the following steps:
(a1) firstly, detecting the face of an input picture by using a convolutional neural network;
(a2) estimating the positions of the face key points from a sparse subset of pixel intensities by using a cascaded regression tree algorithm.
3. The method for fusing human face images according to claim 2, wherein the step (a2) further comprises the following steps:
(a21) firstly, randomly selecting two points in a face frame identified by a convolutional neural network;
(a22) calculating pixel values of two points of each picture in the training set to obtain pixel differences of the two points of each picture so as to obtain a splitting threshold of the tree;
(a23) splitting the tree left and right respectively according to a threshold value, and dividing all pictures into a left part and a right part;
(a24) after repeating the steps (a21) - (a23) for a plurality of times, storing the result with the best split;
(a25) repeating the steps (a21) - (a24) for each point of the face frame, so as to obtain 68 key points.
4. The method of fusing human face images according to claim 1, wherein the step (c) further comprises the steps of:
(c1) aligning all samples to the origin;
(c2) for the multiple samples in the training set, selecting one sample as the reference sample and rotating, scaling and translating the other samples to align with it; assuming the selected sample is X1, this yields the set (X1, X2, X3, …);
(c3) calculating the average shape value X0 of the aligned shapes;
(c4) rotating, scaling and translating the average shape value X0 to align with sample X1;
(c5) rotating, scaling and translating the samples (X2, X3, …) other than X1 to align with the adjusted average shape value;
(c6) if the average shape value X0 converges, the shape alignment is complete and the algorithm stops, otherwise loop to step (c 2).
5. The method of fusing human face images according to claim 4, wherein step (c4) is alternatively configured as: normalizing the average shape value X0 so that |X0| = 1.
6. The method of claim 1, wherein the color correction in step (d) is performed by

$$\begin{bmatrix} R' \\ G' \\ B' \end{bmatrix} = \begin{bmatrix} k_R & 0 & 0 \\ 0 & k_G & 0 \\ 0 & 0 & k_B \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$
7. The method of fusing human face images according to claim 1, wherein the step (e) further comprises the steps of:
(e1) firstly, solving a gradient field of a target face to obtain v;
(e2) then solving a gradient field for the background of the reference face to obtain s;
(e3) covering v on s to obtain a gradient field of an image to be reconstructed;
(e4) solving the divergence of the fused image;
(e5) solving the Poisson reconstruction equation Ax = b, where b is the divergence obtained in step (e4) and A is the coefficient matrix obtained from the Poisson equation of the image.
CN201911382548.6A 2019-12-27 2019-12-27 Face image fusion method based on face key points Pending CN113052783A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911382548.6A CN113052783A (en) 2019-12-27 2019-12-27 Face image fusion method based on face key points


Publications (1)

Publication Number Publication Date
CN113052783A true CN113052783A (en) 2021-06-29

Family

ID=76507296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911382548.6A Pending CN113052783A (en) 2019-12-27 2019-12-27 Face image fusion method based on face key points

Country Status (1)

Country Link
CN (1) CN113052783A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101652784A (en) * 2007-03-02 2010-02-17 宝洁公司 Method and apparatus for simulation of facial skin aging and de-aging
US20180204052A1 (en) * 2015-08-28 2018-07-19 Baidu Online Network Technology (Beijing) Co., Ltd. A method and apparatus for human face image processing
CN107767335A (en) * 2017-11-14 2018-03-06 上海易络客网络技术有限公司 A kind of image interfusion method and system based on face recognition features' point location
CN109191410A (en) * 2018-08-06 2019-01-11 腾讯科技(深圳)有限公司 A kind of facial image fusion method, device and storage medium
CN110348496A (en) * 2019-06-27 2019-10-18 广州久邦世纪科技有限公司 A kind of method and system of facial image fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIAO Wenqing, "An Exploratory Study of Measuring Metabolites in Hippocampal Subfields in Alzheimer's Disease — Based on 2D Proton MR Spectroscopic Imaging and Segmentation of High-Resolution MR Images", China Master's Theses Full-text Database, Medicine & Health Sciences (monthly), 15 June 2016 (2016-06-15), pages 16-17 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114821717A (en) * 2022-04-20 2022-07-29 北京百度网讯科技有限公司 Target object fusion method and device, electronic equipment and storage medium
CN114821717B (en) * 2022-04-20 2024-03-12 北京百度网讯科技有限公司 Target object fusion method and device, electronic equipment and storage medium
CN115278080A (en) * 2022-07-28 2022-11-01 北京五八信息技术有限公司 Mask generation method, mask generation equipment and storage medium
WO2024051593A1 (en) * 2022-09-07 2024-03-14 Zhejiang Dahua Technology Co., Ltd. Systems and methods for image processing

Similar Documents

Publication Publication Date Title
CN109376582B (en) Interactive face cartoon method based on generation of confrontation network
US11615559B2 (en) Methods and systems for human imperceptible computerized color transfer
CN109859098B (en) Face image fusion method and device, computer equipment and readable storage medium
CN106022221B (en) Image processing method and system
Shih et al. Style transfer for headshot portraits
WO2021052375A1 (en) Target image generation method, apparatus, server and storage medium
CN108805090B (en) Virtual makeup trial method based on planar grid model
CN103914699A (en) Automatic lip gloss image enhancement method based on color space
CN113052783A (en) Face image fusion method based on face key points
CN112614060A (en) Method and device for rendering human face image hair, electronic equipment and medium
CN110853119B (en) Reference picture-based makeup transfer method with robustness
CN110956681B (en) Portrait background automatic replacement method combining convolution network and neighborhood similarity
CN113723385B (en) Video processing method and device and neural network training method and device
CN112633191A (en) Method, device and equipment for reconstructing three-dimensional face and storage medium
CN113034355B (en) Portrait image double-chin removing method based on deep learning
US20240029345A1 (en) Methods and system for generating 3d virtual objects
Chen et al. Face swapping: realistic image synthesis based on facial landmarks alignment
WO2023005743A1 (en) Image processing method and apparatus, computer device, storage medium, and computer program product
CN110728722A (en) Image color migration method and device, computer equipment and storage medium
CN116583878A (en) Method and system for personalizing 3D head model deformation
Feng et al. Finding intrinsic color themes in images with human visual perception
CN108711160A (en) A kind of Target Segmentation method based on HSI enhancement models
CN109064431B (en) Picture brightness adjusting method, equipment and storage medium thereof
CN110321452A (en) A kind of image search method based on direction selection mechanism
CN114187215A (en) Local recoloring algorithm based on image segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination