CN114240743A - Skin beautifying method based on high-contrast buffing human face image - Google Patents

Skin beautifying method based on high-contrast buffing human face image

Publication number
CN114240743A
CN114240743A
Authority
CN
China
Prior art keywords
image
skin
pixel
max
follows
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111561470.1A
Other languages
Chinese (zh)
Other versions
CN114240743B (en)
Inventor
何鑫
杨梦宁
李小斌
汪涵
李亚涛
向刚
陈开润
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Mihong Technology Co ltd
Original Assignee
Chongqing Mihong Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Mihong Technology Co ltd filed Critical Chongqing Mihong Technology Co ltd
Priority to CN202111561470.1A priority Critical patent/CN114240743B/en
Publication of CN114240743A publication Critical patent/CN114240743A/en
Application granted granted Critical
Publication of CN114240743B publication Critical patent/CN114240743B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/04Context-preserving transformations, e.g. by using an importance map
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/90Dynamic range modification of images or parts thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a skin beautifying method for a human face image based on high-contrast buffing, which comprises the following steps: S1: obtain an original face image S and perform buffing on it with a high-contrast buffing operator to obtain an image B; S2: input the image B into an oil-removal operator to remove oil gloss and obtain an image C; S3: input the image C into a whitening operator to obtain a whitened image D, which is the final skin-beautified image, and output the image D. The method of the invention performs accurate retouching of human skin; this automatic retouching greatly reduces the workload of a retoucher, and the quality of the retouched photos is better.

Description

Skin beautifying method based on high-contrast buffing human face image
Technical Field
The invention relates to a beautifying method, in particular to a skin beautifying method based on a high-contrast dermabrasion face image.
Background
With the rapid iteration of hardware in recent years, the photographing capability of both digital cameras and smartphones keeps improving: imaging resolution keeps growing and photographs become ever sharper, so facial-skin flaws such as spots, acne marks and wrinkles are captured as well. To address this problem, in the field of portrait beautification the facial skin needs to be buffed, so that noise, dark spots and blemishes in portrait photos are effectively removed and the face becomes smooth, textured and soft; dark or yellowish skin needs to be whitened so that it looks fair and rosy.
Skin-beautifying algorithms based on digital image processing include buffing algorithms, whitening algorithms and so on. Arakawa et al. proposed a buffing algorithm based on a nonlinear filter for removing wrinkles and spots from facial skin and compared the effects of a simple filter and an extended filter on skin treatment: the simple filter can remove roughness and small spots, while the extended filter can also remove larger spots. Lee C. et al. partitioned the face and neck regions based on face detection and face alignment and enhanced the skin with a smoothing filter. Liang L. et al. used a region-aware mask to extract the skin region, divided the skin into three layers (a smoothing layer, an illumination layer and a colour layer) with an edge-preserving operator, and completed the skin beautification by adjusting parameters. Velusamy S. et al. proposed an attribute-aware dynamic smoothing filter guided by the number of skin defects and the roughness of the texture, recovering the skin texture through wavelet-band processing, with a good beautifying effect. S. Liu et al. used deep learning to divide makeup into two steps, makeup recommendation and makeup transfer, and the degree of makeup can be adjusted by adjusting the weight of each cosmetic.
Some common beautifying software and tools on the market require manual intervention to adjust the parameters of functions such as buffing and whitening, and some use the same parameters for all input photos. The problems this causes are: first, tools or software requiring manual participation cannot automatically complete batch beautification; second, different photos cannot be retouched in a targeted way according to the shooting scene, lighting and skin condition; third, if the same picture is processed repeatedly, it is buffed and whitened over and over, so the result looks more and more fake and the texture gets worse and worse.
Disclosure of Invention
In view of the problems in the prior art, the technical problem to be solved by the invention is to achieve a good picture-processing effect and good texture in the picture after buffing and whitening.
In order to solve the technical problems, the invention adopts the following technical scheme: a skin beautifying method based on a high-contrast dermabrasion face image comprises the following steps:
s1: obtaining an original human face image S, and performing buffing processing on the original human face image by adopting a high-contrast buffing operator to obtain an image B;
s2: inputting the image B into an oil removing operator to remove oil to obtain an image C;
s3: and inputting the image C into a whitening operator to obtain an image D subjected to whitening treatment to obtain a final skin-beautified image, and outputting the image D.
Preferably, in S1 the process of buffing the original face image S with the high-contrast buffing operator is as follows:
s11: carrying out edge-preserving filtering on the face original image S to obtain an image A;
s12: and performing high contrast calculation on the image A and the original human face image S to obtain an image F, wherein the calculation formula is as follows:
B=S-Gauss(S,r1)+128 (1-1);
s13: radius r for image F2Carrying out Gaussian blur to obtain an image H;
s14: and performing linear light layer mixing calculation on the image H and the image S to obtain an image E, wherein a linear light mixing calculation formula 1-2 of the layers is as follows:
E=2*C+S-256 (1-2);
s15: carrying out skin detection and face detection on the image A to obtain a skin mask image M;
s16: and according to the image F, fusing the image A and the image E to obtain an image G, wherein fused formulas 1-3 are as follows:
G=A*(1-M)+M*E (1-3);
s17: according to the parameter r3And fusing the image G and the image S to obtain a final output image B, wherein a fusion formula 1-4 is as follows:
B=S*(1-r3)+G*r3 (1-4)。
Preferably, the process in S13 of performing Gaussian blur with radius r2 on the image F to obtain the image H is as follows:
s131: dividing the image F into grayscale images of three channels of RGB;
s132: and performing Gaussian filtering on all pixel points in the three gray level images obtained in the step S131 by respectively adopting formulas 1 to 5:
Figure RE-RE-GDA0003467857490000021
wherein, (x, y) represents the coordinates of points in the window, and σ represents a parameter value set artificially;
s133: and iterating the graphs of the three channels obtained by the Gaussian filtering of S132 to obtain an image H.
Preferably, the process in S15 of performing skin detection and face detection on the image A to obtain the skin mask image M of the face is as follows:
s151: establishing a skin segmentation model, wherein the model comprises a backbone network for extracting image characteristics, a generator for generating a mask image and a discriminator for judging the accuracy of the generated mask image;
adopting U-NET + + as a skin segmentation model and ResNet-101 as a backbone network;
acquiring a plurality of face original images, marking the face original images, marking a human body area in each face original image to obtain a first mask image as a first training sample, and marking a skin part in the first training sample to obtain a second mask image as a second training sample;
s152: initializing skin segmentation model parameters;
s153: inputting each first training sample into a backbone network to obtain a plurality of corresponding characteristic graphs with different scales;
s154: inputting a plurality of feature maps with different scales corresponding to each first training sample into a generator to obtain a first prediction mask map of each first training sample;
s155: inputting the first prediction mask image and the first mask image of each first training sample into a discriminator to obtain prediction accuracy, obtaining a suboptimal skin segmentation model and executing the next step when the accuracy reaches a set accuracy threshold, and otherwise, updating parameters of the skin segmentation model and returning to S153;
s156: inputting each second training sample into a backbone network in a suboptimal skin segmentation model to obtain a plurality of corresponding feature maps with different scales;
s157: inputting a plurality of feature maps with different scales corresponding to each second training sample into a generator in a suboptimal skin segmentation model to obtain a second prediction mask map of each second training sample;
s158: inputting the second prediction mask image and the second mask image of each second training sample into a discriminator in the suboptimal skin segmentation model to obtain prediction accuracy, obtaining the optimal skin segmentation model and executing the next step when the accuracy reaches a set accuracy threshold, and otherwise, updating parameters of the skin segmentation model and returning to S156;
s159: and inputting the image A into the optimal skin segmentation model to obtain a skin mask image M.
Preferably, the process by which the oil-removal operator in S2 removes oil gloss from the image B is as follows:
s21: classifying the skin of the image B, and determining the oil light level of the image B;
the oil gloss grades are classified into a fourth-level oil gloss, a third-level oil gloss, a second-level oil gloss and a first-level oil gloss, and each oil gloss grade is assigned with a value in sequence;
S22: calculate the maximum chroma σmax at each pixel point of the image B and store it as a grayscale image I;
S23: calculate the maximum value λmax of the approximate diffuse-reflection chromaticity at each pixel point of the image B and store it as a grayscale image II;
S24: using the grayscale image II as the guide image, apply a joint bilateral filter to the grayscale image I and store the filtered image as the preprocessed image;
S25: for each pixel p in the preprocessed image, compare the filtered value σFmax(p) with σmax(p) and take the maximum, as shown in formula 2-17:
σmax(p)=max(σmax(p),σFmax(p)) (2-17);
S26: repeat steps S24 and S25 until σmax(p) no longer changes for each pixel, then execute the next step;
S27: for each pixel point p in the preprocessed image, determine which RGB channel σmax(p) was selected from; the pixel of the selected channel of p is set to σmax(p) × 255, and the pixels of the two unselected channels of p are then combined with the pixel of the selected channel to obtain the preprocessed image;
S28: perform skin-type classification on the preprocessed image to determine its oil-gloss level; when the oil-gloss level of the preprocessed image is lower than that of the image B in S21 and does not exceed a preset oil-gloss level threshold, output the preprocessed image as the image C; otherwise, return to step S22 and update the image B with the preprocessed image.
Preferably, the process of calculating the maximum chroma σmax at each pixel point of the image B in S22 is as follows:
the reflected-light colour J in RGB colour space is represented as a linear combination of the diffuse reflection value JD and the specular reflection value JS, formula 2-5:
J=JD+JS (2-5);
defining the chroma as the colour component σc, formula 2-6:
σc=Jc/(Jr+Jg+Jb) (2-6);
where c ∈ {r, g, b} and Jc represents the reflected-light colour of channel c;
the diffuse reflection chromaticity Λc and the illumination chromaticity Γc are defined by formulas 2-7 and 2-8 as follows:
Λc=JDc/(JDr+JDg+JDb) (2-7);
Γc=JSc/(JSr+JSg+JSb) (2-8);
where JDc represents the diffuse reflection component and JSc represents the specular reflection component;
according to the above formulas, the reflected-light colour Jc is defined by formula 2-9:
Jc=Λc·Σu JDu+Γc·Σu JSu (2-9);
where u represents a layer and can be the r, g or b layer, JDu represents the diffuse reflection component of layer u, and JSu represents the specular reflection component of layer u;
using white estimation of the illumination chromaticity, the input image B is normalized so that Γr = Γg = Γb = 1/3 and JSr = JSg = JSb, where Γr, Γg and Γb respectively represent the illumination chromaticities of the r, g and b layers, and JSr, JSg and JSb respectively represent the specular reflection values of the r, g and b layers;
then, according to the previous formulas, the diffuse reflection component is as shown in formula 2-10:
JDc=Λc·Σu JDu (2-10);
where JDc represents the diffuse reflection value of the c-th layer;
the maximum chroma is defined by equations 2-11:
σmax=max(σr,σg,σb) (2-11);
where σr, σg and σb respectively represent the colour components of the r, g and b layers;
the maximum diffuse reflectance chromaticity is defined as equation 2-12:
Λmax=max(Λr,Λg,Λb) (2-12);
where Λr, Λg and Λb respectively represent the diffuse reflection chromaticities of the r, g and b layers;
the diffuse reflection component can be expressed in terms of Λmax as formula 2-13:
Σu JDu=(Σc Jc)·(σmax-1/3)/(Λmax-1/3) (2-13);
Λmax lies in the range [1/3, 1].
Preferably, the process of calculating the maximum value λmax of the approximate diffuse-reflection chromaticity at each pixel point of the image B in S23 is as follows:
let σmin = min(σr, σg, σb) and use λc to estimate Λc; the calculation formula 2-14 is as follows:
λc=(σc-σmin)/(1-3σmin) (2-14);
λc is an intermediate variable with no actual meaning;
the relationship between the approximate diffuse reflection chromaticity λc and the true diffuse reflection chromaticity Λc is described by 1) and 2):
1) for any two pixels p and q, if Λc(p) = Λc(q), then λc(p) = λc(q);
2) for any two pixels p and q, if λc(p) = λc(q), then Λc(p) = Λc(q) only if Λmin(p) = Λmin(q);
The maximum value of the approximate diffuse reflection chromaticity is given by formula 2-15:
λmax=max(λr,λg,λb) (2-15);
where λr, λg and λb respectively represent the computed variables of the r, g and b layers and have no actual meaning;
using the approximate maximum diffuse-reflection chromaticity as the smoothing parameter, the filtered maximum chroma σFmax is calculated by formula 2-16 as follows:
σFmax(p)=Σq Gs(p,q)·Gr(λmax(p)-λmax(q))·σmax(q) / Σq Gs(p,q)·Gr(λmax(p)-λmax(q)) (2-16);
where σFmax(p) denotes the computed (filtered) value at pixel point p and has no actual physical meaning, and Gs and Gr are typically Gaussian-distributed spatial and distance (range) weighting functions.
Preferably, in S24 the process of applying the joint bilateral filter to the grayscale image I, with the grayscale image II as the guide image, is as follows. The ordinary bilateral filter is:
ID(i,j)=Σ(k,l) I(k,l)·w(i,j,k,l) / Σ(k,l) w(i,j,k,l);
w(i,j,k,l)=exp(-((i-k)²+(j-l)²)/(2σd²)-(I(i,j)-I(k,l))²/(2σr²));
where ID(i, j) represents the pixel value, after joint bilateral filtering, of the pixel point with coordinates (i, j); (k, l) represents the pixel coordinates of the other points in the filtering window; I(i, j) represents the pixel value of the centre point and I(k, l) the pixel values of the remaining points; and w(i, j, k, l) is the product of a Gaussian spatial-distribution function and a Gaussian function of pixel-intensity similarity;
the joint bilateral filter is defined as follows:
σFmax(p)=Σq Gs(p,q)·Gr(λmax(p)-λmax(q))·σmax(q) / Σq Gs(p,q)·Gr(λmax(p)-λmax(q));
here the spatial term Gs(p, q) is related only to the coordinates of the pixel points p(i, j) and q(k, l); the range term uses the guide values λmax(p) and λmax(q) in place of the intensity similarity of the ordinary bilateral filter; and σmax(q), which is substituted into the formula, corresponds to the I(k, l) part of the bilateral filter and represents the pixel value at point q.
Preferably, the whitening process of the image C by the whitening operator in S3 is as follows:
S31: perform skin-type classification on the image C; the skin-colour grades are divided into level four, level three, level two and level one, and the grades are assigned the values β = 3, 2, 1 and 0 in sequence;
S32: normalize the pixel values of the R, G, B channels of the image C layer by layer, in the following way (formula 3-1):
w(x,y)=f(x,y)/255 (3-1);
where f(x, y) represents the pixel value of each pixel point of the input image C and w(x, y) ∈ [0,1] represents the output image C′, which has R, G, B three layers in total;
S33: enhance the input image C′ using formula 3-2:
v(x,y)=log(β*w(x,y)+1)/log(β+1) (3-2);
where w (x, y) is the input image C', v (x, y) is the output image D after whitening, and β is a parameter for controlling the degree of whitening.
Compared with the prior art, the invention has at least the following advantages:
1. Most automatic photo-retouching schemes currently on the market operate on the whole photo, whereas the method of the invention retouches human skin precisely; the automatic retouching mode can greatly reduce the workload of a retoucher, and the quality of the retouched photos is better.
2. The high-contrast buffing algorithm uses filters to soften dry, rough skin better. Compared with other buffing methods, high-contrast buffing has a good edge-preserving buffing effect; although the process is more complex, its speed improves as GPU and CPU computing power increases, so it works well in practice: it is suitable for the delicate adjustment of face images and does not produce distortion or fake-looking effects.
Drawings
FIG. 1 is a block flow diagram of the method of the present invention.
Fig. 2 is a schematic structural diagram of a skin segmentation model.
Fig. 3 is a schematic diagram of a polygonal outer frame of a face and a region of interest.
Fig. 4 is a schematic diagram of skin tone grading.
FIG. 5 is a schematic representation of oil light fractionation.
Fig. 6 is a schematic view of wrinkle classification.
Fig. 7 is a schematic illustration of pore grading.
FIG. 8 is a block flow diagram of the method of the present invention.
Fig. 9a is a face original image.
Fig. 9b is the image after oil-removal operator processing.
Fig. 10 is a comparison of whitening parameter curves.
Fig. 11a is a face original image.
Fig. 11b is the image after whitening operator processing.
Fig. 12a is a face original image.
Fig. 12b is the image after whitening operator processing.
Fig. 13 is a comparison of skin grading before and after skin beautification.
Detailed Description
The present invention is described in further detail below.
A skin beautifying method based on a high-contrast dermabrasion face image comprises the following steps:
s1: obtaining an original human face image S, and performing buffing processing on the original human face image by adopting a high-contrast buffing operator to obtain an image B;
s2: inputting the image B into an oil removing operator to remove oil to obtain an image C;
s3: and inputting the image C into a whitening operator to obtain an image D subjected to whitening treatment to obtain a final skin-beautified image, and outputting the image D. Specifically, the process of performing the peeling treatment on the original human face image S by the high-contrast peeling operator in S1 is as follows:
s11: carrying out edge-preserving filtering on the face original image S to obtain an image A;
S12: perform a high-contrast calculation on the image A and the original face image S to obtain an image F, where the calculation formula 1-1 is as follows:
F=S-Gauss(S,r1)+128 (1-1);
S13: perform Gaussian blur with radius r2 on the image F to obtain an image H; the default value of the radius parameter r2 is 3.
S14: perform a linear-light layer blending calculation on the image H and the image S to obtain an image E, where the linear-light blending formula 1-2 of the layers is as follows:
E=2*H+S-256 (1-2);
s15: carrying out skin detection and face detection on the image A to obtain a skin mask image M;
S16: according to the skin mask image M, fuse the image A and the image E to obtain an image G, where the fusion formula 1-3 is as follows:
G=A*(1-M)+M*E (1-3);
S17: according to the parameter r3, fuse the image G and the image S to obtain the final output image B, where the fusion formula 1-4 is as follows:
B=S*(1-r3)+G*r3 (1-4);
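For clarity, the buffing pipeline of formulas 1-1 to 1-4 can be sketched in a few lines of Python. This is only an illustrative sketch that assumes OpenCV and NumPy, uses a bilateral filter as a stand-in for the edge-preserving surface filter of S11, and takes the skin mask M from the segmentation model as an input; the function name and parameter defaults are assumptions, not the exact implementation of the invention.

```python
import cv2
import numpy as np

def high_contrast_buffing(S, M, r1=5, r2=3, r3=0.5):
    """Sketch of S11-S17: S is the original face image (uint8), M is the skin mask in [0, 1]."""
    S = S.astype(np.float32)
    # S11: edge-preserving filtering (bilateral filter as a stand-in for surface blur)
    A = cv2.bilateralFilter(S, 2 * r1 + 1, 40, r1)
    # S12: high-contrast layer, formula 1-1
    F = S - cv2.GaussianBlur(S, (2 * r1 + 1, 2 * r1 + 1), 0) + 128
    # S13: Gaussian blur of F with radius r2
    H = cv2.GaussianBlur(F, (2 * r2 + 1, 2 * r2 + 1), 0)
    # S14: linear-light blending, formula 1-2
    E = 2 * H + S - 256
    # S16: fuse A and E with the skin mask M, formula 1-3
    M3 = M[..., None] if M.ndim == 2 else M
    G = A * (1 - M3) + M3 * E
    # S17: blend with the original by strength r3, formula 1-4
    B = S * (1 - r3) + G * r3
    return np.clip(B, 0, 255).astype(np.uint8)
```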
specifically, when the edge-preserving filtering is performed on the original face image S in S11 to obtain the image a, any existing filtering method may be used to perform, for example, surface filtering, and the radius parameter is r 1;
Natural images always contain noise; for example, illumination and environment produce a small number of sharp pixel points on a face. Noise and edges (an edge is, for example, the boundary between skin and eyes, where the pixel difference is large) are similar in some respects, and an ordinary filter cannot distinguish them, so it treats them uniformly; in many cases the edges are therefore blurred along with the noise during filtering. An edge-preserving filter is a type of filter that can effectively preserve edge information in an image while filtering.
The invention adopts a bilateral filter. Bilateral filtering is a nonlinear filtering method that is a compromise between the spatial proximity and the pixel-value similarity of an image; it considers spatial-domain information and grey-level similarity at the same time to achieve edge-preserving denoising. The bilateral filter can smooth and denoise while preserving edges well. The calculation formula of bilateral filtering is as follows:
I′(i,j)=Σ(k,l) f′(k,l) w′(i,j,k,l) / Σ(k,l) w′(i,j,k,l);
where I′(i, j) represents the pixel value of the point I′ after bilateral filtering and (i, j) are its coordinates, typically (0, 0), because the point is taken as the centre of a square of radius r1 (the radius r1 is the number of pixels expanded outward from the pixel origin, so the side length of the square is 2r1+1); f′(k, l) is the pixel value of the other points in the square apart from point I; and w′(i, j, k, l) = wr*ws, where wr is a Gaussian function of pixel-value similarity and ws is a Gaussian function of spatial proximity. Through bilateral filtering, noise in the original picture S can be removed effectively while edge pixels are preserved, which facilitates the later high-contrast calculation and skin detection.
Specifically, the process of performing edge preserving filtering on the face original image S to obtain the image a is as follows:
Because the skin is blurred, details of the face such as the facial features, edges and contours are weakened, and after processing with an ordinary filter the image looks fake and loses texture. Bilateral filtering preserves edges, so in the buffing algorithm it can weaken skin flaws and spots while preserving the skin texture, giving the output image a fine and smooth appearance. The filter is defined by formula 1-6:
Ifiltered(x)=(1/Wp) Σxi∈Ω I(xi) fr(‖I(xi)-I(x)‖) gs(‖xi-x‖) (1-6);
where Ifiltered(x) represents the pixel value of a point after bilateral filtering; xi is a pixel point in the picture and I(xi) its pixel value (also called pixel intensity, because the bilateral filtering operation is applied to each of the R, G, B layers of the colour picture, each layer being a grayscale image with values 0-255); Ω represents the set of points in a square of radius r centred on the given point; fr is the range kernel; and I(x) represents the pixel value of the centre point.
Examples are as follows: from the colour picture formed by R, G, B, one layer (a grayscale image) is taken at random and operated on. Assume here that the radius of the bilateral filter is 1, i.e. a 3 × 3 square is formed, as shown in Table 1, where (#) denotes the coordinates, corresponding to x and y, and the value below represents the pixel value (intensity) of the image.
The bilateral-filtered pixel value of the point with coordinates (i, j) is calculated according to formula 1-6 (whichever point is being filtered, a square of radius 1 is centred on that point and its coordinates are set to (0, 0)); that is, we now want to calculate the pixel with coordinates (0, 0).
w1(i, j, k, l) is calculated according to formula 1-7, where i and j correspond to the (0, 0) point and k and l to the coordinate values of the other positions in the square, σd is a manually set value, I1(i, j), i.e. I1(0, 0), is the pixel value (intensity) of the centre point, I1(k, l) likewise denotes the pixel values (intensities) of the other points in the square, and σr is also a manually set value:
w1(i,j,k,l)=exp(-((i-k)²+(j-l)²)/(2σd²)-(I1(i,j)-I1(k,l))²/(2σr²)) (1-7)
According to formula 1-8, the pixel value of each point is multiplied by that point's weight w1(i, j, k, l) and the results are combined (9 points in total: substituting the centre point into the formula gives its original value, which is added to the bilateral-filtered contributions of the other 8 points) to obtain the bilateral-filtered pixel value (intensity) of the point:
I1′(i,j)=Σ(k,l) I1(k,l) w1(i,j,k,l) / Σ(k,l) w1(i,j,k,l) (1-8)
The normalization function Wp is defined by formula 1-9:
Wp=Σxi∈Ω fr(‖I(xi)-I(x)‖) gs(‖xi-x‖) (1-9);
The weight Wp is assigned using spatial proximity (the spatial kernel gs) and intensity difference (the range kernel fr):
gs=exp(-‖xi-x‖²/(2σd²)) (1-10)
fr=exp(-‖I(xi)-I(x)‖²/(2σr²)) (1-11)
Wp=gs*fr (1-12)
Set the coordinates of the pixel being denoised in the image to (i, j) and let (k, l) be the coordinates of one of the surrounding pixels. Assuming that the range kernel and the spatial kernel are Gaussian kernel functions, the weight assigned to pixel (k, l) for denoising pixel (i, j) is given by formula 1-13:
w(i,j,k,l)=exp(-((i-k)²+(j-l)²)/(2σd²)-(I2(i,j)-I2(k,l))²/(2σr²)) (1-13)
where σd and σr are smoothing parameters and I2(i, j) and I2(k, l) are the intensities of pixel (i, j) and pixel (k, l) respectively. After the weights are calculated, they are normalized, formula 1-14:
I3(i,j)=Σ(k,l) I2(k,l) w(i,j,k,l) / Σ(k,l) w(i,j,k,l) (1-14)
where I3(i, j) is the denoised intensity of pixel (i, j).
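To make formulas 1-13 and 1-14 concrete, the following is a minimal, unoptimized Python sketch of bilateral filtering for a single grayscale layer; the window radius and the values of σd and σr are arbitrary assumptions for demonstration.

```python
import numpy as np

def bilateral_filter_gray(I, radius=1, sigma_d=3.0, sigma_r=30.0):
    """Brute-force bilateral filter of a grayscale image I (2-D float array),
    following formulas 1-13 (weights) and 1-14 (normalized weighted sum)."""
    H, W = I.shape
    out = np.zeros_like(I, dtype=np.float64)
    for i in range(H):
        for j in range(W):
            acc, norm = 0.0, 0.0
            for k in range(max(0, i - radius), min(H, i + radius + 1)):
                for l in range(max(0, j - radius), min(W, j + radius + 1)):
                    # formula 1-13: spatial closeness times intensity similarity
                    w = np.exp(-((i - k) ** 2 + (j - l) ** 2) / (2 * sigma_d ** 2)
                               - (float(I[i, j]) - float(I[k, l])) ** 2 / (2 * sigma_r ** 2))
                    acc += I[k, l] * w
                    norm += w
            out[i, j] = acc / norm  # formula 1-14
    return out
```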
In the course of experiments it was found that although the buffing algorithm based on the bilateral filter works well, when the skin has serious wrinkles, acne marks and other flaws the filter parameters have to be increased, and when the intensity parameters are too large the speed of the algorithm is greatly affected. Therefore the implementation of a buffing algorithm based on an improved Lee filter is presented below.
Specifically, the process in S13 of performing Gaussian blur with radius r2 on the image F to obtain the image H is as follows:
Gaussian blur is Gaussian filtering, i.e. convolving the (grayscale) image with a Gaussian kernel. In theory the Gaussian distribution is non-zero over its whole domain, which would require an infinitely large convolution kernel. In the actual calculation the convolution kernel has a fixed size: the centre point to be calculated is taken as the origin, the surrounding points are assigned weights according to the normal-distribution function, and the weighted average is computed to obtain the final value.
S131: dividing the image F into grayscale images of three channels of RGB;
S132: perform Gaussian filtering on all pixel points in the three grayscale images obtained in step S131, using formula 1-5 for each:
G(x,y)=(1/(2πσ²))·exp(-(x²+y²)/(2σ²)) (1-5);
where (x, y) represents the coordinates of a point within the window and σ represents a manually set parameter value (a hyper-parameter).
S133: merge the three channel images obtained by the Gaussian filtering in S132 to obtain the image H.
In a specific embodiment the radius r2 = 3, so a square with side length 2 × 3 + 1 = 7 should be constructed; for ease of illustration, however, a square with r2 = 1, i.e. a 3 × 3 grid, is used below to illustrate the Gaussian-blur process. The position information is shown in the grid, and the coordinates of the centre point are (0, 0).
TABLE 1
(-1,1) (0,1) (1,1)
(-1,0) (0,0) (1,0)
(-1,-1) (0,-1) (1,-1)
The coordinate information of each point in the table is substituted for x and y in the above formula, σ is manually set to 1.5, and the calculated values are shown in Table 2:
TABLE 2
0.0453 0.0566 0.0453
0.0566 0.0707 0.0566
0.0453 0.0566 0.0453
Then normalization is performed: the matrix sums to approximately 0.4783, and each of the 9 values of the matrix is divided by this sum to obtain the final convolution kernel (weight matrix).
TABLE 3
0.0947 0.1183 0.0947
0.1183 0.1478 0.1183
0.0947 0.1183 0.0947
The gaussian kernel is calculated through the above steps, and a gaussian filtering operation can be performed based on the gaussian kernel. Assuming that there are 9 pixels, the gray-scale values (0-255) are shown in table 4:
TABLE 4
14 15 16
24 25 26
34 35 36
Multiply the value of each pixel point by the corresponding weight in the Gaussian kernel to obtain the final weighted values. Assume that the pixel values of the picture are as described in Table 5:
TABLE 5
14 15 16
24 25 26
34 35 36
The pixel values after multiplication by the Gaussian kernel are shown in Table 6.
TABLE 6
1.3258 1.7745 1.5152
2.8392 3.6950 3.0758
3.2198 4.1405 3.4092
This process is repeated for every pixel point in the picture: for each pixel point the 9 weighted values are obtained and added together, giving the Gaussian-filtered value of that point and thus the image after Gaussian filtering.
In the actual embodiment the Gaussian radius is 3, so the Gaussian kernel is a 7 × 7 square, and the Gaussian-filtered image is obtained in the same way.
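The kernel values of Tables 2 and 3 and the weighted values of Table 6 can be reproduced with a short script; the sketch below only checks the worked example (r2 = 1, σ = 1.5) with NumPy and is not part of the claimed method.

```python
import numpy as np

def gaussian_kernel(radius=1, sigma=1.5):
    """Evaluate formula 1-5 on the window and normalize it."""
    coords = np.arange(-radius, radius + 1)
    x, y = np.meshgrid(coords, coords)
    g = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    return g, g / g.sum()

raw, kernel = gaussian_kernel()
print(np.round(raw, 4))     # Table 2 pattern: 0.0453 / 0.0566 / 0.0707
print(np.round(kernel, 4))  # Table 3 pattern: 0.0947 / 0.1183 / 0.1478
window = np.array([[14, 15, 16], [24, 25, 26], [34, 35, 36]], dtype=float)
print(np.round(window * kernel, 4))              # the weighted values of Table 6
print(round(float((window * kernel).sum()), 2))  # about 25.0 for the centre pixel
```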
Specifically, the process of performing skin detection and face detection on the image A in S15 to obtain the skin mask image M of the face is as follows:
s151: establishing a skin segmentation model, wherein the model comprises a backbone network for extracting image characteristics, a generator for generating a mask image and a discriminator for judging the accuracy of the generated mask image;
adopting U-NET + + as a skin segmentation model and ResNet-101 as a backbone network;
the method comprises the steps of obtaining a plurality of original human face images, marking the original human face images, marking a human body area in each original human face image to obtain a first mask image as a first training sample, and marking a skin part in the first training sample to obtain a second mask image as a second training sample.
First mask images: masks of the human-body region (including hair, clothes, skin and the like) are produced from 1000 portrait photographs; the colour of the human-body region is set to white, R = G = B = 255, and the colour of the non-human-body region is set to R = G = B = 0. The second mask images are produced from the same 1000 portrait photographs; the colour of the skin region is set to R = G = B = 255 and the colour of the non-skin region to R = G = B = 0.
S152: skin segmentation model parameters are initialized.
S153: inputting each first training sample into a backbone network to obtain a plurality of corresponding characteristic graphs with different scales; (each feature map includes a plurality of features, such as content, color, gradient, and texture in training sample No. 1).
S154: and inputting the feature maps of different scales corresponding to each first training sample into a generator to obtain a first prediction mask map of each first training sample.
S155: and inputting the first prediction mask image and the first mask image of each first training sample into a discriminator to obtain the prediction accuracy, obtaining a suboptimal skin segmentation model and executing the next step when the accuracy reaches a set accuracy threshold, and otherwise, updating parameters of the skin segmentation model and returning to S153.
S156: inputting each second training sample into a backbone network in a suboptimal skin segmentation model to obtain a plurality of corresponding feature maps with different scales;
s157: inputting a plurality of feature maps with different scales corresponding to each second training sample into a generator in a suboptimal skin segmentation model to obtain a second prediction mask map of each second training sample;
s158: inputting the second prediction mask image and the second mask image of each second training sample into a discriminator in the suboptimal skin segmentation model to obtain prediction accuracy, obtaining the optimal skin segmentation model and executing the next step when the accuracy reaches a set accuracy threshold, and otherwise, updating parameters of the skin segmentation model and returning to S156;
the segmentation algorithm is expressed in units of seconds, and the shorter the time, the faster the segmentation speed. The evaluation index of the segmentation accuracy uses an Intersection ratio (IoU), and the calculation formula is 3.1
Figure RE-RE-GDA0003467857490000131
Where | represents the number of pixels of the set, a represents the set of predicted mask maps, and B represents the set of actual mask maps. IoU, the value range of the index is [0,1], when the value is closer to 1, the higher the coincidence rate of the two sets is, the higher the accuracy rate is.
S159: and inputting the image A into the optimal skin segmentation model to obtain a skin mask image M.
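A minimal sketch of the IoU of formula 3.1 above for binary mask images, assuming NumPy arrays that use 255 for the region and 0 for the background; the function name is illustrative.

```python
import numpy as np

def iou(pred_mask: np.ndarray, true_mask: np.ndarray) -> float:
    """Intersection over Union of two 255/0 binary masks (formula 3.1)."""
    a = pred_mask > 0
    b = true_mask > 0
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0                      # both masks empty
    return float(np.logical_and(a, b).sum()) / float(union)
```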
In the field of portrait-photo post-processing, skin images have fuzzy boundaries, complex gradients and strong background interference because people are affected by illumination, environment, camera exposure, angle, photographer skill and other factors, so traditional image-segmentation algorithms have low accuracy and stability. Comparative analysis shows that the U-Net++ deep-learning model has a strong capability for image segmentation and is suitable for the human-skin segmentation task.
The invention segments the portrait skin with a skin segmentation model, namely the U-Net++ model, which is an efficient, accurate and practical algorithm. It can effectively avoid interference from backgrounds of similar colour and from similar structural features, completely extract the human-skin part of the image, and generate the corresponding mask image and skin layer as input for the subsequent skin-beautifying model.
The backbone network of U-Net++ uses ResNet-101. The structure of the model is shown in FIG. 2: a portrait photo, the image A in the structure, is input, learned and trained by the U-Net++ model, and the skin mask image M is output.
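For orientation only, the backbone-plus-generator part of the segmentation network can be instantiated roughly as follows, assuming the third-party segmentation_models_pytorch package is available; this sketch omits the discriminator and the two-stage training (first on the body masks, then on the skin masks) described above.

```python
import torch
import segmentation_models_pytorch as smp

# U-Net++ decoder ("generator") on a ResNet-101 encoder ("backbone"),
# producing a single-channel skin mask.
model = smp.UnetPlusPlus(
    encoder_name="resnet101",
    encoder_weights="imagenet",
    in_channels=3,
    classes=1,
)

x = torch.randn(1, 3, 512, 512)           # a portrait photo (image A), normalized
mask_logits = model(x)                    # (1, 1, 512, 512) skin-mask logits
mask = torch.sigmoid(mask_logits) > 0.5   # binary skin mask M
```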
Buffing needs to fade moles, acne marks, flaws, spots and the like while retaining facial details, the contours of the facial features and the contrast between shadow and light, so that the trace of the beautification is weakened. Therefore, as in commercial retouching with Photoshop, an improved surface-filtering method with good edge preservation and fine texture is adopted.
Specifically, the process by which the oil-removal operator in S2 removes oil gloss from the image B is as follows:
s21: classifying the skin of the image B, and determining the oil light level of the image B;
the oil gloss grades are classified into a fourth-level oil gloss, a third-level oil gloss, a second-level oil gloss and a first-level oil gloss, and each oil gloss grade is assigned with a value in sequence;
S22: calculate the maximum chroma σmax at each pixel point of the image B and store it as a grayscale image I;
S23: calculate the maximum value λmax of the approximate diffuse-reflection chromaticity at each pixel point of the image B and store it as a grayscale image II;
For example: in general a colour image has three layers R, G, B, and each layer has a pixel value (0-255); suppose the pixel value of a certain point is (1, 2, 5). Then
σmax = 5/(1+2+5) = 0.625,
and this value is stored as a pixel of the grayscale map (the grayscale map has only one layer, so it can be regarded as the value of one pixel); processing the whole image B in this way gives grayscale image I. Similarly, according to the formula for λmax, with σmin = 1/(1+2+5) = 0.125, the above point gives
λmax = (0.625-0.125)/(1-3×0.125) = 0.8,
and this value is stored as the corresponding pixel point of grayscale image II.
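The per-pixel quantities of S22 and S23 can be computed for a whole image at once; the sketch below assumes a float RGB image with values in 0-255 and reproduces the worked example for the pixel (1, 2, 5). It is an illustration, not the exact implementation.

```python
import numpy as np

def max_chroma_maps(B):
    """Grayscale image I (sigma_max, formula 2-11) and grayscale image II
    (lambda_max, formula 2-15) for an H x W x 3 RGB image B."""
    Bf = B.astype(np.float64) + 1e-10           # avoid division by zero for black pixels
    total = Bf.sum(axis=2)
    sigma = Bf / total[..., None]               # per-channel chroma, formula 2-6
    sigma_max = sigma.max(axis=2)               # grayscale image I
    sigma_min = sigma.min(axis=2)
    # formula 2-14; note the denominator is 0 for purely gray pixels (sigma_min = 1/3)
    lam = (sigma - sigma_min[..., None]) / (1.0 - 3.0 * sigma_min[..., None])
    lambda_max = lam.max(axis=2)                # grayscale image II
    return sigma_max, lambda_max

pixel = np.array([[[1.0, 2.0, 5.0]]])           # the worked example above
s, l = max_chroma_maps(pixel)
print(round(float(s[0, 0]), 3), round(float(l[0, 0]), 3))   # 0.625 0.8
```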
S24: using the gray level image II as a guide image, applying a joint bilateral filter to the image gray level image I, and storing the filtered image as a preprocessed image;
s25: calculating σ for each pixel p in the preprocessed imagemax(p) comparison
Figure RE-RE-GDA0003467857490000145
And σmaxTaking the maximum value as shown in equations 2-17:
Figure RE-RE-GDA0003467857490000146
S26: repeat steps S24 and S25 until σmax(p) no longer changes for each pixel, then execute the next step;
S27: for each pixel point p in the preprocessed image, determine which RGB channel σmax(p) was selected from; the pixel of the selected channel of p is set to σmax(p) × 255, and the pixels of the two unselected channels of p are then combined with the pixel of the selected channel to obtain the preprocessed image. The preprocessed image substituted into the calculation is a grayscale image; here the iteratively updated σmax(p) × 255 and the pixels of the other two channels form a three-layer RGB colour image. For example, if σmax(p) of pixel point p was selected from the R channel, the pixel of p in the R channel is σmax(p) × 255, and the pixels of p in the G and B channels are combined with the R-channel pixel σmax(p) × 255; repeating this operation for all pixel points p gives the three-layer RGB colour image, i.e. the preprocessed image.
S28: and performing skin type classification on the preprocessed image to determine the oil light level, outputting the preprocessed image as an image C when the oil light level of the preprocessed image is lower than that of the image B in S21 and not more than a preset oil light level threshold, otherwise, returning to the step S22, and updating the image B by using the preprocessed image.
Specifically, the process of calculating the maximum chroma σmax at each pixel point of the image B in S22 is as follows:
the reflected-light colour J in RGB colour space is represented as a linear combination of the diffuse reflection value JD and the specular reflection value JS, formula 2-5:
J=JD+JS (2-5);
defining the chroma as the colour component σc, formula 2-6:
σc=Jc/(Jr+Jg+Jb) (2-6);
where c ∈ {r, g, b} and Jc represents the reflected-light colour of channel c;
the diffuse reflection chromaticity Λc and the illumination chromaticity Γc are defined by formulas 2-7 and 2-8 as follows:
Λc=JDc/(JDr+JDg+JDb) (2-7);
Γc=JSc/(JSr+JSg+JSb) (2-8);
where JDc represents the diffuse reflection component and JSc represents the specular reflection component;
according to the above formulas, the reflected-light colour Jc is defined by formula 2-9:
Jc=Λc·Σu JDu+Γc·Σu JSu (2-9);
where u represents a layer and can be the r, g or b layer, JDu represents the diffuse reflection component of layer u, and JSu represents the specular reflection component of layer u;
using white estimation of the illumination chromaticity, the input image B is normalized so that Γr = Γg = Γb = 1/3 and JSr = JSg = JSb, where Γr, Γg and Γb respectively represent the illumination chromaticities of the r, g and b layers, and JSr, JSg and JSb respectively represent the specular reflection values of the r, g and b layers;
then, according to the previous formulas, the diffuse reflection component is as shown in formula 2-10:
JDc=Λc·Σu JDu (2-10);
where JDc represents the diffuse reflection value of the c-th layer;
the maximum chroma is defined by equations 2-11:
σmax=max(σr,σg,σb) (2-11);
where σr, σg and σb respectively represent the colour components of the r, g and b layers;
the maximum diffuse reflectance chromaticity is defined as equation 2-12:
Λmax=max(Λr,Λg,Λb) (2-12);
where Λr, Λg and Λb respectively represent the diffuse reflection chromaticities of the r, g and b layers;
the diffuse reflection component can be expressed in terms of Λmax as formula 2-13:
Σu JDu=(Σc Jc)·(σmax-1/3)/(Λmax-1/3) (2-13);
Λmax lies in the range [1/3, 1].
Specifically, the process of calculating the maximum value λmax of the approximate diffuse-reflection chromaticity at each pixel point of the image B in S23 is as follows:
let σmin = min(σr, σg, σb) and use λc to estimate Λc; the calculation formula 2-14 is as follows:
λc=(σc-σmin)/(1-3σmin) (2-14);
λc is an intermediate variable with no actual meaning;
the relationship between the approximate diffuse reflection chromaticity λc and the true diffuse reflection chromaticity Λc is described by 1) and 2):
1) for any two pixels p and q, if Λc(p) = Λc(q), then λc(p) = λc(q);
2) for any two pixels p and q, if λc(p) = λc(q), then Λc(p) = Λc(q) only if Λmin(p) = Λmin(q);
The maximum value of the approximate diffuse reflection chromaticity is given by formula 2-15:
λmax=max(λr,λg,λb) (2-15);
where λr, λg and λb respectively represent the computed variables of the r, g and b layers and have no actual meaning;
using the approximate maximum diffuse-reflection chromaticity as the smoothing parameter, the filtered maximum chroma σFmax is calculated by formula 2-16 as follows:
σFmax(p)=Σq Gs(p,q)·Gr(λmax(p)-λmax(q))·σmax(q) / Σq Gs(p,q)·Gr(λmax(p)-λmax(q)) (2-16);
where σFmax(p) denotes the computed (filtered) value at pixel point p and has no actual physical meaning, and Gs and Gr are typically Gaussian-distributed spatial and distance (range) weighting functions.
Specifically, in S24 the grayscale image II is used as the guide image and a joint bilateral filter is applied to the grayscale image I. The filtering process of the ordinary bilateral filter is as follows:
ID(i,j)=Σ(k,l) I(k,l)·w(i,j,k,l) / Σ(k,l) w(i,j,k,l);
w(i,j,k,l)=exp(-((i-k)²+(j-l)²)/(2σd²)-(I(i,j)-I(k,l))²/(2σr²));
where ID(i, j) represents the pixel value, after joint bilateral filtering, of the pixel point with coordinates (i, j); (k, l) represents the pixel coordinates of the other points in the filtering window; I(i, j) represents the pixel value of the centre point and I(k, l) the pixel values of the remaining points; and w(i, j, k, l) is the product of a Gaussian spatial-distribution function and a Gaussian function of pixel-intensity similarity.
The joint bilateral filter is defined as follows:
σFmax(p)=Σq Gs(p,q)·Gr(λmax(p)-λmax(q))·σmax(q) / Σq Gs(p,q)·Gr(λmax(p)-λmax(q));
here the spatial term Gs(p, q) is related only to the coordinates of the pixel points p(i, j) and q(k, l); the range term uses the guide values λmax(p) and λmax(q) in place of the intensity similarity of the ordinary bilateral filter; and σmax(q), which is substituted into the formula, corresponds to the I(k, l) part of the bilateral filter and represents the pixel value at point q.
The joint bilateral filter is applied to the σmax grayscale image, which is updated iteratively in the algorithm flow. The final result is a grayscale map of σmax values, defined by σmax = max(σr, σg, σb); σmax is a decimal between 0 and 1 and comes from the channel with the largest chroma among the r, g, b channels. Because the r, g, b pixel values of every point are involved, computing the σ values for the whole image is equivalent to processing the r, g, b channels at the same time. After the iteration, an overall grayscale map of σmax values is obtained; multiplying σmax by 255 and combining the result with the remaining two channels of each pixel yields the image C with the highlight removed.
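Putting S22 to S27 together, a rough sketch of the highlight-removal loop is given below; it assumes the opencv-contrib ximgproc module for the joint bilateral filtering step, and the filter parameters, iteration limit and convergence test are illustrative assumptions rather than the invention's exact settings.

```python
import cv2
import numpy as np

def remove_highlight(B, iters=5, eps=1e-3):
    """Rough sketch of S22-S27: iteratively replace sigma_max by its joint-bilaterally
    filtered version (guided by lambda_max, formula 2-17) and rebuild the image."""
    Bf = B.astype(np.float32) + 1e-6
    total = Bf.sum(axis=2)
    sigma = Bf / total[..., None]                                   # formula 2-6
    sigma_max = sigma.max(axis=2).astype(np.float32)                # grayscale image I
    sigma_min = sigma.min(axis=2)
    lam = (sigma - sigma_min[..., None]) / np.maximum(1.0 - 3.0 * sigma_min[..., None], 1e-6)
    guide = lam.max(axis=2).astype(np.float32)                      # grayscale image II
    for _ in range(iters):
        filtered = cv2.ximgproc.jointBilateralFilter(guide, sigma_max, 9, 0.1, 5)
        updated = np.maximum(sigma_max, filtered)                   # formula 2-17
        converged = np.max(np.abs(updated - sigma_max)) < eps       # assumed test for S26
        sigma_max = updated
        if converged:
            break
    # S27: write sigma_max * 255 back into the channel each pixel's maximum came from
    C = B.astype(np.float32).copy()
    idx = Bf.argmax(axis=2)
    rows, cols = np.indices(idx.shape)
    C[rows, cols, idx] = sigma_max * 255.0
    return np.clip(C, 0, 255).astype(np.uint8)
```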
A face image is taken as an effect test; its skin oil-gloss grading is level three, so the oil-removal parameter is set to 2. The original face image and the effect after processing by the oil-removal operator are shown in Fig. 9a and Fig. 9b: the left image is the original and the right image is the result after the oil-removal operator. As can be seen from Fig. 9a and Fig. 9b, the oil-removal operator implemented on the basis of the specular-highlight-removal algorithm provided by the invention clearly removes the skin reflections and oil gloss in the forehead and left-cheek regions.
Specifically, the whitening process of the whitening operator in S3 on the image C is as follows:
S31: perform skin-type classification on the image C; the skin-colour grades are divided into level four, level three, level two and level one, and the grades are assigned the values β = 3, 2, 1 and 0 in sequence;
S32: normalize the pixel values of the R, G, B channels of the image C layer by layer, in the following way (formula 3-1):
w(x,y)=f(x,y)/255 (3-1);
where f(x, y) represents the pixel value of each pixel point of the input image C and w(x, y) ∈ [0,1] represents the output image C′, which has R, G, B three layers in total;
S33: enhance the input image C′ using formula 3-2:
v(x,y)=log(β*w(x,y)+1)/log(β+1) (3-2);
where w (x, y) is the input image C', v (x, y) is the output image D after whitening, and β is a parameter for controlling the degree of whitening. The effect of the algorithm on the adjustment of the luminance component when β takes 1, 2, 3, respectively, is shown in fig. 10. The parameter is mapped to a skin color classification result. When the classification is four levels, β has a value of 3; when the classification is tertiary, β has a value of 2; when the classification is secondary, β has a value of 1; when the classification is one-level, β has a value of 0, and no calculation is performed.
Experiment and analysis of whitening operator effect:
the experimental result is divided into three parts, and the difference of comparison graphs before and after skin beautification is displayed by sampling and comparison. And then verifying the effectiveness of the whole algorithm by defining the algorithm action rate. And finally, verifying whether the algorithm achieves the quality of post-processing of the professional portrait or not through blind test scoring of the professional.
① Sampling comparison
The faces of a woman and a man are taken as examples. The original images are shown in Fig. 11a and Fig. 12a and the processed images in Fig. 11b and Fig. 12b. The processing effect of the skin-beautifying model composed of the buffing, whitening and oil-removal operators is obvious, and the skin quality and appearance are greatly improved.
② Algorithm action rate
The test set contains 145 face images. The comparison of the amounts of data in each class for the different indexes before and after beautification is shown in Table 7. For each skin-type index, the number of images classified as level one after beautification divided by the total number is defined as the action rate of the algorithm, as shown in formula 4.18, i.e. the proportion of images on the current data set for which the algorithm achieves the skin-beautifying effect. The algorithm action rate directly quantifies the effect achieved by the model provided by the invention and whether it can meet the target and requirements of automatic skin beautification and retouching.
p=Count1/Σi Counti (4.18)
where p is the action rate of the algorithm, Count1 is the number of pictures classified as level one, and Σi Counti is the total number of pictures classified as level one, two, three and four.
TABLE 7 test set skin beautifying algorithm action rate
Skin colour: 97.24%; oil gloss: 97.93%; wrinkles: 96.55%; pores: 95.17% (action rates on the 145-image test set).
As can be seen from Table 7, the action rates of the whitening, oil-removal and buffing algorithms on skin colour, gloss, wrinkles and pores reach 97.24%, 97.93%, 96.55% and 95.17% respectively, which shows that the deep-learning-based skin segmentation, skin-type classification and skin-beautifying algorithms provided by the invention have an obvious skin-beautifying effect and are suitable for automatic skin-beautifying tasks in large-scale scenarios.
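As a quick check of formula 4.18, the action rate is simply the level-one count divided by the total; the counts below are illustrative numbers chosen to reproduce the 97.24 % figure for the 145-image test set.

```python
def action_rate(level_counts):
    """p = Count1 / sum of Counti over all four levels (formula 4.18)."""
    return level_counts[0] / sum(level_counts)

# Illustrative counts: 141 of 145 images at level one gives 97.24 %
print(round(action_rate([141, 3, 1, 0]) * 100, 2))
```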
③ Professional blind test
To further verify the effect of the overall algorithm, professional retouchers were invited to perform blind-test scoring on the original and beautified images of the test set. The labels of all images were removed before scoring, leaving only shuffled numeric identifiers, and the scores were tallied by identifier after scoring.
The scoring criteria were skin beautification, texture retention, fair skin tone, blemish removal. The scoring results are shown in fig. 13 below.
As shown in Fig. 13, the 145 face images were scored out of 10 points; the average score of the original images was 7.79 and the average score after skin beautification was 9.09. The original images show large score fluctuations and generally low scores owing to factors such as photographer skill, lighting and skin condition. The scores after beautification are high and evenly distributed, showing that the algorithm classifies the skin accurately, the parameters guiding the skin-beautifying algorithm are not biased, and the skin-beautifying algorithm works well.
Based on the self-evaluation results and the blind-test scores of professional retouchers, the invention is shown to have obvious effects in buffing, texture retention, whitening, oil-gloss removal and other respects.
The method for classifying the skin types of the original human face image comprises the following steps:
A plurality of feature points are defined in the original face image and connected in sequence to form a polygon; the resulting mask is defined as the complete face region Mpoints. The mask of the whole-body skin region of the human body is Mhuman, and the mask of the face skin region is Mface:
Mface=Mpoints∩Mhuman (3-3);
The 81 aligned feature points are obtained using the TensorFlow-based deep-neural-network face-detection algorithm provided by OpenCV and the face-alignment algorithm proposed by Adrian Bulat. The points of the outermost frame of the face are connected in sequence to form a polygon; the resulting mask is the complete face region, defined as Mpoints, as shown by the outer-frame polygon in Fig. 3.
The face is affected by factors such as hair, glasses, ornaments and lighting shadows, which makes skin-type classification inaccurate; therefore, on the basis of the key-point positioning segmentation, the intersection with the result of whole-body skin segmentation must be taken to obtain the final face skin region.
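The intersection of formula 3-3 is a per-pixel AND of the two binary masks; a minimal NumPy sketch (array names are illustrative):

```python
import numpy as np

def face_skin_mask(m_points: np.ndarray, m_human: np.ndarray) -> np.ndarray:
    """Mface = Mpoints ∩ Mhuman (formula 3-3) for 255/0 binary masks."""
    return np.where((m_points > 0) & (m_human > 0), 255, 0).astype(np.uint8)
```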
S32: the mask image of the face skin region is classified along four dimensions (skin colour, oil gloss, wrinkles and pores), as follows.
The skin types of human skin are various and can be divided into a plurality of types according to four dimensions of skin color, gloss, wrinkles and pores. In the beauty task, firstly, the skin type is judged, and then parameters of an algorithm for processing different flaws are determined.
Skin color: current research on human skin color focuses mainly on medical diagnosis, face comparison, expression recognition and similar fields. The skin color grading proposed by the invention serves to better determine the parameters of the beautifying algorithm and differs from standard skin color grading. In portrait photography, the same person's skin color can appear different because of differences in illumination, shooting equipment and shooting parameters. The invention therefore classifies skin color according to the shade and color reflected in the image rather than the human body itself.
The skin color grades are divided into level four, level three, level two and level one, assigned 1, 2, 3 and 0 in sequence. Level four is dark skin or skin darkened by shadows at shooting time; level three is yellowish skin caused by complexion, ambient light or white balance settings; level two is fair skin or skin whitened by overexposure; level one is normal skin color that needs no adjustment, as shown in fig. 4.
The gloss grades are classified into four-level gloss, three-level gloss, two-level gloss and one-level gloss, and each gloss grade is assigned with 1, 2, 3 and 0 in sequence.
In portrait photography, the facial highlight region is the region with the highest mean L value in the Lab color space. The exposure of the photograph can be judged from the L value of the highlight region and is generally classified as underexposure, normal exposure or overexposure. In later retouching, underexposed photographs need to be brightened and overexposed ones darkened.
Because oily skin secretes grease that reflects light during imaging, a gloss (reflection) region often appears together with the facial highlight region. The parameters of the de-glossing algorithm are determined from the gloss grade.
Level-four gloss means heavy grease secretion and strong reflection in the portrait; level-one gloss means very little grease secretion and no visible reflection, as shown in fig. 5.
The wrinkle grades are divided into four grades, three grades, two grades and one grade, and each wrinkle grade is sequentially assigned with 1, 2, 3 and 0.
Wrinkles appear at different grades because people are at different age stages. Many computer-vision-based methods for quantifying wrinkles have been proposed at home and abroad, but they are strongly affected by illumination, shadow and resolution at shooting time, and their detection is unstable. The buffing algorithm focuses on wrinkles in the skin, so the accuracy of wrinkle grading directly determines its effectiveness. Level four has the most wrinkles and the deepest texture and is the highest grade; level one has few wrinkles and very light texture and is the lowest grade, as shown in fig. 6.
Pore grades are divided into four grades, three grades, two grades and one grade, and each pore grade is sequentially assigned with 1, 2, 3 and 0.
Rough skin is also a key target of the buffing algorithm. The size and number of pores reflect whether the skin is smooth and fine. Skin condition varies greatly between people, so roughness is divided into level four, level three, level two and level one. Level four represents rough skin with obvious pores, and level one represents smooth, fine skin, as shown in fig. 7.
S33: in the original face image, the forehead, left cheek, right cheek and chin are selected as regions of interest, a weight is set in each region for each of skin color, gloss, wrinkles and pores, and the grade assignment is then computed with the following formula, the value of σ being taken as the grade assignment:
σ = w1·β1 + w2·β2 + w3·β3 + w4·β4
computed separately for each of the four indexes (skin color, gloss, wrinkles and pores), where βγ is the grade assignment (0-3) of the index measured in region γ, and wγ, γ = 1, 2, 3, 4, are the weights of that index in the forehead, left cheek, right cheek and chin regions respectively.
In a portrait photograph, after the face bounding box is detected and the facial key points are aligned, the regions of interest are selected, and the parameters of the beautifying algorithm are finally determined from the skin classification indexes.
When the skin is graded by index, different face regions carry different weights: the forehead highlight region usually shows heavy gloss and bright skin color, the cheek regions usually show heavy gloss and pronounced wrinkles, and the chin region usually shows light gloss and light wrinkles. To always select skin regions that are little affected by lighting, shadow, shooting angle and similar factors, the forehead, left cheek, right cheek and chin are taken as regions of interest, and the weight matrix in Table 8 below is set empirically for computing the indexes over these four regions.
TABLE 8 Weights of the skin type indexes over the face regions of interest

             Forehead   Left cheek   Right cheek   Chin
Skin color     0.35        0.25         0.25       0.15
Gloss          0.4         0.2          0.2        0.1
Wrinkles       0.2         0.3          0.3        0.2
Pores          0.2         0.3          0.3        0.2
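As a minimal sketch of the weighted grading described above, with the weight values taken from Table 8; the dictionary keys, function name and example grades are illustrative.

```python
# Weights from Table 8, ordered (forehead, left cheek, right cheek, chin).
WEIGHTS = {
    "skin_color": (0.35, 0.25, 0.25, 0.15),
    "gloss":      (0.4,  0.2,  0.2,  0.1),
    "wrinkle":    (0.2,  0.3,  0.3,  0.2),
    "pore":       (0.2,  0.3,  0.3,  0.2),
}

def weighted_grade(index, region_grades):
    """sigma = sum over the four ROIs of weight * grade assignment (0-3)."""
    return sum(w * g for w, g in zip(WEIGHTS[index], region_grades))

# Example: gloss grades 3, 2, 2, 1 in forehead/left cheek/right cheek/chin -> sigma = 2.1
sigma_gloss = weighted_grade("gloss", (3, 2, 2, 1))
```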
The forehead, left cheek, right cheek and chin regions of interest can be extracted as follows:
the face key points are expressed as Loci = (xi, yi), i = 1, 2, ..., 81, where xi and yi are the horizontal and vertical coordinates of point i; the regions they cover are listed in Table 9 below.
TABLE 9 Face regions corresponding to the key points

Key point range    Face region
Loc1 ~ Loc17       Cheek edge (face contour)
Loc18 ~ Loc22      Left eyebrow
Loc23 ~ Loc27      Right eyebrow
Loc28 ~ Loc36      Nose
Loc37 ~ Loc42      Left eye
Loc43 ~ Loc48      Right eye
Loc49 ~ Loc68      Mouth
Loc69 ~ Loc81      Forehead
In the face skin classification task, taking the whole face as input is disturbed by pose, shadow and so on, so a division into four regions of interest (ROI) is proposed; a schematic diagram is shown in fig. 3. Let (Rectilx, Rectily) and (Rectirx, Rectiry), i = 1, 2, 3, 4, denote the upper-left and lower-right corners of the rectangles for the forehead, left cheek, right cheek and chin respectively.
Forehead: (Rect1lx, Rect1ly) = (x21, max(y71, y72, y81)), (Rect1rx, Rect1ry) = (x24, min(y21, y24)).
Left cheek: (Rect2lx, Rect2ly) = (x37, y29), (Rect2rx, Rect2ry) = (x32, y32).
Right cheek: (Rect3lx, Rect3ly) = (x36, y29), (Rect3rx, Rect3ry) = (x46, y32).
Chin: (Rect4lx, Rect4ly) = (x8, max(y57, y58, y59)), (Rect4rx, Rect4ry) = (x10, min(y8, y9, y10)).
The four regions are shown as the inner rectangles in fig. 3.
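A minimal sketch of the ROI extraction just described, assuming the 81 key points are stored 1-indexed as Loci = (xi, yi); the helper names are illustrative, and the min/max in crop() guards against either y-axis convention.

```python
import numpy as np

def roi_rects(loc):
    """Return (lx, ly, rx, ry) rectangles for forehead, left cheek, right cheek, chin.

    loc -- mapping (dict or 1-indexed array) from key point index i to (x_i, y_i)
    """
    x = lambda i: loc[i][0]
    y = lambda i: loc[i][1]
    forehead    = (x(21), max(y(71), y(72), y(81)), x(24), min(y(21), y(24)))
    left_cheek  = (x(37), y(29), x(32), y(32))
    right_cheek = (x(36), y(29), x(46), y(32))
    chin        = (x(8),  max(y(57), y(58), y(59)), x(10), min(y(8), y(9), y(10)))
    return forehead, left_cheek, right_cheek, chin

def crop(image, rect):
    """Crop one ROI from an H x W x 3 image array."""
    lx, ly, rx, ry = rect
    return image[min(ly, ry):max(ly, ry), min(lx, rx):max(lx, rx)]
```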
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope, and all such modifications should be covered by the claims of the present invention.

Claims (9)

1. A skin beautifying method based on a human face image of high contrast buffing is characterized in that: the method comprises the following steps:
s1: obtaining an original human face image S, and performing buffing processing on the original human face image by adopting a high-contrast buffing operator to obtain an image B;
s2: inputting the image B into an oil removing operator to remove oil to obtain an image C;
s3: and inputting the image C into a whitening operator to obtain an image D subjected to whitening treatment to obtain a final skin-beautified image, and outputting the image D.
2. The method for beautifying the skin based on the high-contrast dermabrasion human face image as claimed in claim 1, wherein the dermabrasion operator in S1 dermabrades the human face original image S as follows:
s11: carrying out edge-preserving filtering on the face original image S to obtain an image A;
s12: performing a high-contrast calculation on the image A and the original face image S to obtain an image F, wherein the calculation formula is as follows:
F=S-Gauss(S,r1)+128 (1-1);
s13: performing Gaussian blur with radius r2 on the image F to obtain an image H;
s14: performing a linear-light layer blending calculation on the image H and the image S to obtain an image E, wherein the linear-light blending formula (1-2) is as follows:
E=2*H+S-256 (1-2);
s15: carrying out skin detection and face detection on the image A to obtain a skin mask image M;
s16: fusing the image A and the image E according to the skin mask image M to obtain an image G, wherein the fusion formula (1-3) is as follows:
G=A*(1-M)+M*E (1-3);
s17: fusing the image G and the image S according to the parameter r3 to obtain the final output image B, wherein the fusion formula (1-4) is as follows:
B=S*(1-r3)+G*r3 (1-4).
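For illustration only, a minimal Python/OpenCV sketch of steps S11-S17; the bilateral filter used for the edge-preserving step and the values of r1, r2 and r3 are assumptions, not values taken from the patent.

```python
import cv2
import numpy as np

def high_contrast_buffing(S, skin_mask, r1=10, r2=5, r3=0.65):
    """Sketch of S11-S17; S is the original face image, skin_mask the mask M from S15."""
    S = S.astype(np.float32)
    M = (skin_mask.astype(np.float32) / 255.0)[..., None]

    A = cv2.bilateralFilter(S, 0, 30, r1)                 # S11: edge-preserving filter (stand-in)
    F = S - cv2.GaussianBlur(S, (0, 0), r1) + 128         # S12: high-contrast image, formula (1-1)
    H = cv2.GaussianBlur(F, (0, 0), r2)                   # S13: blur the high-contrast image
    E = 2 * H + S - 256                                   # S14: linear-light blend, formula (1-2)
    G = A * (1 - M) + M * E                               # S16: mask-guided fusion, formula (1-3)
    B = S * (1 - r3) + G * r3                             # S17: final blend, formula (1-4)
    return np.clip(B, 0, 255).astype(np.uint8)
```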
3. The method for beautifying the skin based on the high-contrast dermabrasion human face image as claimed in claim 2, wherein the process of performing Gaussian blur with radius r2 on the image F in S13 to obtain the image H is as follows:
s131: dividing the image F into grayscale images of three channels of RGB;
s132: performing Gaussian filtering on all pixel points of the three grayscale images obtained in S131 using formula (1-5):
G(x, y) = (1/(2πσ^2)) * exp(-(x^2 + y^2)/(2σ^2)) (1-5);
wherein (x, y) are the coordinates of a point in the filter window and σ is a manually set parameter value;
s133: merging the three channel images obtained by the Gaussian filtering in S132 to obtain the image H.
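A small sketch of S131-S133 with the kernel of formula (1-5) written out explicitly; the radius and σ values are illustrative.

```python
import cv2
import numpy as np

def gaussian_kernel(radius, sigma):
    """Kernel of formula (1-5): G(x, y) = exp(-(x^2 + y^2) / (2 sigma^2)) / (2 pi sigma^2)."""
    ax = np.arange(-radius, radius + 1, dtype=np.float32)
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)
    return k / k.sum()  # normalize so the weights sum to 1

def gaussian_blur_per_channel(img, radius=5, sigma=2.0):
    """S131-S133: split into the three channels, filter each, then re-stack."""
    k = gaussian_kernel(radius, sigma)
    planes = cv2.split(img.astype(np.float32))
    return cv2.merge([cv2.filter2D(p, -1, k) for p in planes])
```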
4. The method for beautifying the skin based on the high contrast dermabrasion human face image as claimed in claim 2, wherein the process of performing skin detection and face detection on the image A in S15 to obtain the skin mask image M of the human face is as follows:
s151: establishing a skin segmentation model, wherein the model comprises a backbone network for extracting image characteristics, a generator for generating a mask image and a discriminator for judging the accuracy of the generated mask image;
adopting U-NET++ as the skin segmentation model and ResNet-101 as the backbone network;
acquiring a plurality of face original images, marking the face original images, marking a human body area in each face original image to obtain a first mask image as a first training sample, and marking a skin part in the first training sample to obtain a second mask image as a second training sample;
s152: initializing skin segmentation model parameters;
s153: inputting each first training sample into a backbone network to obtain a plurality of corresponding characteristic graphs with different scales;
s154: inputting a plurality of feature maps with different scales corresponding to each first training sample into a generator to obtain a first prediction mask map of each first training sample;
s155: inputting the first prediction mask image and the first mask image of each first training sample into a discriminator to obtain prediction accuracy, obtaining a suboptimal skin segmentation model and executing the next step when the accuracy reaches a set accuracy threshold, and otherwise, updating parameters of the skin segmentation model and returning to S153;
s156: inputting each second training sample into a backbone network in a suboptimal skin segmentation model to obtain a plurality of corresponding feature maps with different scales;
s157: inputting a plurality of feature maps with different scales corresponding to each second training sample into a generator in a suboptimal skin segmentation model to obtain a second prediction mask map of each second training sample;
s158: inputting the second prediction mask image and the second mask image of each second training sample into a discriminator in the suboptimal skin segmentation model to obtain prediction accuracy, obtaining the optimal skin segmentation model and executing the next step when the accuracy reaches a set accuracy threshold, and otherwise, updating parameters of the skin segmentation model and returning to S156;
s159: and inputting the image A into the optimal skin segmentation model to obtain a skin mask image M.
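One possible way to instantiate the U-NET++ model with a ResNet-101 backbone mentioned in this claim is sketched below; the segmentation_models_pytorch package and all arguments are assumptions, since the patent does not name a specific implementation.

```python
import segmentation_models_pytorch as smp

# U-NET++ decoder on a ResNet-101 encoder; one output channel for the binary skin mask.
skin_segmentation_model = smp.UnetPlusPlus(
    encoder_name="resnet101",
    encoder_weights="imagenet",
    in_channels=3,
    classes=1,
    activation="sigmoid",
)
```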
5. A method of skin beautifying based on high contrast dermabrasion face image as claimed in claim 4, wherein the process of the oil-removing operator removing gloss from the image B in S2 is as follows:
s21, performing skin type classification on the image B, and determining the oil light level of the image B;
the oil gloss grades are classified into a fourth-level oil gloss, a third-level oil gloss, a second-level oil gloss and a first-level oil gloss, and each oil gloss grade is assigned with a value in sequence;
s22: calculating the maximum chromaticity σmax at each pixel point of the image B and storing the result as a grayscale image I;
s23: calculating the maximum value λmax of the approximate diffuse reflection chromaticity at each pixel point of the image B and storing the result as a grayscale image II;
s24: using the grayscale image II as a guide image, applying a joint bilateral filter to the grayscale image I, and storing the filtered result as a preprocessed image;
s25: for each pixel p in the preprocessed image, comparing the filtered value σ'max(p) with σmax(p) and taking the maximum, as shown in formula (2-17):
σmax(p) = max(σmax(p), σ'max(p)) (2-17);
s26: repeating steps S24 and S25 until σmax(p) = σ'max(p) for each pixel, and then executing the next step;
s27: determining, for each pixel point p in the preprocessed image, the RGB channel selected by σmax(p); the pixel of the selected channel of each pixel point p is σmax(p) multiplied by 255, and the pixels of the two unselected channels of each pixel point p are then combined with the pixel of the selected channel to obtain the preprocessed image;
s28: performing skin type classification on the preprocessed image to determine its gloss grade; when the gloss grade of the preprocessed image is lower than that of the image B in S21 and not higher than a preset gloss grade threshold, outputting the preprocessed image as the image C, and otherwise updating the image B with the preprocessed image and returning to step S22.
6. A method of skin beautifying based on an image of a high contrast dermabrasion human face as defined in claim 5, wherein the process of calculating the maximum chromaticity σmax at each pixel point of the image B in S22 is as follows:
the reflected light color J in the RGB color space is represented as a linear combination of the diffuse reflection value J^D and the specular reflection value J^S, formula (2-5):
J = J^D + J^S (2-5);
the chromaticity is defined as a color component σc, formula (2-6):
σc = Jc / (Jr + Jg + Jb) (2-6);
where c ∈ {r, g, b} and Jc is the reflected light color of channel c;
the diffuse reflection chromaticity Λc and the illumination chromaticity Γc are defined by formulas (2-7) and (2-8):
Λc = Jc^D / (Jr^D + Jg^D + Jb^D) (2-7);
Γc = Jc^S / (Jr^S + Jg^S + Jb^S) (2-8);
where Jc^D denotes the diffuse reflection component and Jc^S denotes the specular reflection component;
according to the above formulas, the reflected light color Jc is expressed as formula (2-9):
Jc = Λc Σu Ju^D + Γc Σu Ju^S (2-9);
where u denotes a layer and can be the r, g or b layer, Ju^D is the diffuse reflection component of layer u and Ju^S is the specular reflection component of layer u;
using a white estimate of the illumination chromaticity, the input image B is normalized so that Γr = Γg = Γb = 1/3 and Jr^S = Jg^S = Jb^S, where Γr, Γg and Γb are the illumination chromaticities of the r, g and b layers and Jr^S, Jg^S and Jb^S are their specular reflection values;
the diffuse reflection component then follows from the previous formula as formula (2-10):
Jc^D = Λc Σu Ju^D (2-10);
where Jc^D is the diffuse reflection value of the c-th layer;
the maximum chromaticity is defined by formula (2-11):
σmax = max(σr, σg, σb) (2-11);
where σr, σg and σb are the chromaticities of the r, g and b layers;
the maximum diffuse reflection chromaticity is defined by formula (2-12):
Λmax = max(Λr, Λg, Λb) (2-12);
where Λr, Λg and Λb are the diffuse reflection chromaticities of the r, g and b layers;
the diffuse reflection component may be ΛmaxExpressed as equations 2-13:
Figure FDA0003413667670000043
Λmaxin the range of
Figure FDA0003413667670000044
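A minimal sketch of the σmax computation of step S22, following formulas (2-6) and (2-11); the small epsilon that avoids division by zero is an implementation assumption.

```python
import numpy as np

def max_chromaticity(B):
    """Gray image I of S22: per-pixel sigma_max = max(sigma_r, sigma_g, sigma_b), formula (2-11),
    with sigma_c = J_c / (J_r + J_g + J_b), formula (2-6)."""
    J = B.astype(np.float32) + 1e-6              # avoid division by zero on black pixels
    sigma = J / J.sum(axis=-1, keepdims=True)
    return sigma.max(axis=-1)
```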
7. A method of skin beautifying based on an image of a high contrast dermabrasion human face as defined in claim 5, wherein: in said S22, the maximum value λ of the approximate diffuse reflectance chromaticity at each pixel point of the image B is calculatedmaxThe process of (2) is as follows:
let sigmamin=min(σrgb) Using λcTo estimate ΛcThe equations 2-14 are calculated as follows:
Figure FDA0003413667670000045
λcintermediate variables, with no actual meaning;
the relationship between the approximate diffuse reflection chromaticity λc and the true diffuse reflection chromaticity Λc is described by 1) and 2):
1) for any two pixels p and q, if Λc(p) = Λc(q), then λc(p) = λc(q);
2) for any two pixels p and q, if λc(p) = λc(q), then Λc(p) = Λc(q) only when Λmin(p) = Λmin(q);
the maximum value of the approximate diffuse reflection chromaticity is formula (2-15):
λmax = max(λr, λg, λb) (2-15);
where λr, λg and λb are the approximate diffuse reflection chromaticities of the r, g and b layers;
using the approximate maximum diffuse reflection chromaticity as the smoothing parameter, the filtered maximum chromaticity σ'max is calculated as formula (2-16):
σ'max(p) = Σq∈Ω Gs(p, q)·Gr(λmax(p) - λmax(q))·σmax(q) / Σq∈Ω Gs(p, q)·Gr(λmax(p) - λmax(q)) (2-16);
where σ'max(p) is the filtered value at pixel point p, Ω is the filter window, and Gs and Gr are the spatial and intensity-distance weighting functions, which are typically Gaussian.
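A matching sketch of the λmax computation of step S23, following the reading of formulas (2-14) and (2-15) given above; the epsilon guard is again an assumption.

```python
import numpy as np

def approx_max_diffuse_chromaticity(B):
    """Gray image II of S23: lambda_max per pixel.
    From (2-14) and (2-15), lambda_max = (sigma_max - sigma_min) / (1 - 3 * sigma_min)."""
    J = B.astype(np.float32) + 1e-6
    sigma = J / J.sum(axis=-1, keepdims=True)
    s_min = sigma.min(axis=-1)
    s_max = sigma.max(axis=-1)
    return (s_max - s_min) / (1.0 - 3.0 * s_min + 1e-6)
```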
8. A method of skin beautifying based on high contrast dermabrasion face image as claimed in claim 7, wherein the filtering process of applying the joint bilateral filter to the grayscale image I in S24, with the grayscale image II as the guide image, is as follows:
ID(i, j) = Σ(k,l)∈Ω I(k, l)·w(i, j, k, l) / Σ(k,l)∈Ω w(i, j, k, l);
where ID(i, j) is the pixel value of the pixel point with coordinates (i, j) after joint bilateral filtering, (k, l) are the coordinates of the other points in the filter window Ω, I(i, j) is the pixel value of the center point, I(k, l) are the pixel values of the remaining points, and w(i, j, k, l) is the product of a Gaussian spatial-distance function and a Gaussian function of pixel-intensity similarity;
the joint bilateral filter is defined as follows:
w(i, j, k, l) = exp(-((i - k)^2 + (j - l)^2) / (2σd^2) - (Ig(i, j) - Ig(k, l))^2 / (2σr^2));
the first term is related only to the coordinates of the pixel points p(i, j) and q(k, l), while the similarity term is evaluated on the guide image Ig, i.e. the grayscale image II; substituting into the formula, σmax(q) takes the place of I(k, l) in the bilateral filter and represents the pixel value at point q.
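A sketch of one S24/S25 pass using the joint bilateral filter from opencv-contrib (cv2.ximgproc); the window size and the two sigmas are illustrative, and the convergence test for S26 is only indicated.

```python
import cv2
import numpy as np

def filter_and_update(sigma_max, lambda_max, d=15, sigma_color=0.05, sigma_space=8.0):
    """Joint bilateral filtering of gray image I (sigma_max) guided by gray image II
    (lambda_max), followed by the per-pixel max update of formula (2-17)."""
    filtered = cv2.ximgproc.jointBilateralFilter(
        lambda_max.astype(np.float32),   # guide image (gray image II)
        sigma_max.astype(np.float32),    # image to be filtered (gray image I)
        d, sigma_color, sigma_space)
    return np.maximum(sigma_max, filtered)   # formula (2-17)
```

Step S26 would call filter_and_update repeatedly, stopping once the update no longer changes any pixel, for example when np.max(np.abs(updated - sigma_max)) falls below a small tolerance.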
9. A method of skin beautifying based on high contrast dermabrasion face image according to claim 1 or 8, characterized by: the whitening process of the whitening operator in S3 on the image C is as follows:
s31: performing skin type classification on the image C, wherein the skin color grades are divided into level four, level three, level two and level one, and the grades are assigned the values β = 3, 2, 1 and 0 in sequence;
s32: normalizing the pixel values of the R, G, B channels of the image C layer by layer as follows:
w(x, y) = f(x, y) / 255 (3-1);
where f(x, y) is the pixel value of each pixel point of the input image C and w(x, y) ∈ [0, 1] is the corresponding value of the output image C', which has the three layers R, G and B;
s33: the input image C' is enhanced using the following formula, as in formula 3-2:
Figure FDA00034136676700000510
where w (x, y) is the input image C', v (x, y) is the output image D after whitening, and β is a parameter for controlling the degree of whitening.
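The enhancement curve of formula (3-2) is reproduced only as an image in this text and is not legible here, so the sketch below substitutes a commonly used logarithmic whitening curve; both the curve and the clamping are assumptions, while the normalization of (3-1) and the role of β follow the claim.

```python
import numpy as np

def whiten(C, beta):
    """Sketch of S31-S33; beta is the skin color grade assignment (0 = no adjustment)."""
    w = C.astype(np.float32) / 255.0                      # S32, formula (3-1): normalize to [0, 1]
    if beta <= 0:
        v = w                                             # grade 0: normal skin color, unchanged
    else:
        v = np.log(w * beta + 1.0) / np.log(beta + 1.0)   # assumed stand-in for formula (3-2)
    return np.clip(v * 255.0, 0, 255).astype(np.uint8)
```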
CN202111561470.1A 2021-12-15 2021-12-15 Skin beautifying method based on high-contrast skin grinding face image Active CN114240743B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111561470.1A CN114240743B (en) 2021-12-15 2021-12-15 Skin beautifying method based on high-contrast skin grinding face image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111561470.1A CN114240743B (en) 2021-12-15 2021-12-15 Skin beautifying method based on high-contrast skin grinding face image

Publications (2)

Publication Number Publication Date
CN114240743A true CN114240743A (en) 2022-03-25
CN114240743B CN114240743B (en) 2024-06-25

Family

ID=80759102

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111561470.1A Active CN114240743B (en) 2021-12-15 2021-12-15 Skin beautifying method based on high-contrast skin grinding face image

Country Status (1)

Country Link
CN (1) CN114240743B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105303523A (en) * 2014-12-01 2016-02-03 维沃移动通信有限公司 Image processing method and mobile terminal
CN106228516A (en) * 2016-07-14 2016-12-14 脸萌技术(深圳)有限公司 The most U.S. face method, the device of a kind of high naturalness
WO2019061275A1 (en) * 2017-09-29 2019-04-04 深圳传音通讯有限公司 Skin color processing-based photographing method and photographing apparatus
CN110232670A (en) * 2019-06-19 2019-09-13 重庆大学 A method of the image visual effect enhancing based on low-and high-frequency separation
CN112669197A (en) * 2019-10-16 2021-04-16 顺丰科技有限公司 Image processing method, image processing device, mobile terminal and storage medium
CN111292271A (en) * 2020-02-27 2020-06-16 齐力软件科技(广州)有限公司 Image processing method for beautifying face skin texture

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
VELUSAMY S ET AL.: "Face beautification via dynamic skin smoothing, guided feathering, and texture restoration", 《PROCEEDINGS OF THE IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS》, 31 December 2020 (2020-12-31), pages 530 - 531 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115050065A (en) * 2022-04-26 2022-09-13 杭州缦图摄影有限公司 Curve-based skin filter modeling method
CN115050065B (en) * 2022-04-26 2024-05-31 杭州海马体摄影有限公司 Skin filter modeling method based on curve

Also Published As

Publication number Publication date
CN114240743B (en) 2024-06-25

Similar Documents

Publication Publication Date Title
US10304166B2 (en) Eye beautification under inaccurate localization
WO2022161009A1 (en) Image processing method and apparatus, and storage medium and terminal
US7636485B2 (en) Method and system for enhancing portrait images that are processed in a batch mode
US6980691B2 (en) Correction of “red-eye” effects in images
US9111132B2 (en) Image processing device, image processing method, and control program
US20100026833A1 (en) Automatic face and skin beautification using face detection
US20110002506A1 (en) Eye Beautification
CN111524080A (en) Face skin feature identification method, terminal and computer equipment
CN108932493A (en) A kind of facial skin quality evaluation method
CN113344836B (en) Face image processing method and device, computer readable storage medium and terminal
CN114155569B (en) Cosmetic progress detection method, device, equipment and storage medium
CN114240743B (en) Skin beautifying method based on high-contrast skin grinding face image
CN114219739A (en) High contrast buffing algorithm
CN114202483B (en) Improved additive lee filtering skin grinding method
CN114187207A (en) Skin beautifying method of face image based on additive lee filtering buffing
CN114202482A (en) Method for removing oil and luster from face image
Choudhury et al. Perceptually motivated automatic color contrast enhancement based on color constancy estimation
CN114627003A (en) Method, system, device and storage medium for removing eye fat of face image
Samuelsson Classification of Skin Pixels in Images: Using feature recognition and threshold segmentation
Kamble Foundation Makeup Shade Recommendation using Computer Vision Based on Skin Tone Recognition
Ciuc et al. Objective measures for quality assessment of automatic skin enhancement algorithms
CN116612036A (en) Method for realizing portrait peeling and whitening based on Unity
CN114757892A (en) Perspective material defect detection method and system based on artificial intelligence
CN115909466A (en) Image processing method, image processing device, electronic equipment and storage medium
CN117575959A (en) Image restoration method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant