CN113658088B - Face synthesis method and device based on multiple discriminators - Google Patents

Face synthesis method and device based on multiple discriminators

Info

Publication number
CN113658088B
CN113658088B (application CN202110994564.1A)
Authority
CN
China
Prior art keywords: image, face, face image, gray, real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110994564.1A
Other languages
Chinese (zh)
Other versions
CN113658088A (en)
Inventor
安丽军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Novartis Film Technology Jiangsu Co ltd
Original Assignee
Novartis Film Technology Jiangsu Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Novartis Film Technology Jiangsu Co ltd filed Critical Novartis Film Technology Jiangsu Co ltd
Priority to CN202110994564.1A priority Critical patent/CN113658088B/en
Publication of CN113658088A publication Critical patent/CN113658088A/en
Application granted granted Critical
Publication of CN113658088B publication Critical patent/CN113658088B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10004 Still image; Photographic image
    • G06T 2207/20221 Image fusion; Image merging
    • G06T 2207/30201 Face


Abstract

The invention relates to the technical field of face synthesis and discloses a face synthesis method based on multiple discriminators, comprising the following steps: acquiring real face images, preprocessing them to obtain normalized real face images, and taking the normalized real face images as image set 1; constructing a generator network, generating face images with the generator network, and taking the generated face images as image set 2; inputting the images in image set 1 and image set 2 into a plurality of discriminators; weighting the discriminators by the AHP (Analytic Hierarchy Process), selecting the discriminator with the highest hierarchical weight to judge whether an input image is a face image, adjusting the generator network according to the judgment result, and generating face images with the adjusted optimal generator network. The invention also provides a face synthesis device based on multiple discriminators. The invention realizes the synthesis of face images.

Description

Face synthesis method and device based on multiple discriminators
Technical Field
The invention relates to the technical field of face synthesis, in particular to a face synthesis method and device based on multiple discriminators.
Background
With the rapid development of the internet and digital image-capture devices, daily life generates massive amounts of image data, which has accelerated modern computers' understanding of image content. In the field of face image understanding, face synthesis technology can edit attributes such as age, synthesize face images with unchanged identity characteristics, and reduce the adverse effects of changes in age, hair style and other attributes on cross-age identity authentication, so face synthesis has become a popular field of current research.
Traditional face synthesis is mainly based on feature-expression techniques, including principal component analysis and sparse representation. These methods express a face image as a feature vector or feature matrix and fuse different face feature vectors by weighting to synthesize a new face image, but the quality of the synthesized face image is low.
In view of this, how to synthesize a face image with higher image quality becomes an urgent problem to be solved by those skilled in the art.
Disclosure of Invention
The invention provides a face synthesis method based on multiple discriminators, which preprocesses real face images to obtain normalized face images serving as image set 1; constructs a generator network and uses it to generate face images taken as image set 2; inputs the images in image set 1 and image set 2 into a plurality of discriminators for judgment; judges whether an input image is a face image according to the AHP (Analytic Hierarchy Process); and adjusts the generator network according to the judgment result, so that face images are generated by the generator network.
In order to achieve the above object, the present invention provides a face synthesis method based on multiple discriminators, which includes:
acquiring a real face image, preprocessing the acquired real face image to obtain a normalized real face image, and taking the normalized real face image as image set 1;
constructing a generator network, generating a face image by using the generator network, and taking the generated face image as an image set 2;
inputting the images in the image set 1 and the image set 2 into a plurality of discriminators respectively;
and weighting the discriminators by the AHP (Analytic Hierarchy Process), selecting the discriminator with the highest hierarchical weight according to the discriminators' hierarchical weights to judge whether an input image is a face image, adjusting the generator network according to the judgment result, and generating face images with the adjusted optimal generator network.
Optionally, the preprocessing the acquired real face image includes:
acquiring a real face image and normalizing it so that all face images have a uniform size of M × N pixels; the normalization step includes stretching and rotating the image; in a specific embodiment of the invention, M is 256 and N is 256;
preprocessing the acquired real face images, the preprocessed images being normalized real face images; all normalized real face images are taken as image set 1, and the preprocessing process comprises the following steps:
1) Take the maximum of the three color components of each pixel in the real face image and set it as the gray value of that pixel to obtain the gray-scale image of the real face image; the graying formula is:
Gray(i,j)=max{R(i,j),G(i,j),B(i,j)}
wherein:
(i, j) is a pixel point in the real face image;
r (i, j), G (i, j), B (i, j) are the values of the pixel point (i, j) in the R, G, B three color channels respectively;
gray (i, j) is the Gray value of the pixel point (i, j);
2) For the gray-scale image, the gray levels are linearly stretched by piecewise linear transformation:

G(i,j) = 255 × (Gray(i,j) - MIN_Gray) / (MAX_Gray - MIN_Gray)

wherein:
Gray(i,j) is the original gray value of pixel (i,j);
MIN_Gray is the minimum gray value in the gray-scale image;
MAX_Gray is the maximum gray value in the gray-scale image;
G(i,j) is the gray value of pixel (i,j) after stretching.
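The graying and stretching steps above can be sketched in plain Python (an illustrative sketch, not taken from the patent; the function names are hypothetical):

```python
def to_gray(rgb):
    # Gray(i, j) = max{R(i, j), G(i, j), B(i, j)}
    return [[max(px) for px in row] for row in rgb]

def stretch(gray, out_max=255):
    # Linear stretch of the gray levels to the full [0, out_max] range:
    # G(i, j) = out_max * (Gray(i, j) - MIN_Gray) / (MAX_Gray - MIN_Gray)
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:  # flat image: nothing to stretch
        return [[0] * len(row) for row in gray]
    return [[round((g - lo) * out_max / (hi - lo)) for g in row]
            for row in gray]

img = [[(10, 40, 20), (200, 90, 120)],
       [(60, 60, 60), (30, 150, 80)]]
g = to_gray(img)   # [[40, 200], [60, 150]]
s = stretch(g)     # 40 maps to 0, 200 maps to 255
```

Note that the stretch is a special case of piecewise linear transformation with a single segment covering the whole gray range.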
Optionally, the constructing a generator network, and generating the face image by using the generator network, includes:
labeling the image areas of different facial components in a real face image, constructing an M × N gray matrix Q, and filling the gray value of each pixel of the real face image into the gray matrix according to the pixel's position, obtaining a number of face-image gray matrices; in one embodiment of the invention, the facial components include the eyes, hair, ears, nose, etc. of a human face;
constructing a generator network; in a specific embodiment of the invention, the generator network consists of three convolution layers, a residual layer and a transposed convolution kernel. The convolution kernels in the convolution layers are 3 × 3 pixels; the first convolution layer has 32 filters, the second 64 and the third 128. The residual layer consists of 9 residual blocks, each with stride 1 and size 3 × 3 pixels. The transposed convolution kernel is 3 × 3 pixels and restores the convolved features to an image of the original M × N pixel size;
inputting the gray level matrix of the face image into a generator network, wherein the process of generating the face image by using the generator network comprises the following steps:
1) The convolution layers receive the face-image gray matrix Q and convolve it as follows:

h_1c = Conv1(Q_c)
h_2c = Conv2(h_1c)
h_3c = Conv3(h_2c)

wherein:
Conv1, Conv2, Conv3 are the three convolution layers of the generator network, and Conv(·) denotes convolution of the input;
Q_c is the image gray matrix of facial component c, the facial components comprising eyes, hair, ears, nose and face shape;
h_1c, h_2c, h_3c are the feature maps of facial component c;
in a specific embodiment of the invention, 3 residual blocks follow each convolution layer, so that the residual of the convolved feature map is 0;
2) Combining the feature maps of different facial components, so that the combined feature map contains all facial components with no component duplicated;
3) Inputting the combined feature map h into a normalization layer, and optimizing the positions of the face components in the feature map by the normalization layer:
SP(h_{n,x,y}) = α_{n,x,y}(c) · (h_{n,x,y} - μ_n) / σ_n + β_{n,x,y}(c_m)

wherein:
c denotes a facial component, and c_m denotes the face shape among the facial components;
SP(·) denotes normalization of the input feature map;
h_{n,x,y} is the feature-map value at channel n, width x and height y;
μ_n is the mean of the feature map over channel n;
σ_n is the standard deviation of the feature map over channel n;
α_{n,x,y}(·) is a mapping function in the normalization layer that maps the facial components other than the face shape to their corresponding positions in the face image;
β_{n,x,y}(·) is a mapping function in the normalization layer that maps the face shape into the face shape of the face image;
4) The output of the normalization layer is input to a transposed convolution kernel, mapping the feature map to an M x N pixel size image.
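The normalization step above can be illustrated with a small NumPy sketch. This is an assumption-laden illustration: the per-channel statistics follow the formula, while the modulation maps alpha and beta are simply passed in as arrays rather than the learned mapping functions of the patent:

```python
import numpy as np

def spatial_norm(h, alpha, beta, eps=1e-5):
    # Standardize each channel of the C x H x W feature map h with its own
    # mean (mu_n) and standard deviation (sigma_n), then apply the spatially
    # varying modulation: alpha * (h - mu) / sigma + beta.
    mu = h.mean(axis=(1, 2), keepdims=True)
    sigma = h.std(axis=(1, 2), keepdims=True)
    return alpha * (h - mu) / (sigma + eps) + beta

rng = np.random.default_rng(0)
h = rng.normal(size=(2, 4, 4))
out = spatial_norm(h, np.ones_like(h), np.zeros_like(h))
# with alpha = 1 and beta = 0 every channel comes out with mean ~0, std ~1
```

The design mirrors spatially adaptive normalization: the statistics are per channel, but the modulation can differ at every spatial position, which is what lets component features be steered to particular face locations.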
Optionally, the process of discriminating the image in the image set 2 by using the convolutional neural network discriminator is as follows:
1) Inputting all images in image set 1 and using 4 convolution layers to extract the color features and shape features of the different face images and the proportion features of the facial components; in one embodiment of the invention, the convolution kernels are 3 × 3 pixels, and the four convolution layers have 32, 64, 128 and 256 filters respectively;
2) Inputting the generated images in image set 2 and using the 4 convolution layers to extract each generated image's color feature, shape feature and facial-component proportion feature; for each generated image, computing the maximum similarity between these features and the corresponding features of the images in image set 1, and taking these maxima as the degree to which the generated image meets the real-face-image standard in face skin color, face shape and facial-part proportions;
3) Calculating the similarity between the feature map of each generated image x in image set 2 and the feature maps of the real face images Y = {y_1, y_2, …} in image set 1; the maximum similarity value is taken as the probability that the generated image x is a face image.
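A minimal sketch of this step: score a generated image by its maximum similarity to the real set. Cosine similarity over flattened feature vectors is assumed here purely for illustration; the patent does not specify the similarity measure:

```python
import numpy as np

def face_probability(gen_feat, real_feats):
    # Maximum cosine similarity between the generated image's feature vector
    # and the feature vectors of the real images y_1, y_2, ...
    g = gen_feat / np.linalg.norm(gen_feat)
    return max(float(g @ (r / np.linalg.norm(r))) for r in real_feats)

gen = np.array([1.0, 1.0])
reals = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
p = face_probability(gen, reals)   # both similarities equal 1/sqrt(2)
```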
Optionally, the process of discriminating the image in the image set 2 by using the KNN discriminator is as follows:
calculating the difference value between the pixel matrix of the generated image x in the image set 2 and the pixel matrix of all real face images in the image set 1:
Figure BDA0003233533030000031
wherein:
Z x generating a pixel matrix for image x in image set 2;
Figure BDA0003233533030000032
a pixel matrix of the kth real face image in the image set 1;
Figure BDA0003233533030000033
is the difference between pixel matrixes including a color value pixel matrix and a human face shape part pixelThe matrix, the pixel matrix of the face component part and the pixel matrix of the face image are calculated by a Mahattan distance calculation method;
normalizing the calculated N difference values to obtain 1-d min As the degree of the generated image in which the skin color of the face, the shape of the face, the dimension of the proportion of each part of the face meet the standard of the real face image, and the probability that the generated image x is the face image, wherein d min Representing the minimum difference value between the generated image and the real image in different pixel matrixes.
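The KNN discriminator above can be sketched as follows. This is illustrative: the normalization of the distances is not specified in the text, so dividing by the largest distance is assumed here:

```python
import numpy as np

def knn_face_probability(gen, reals):
    # Manhattan distance between the generated pixel matrix and each real
    # pixel matrix, normalized by the largest distance; the score 1 - d_min
    # is taken as the probability that the generated image is a face.
    dists = np.array([np.abs(gen - r).sum() for r in reals], dtype=float)
    d_min = dists.min() / dists.max() if dists.max() > 0 else 0.0
    return 1.0 - d_min

gen = np.zeros((2, 2))
reals = [np.array([[0.0, 1.0], [0.0, 0.0]]),  # distance 1
         np.full((2, 2), 5.0)]                # distance 20
p = knn_face_probability(gen, reals)          # 1 - 1/20 = 0.95
```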
Optionally, weighting the discriminators by the AHP (Analytic Hierarchy Process) and selecting the discriminator with the highest hierarchical weight to judge whether an input image is a face image comprises:
1) Establishing an AHP hierarchical model, whose target layer is whether an image from image set 2 is a face image, whose criterion layer comprises face skin color, face shape and the proportions of the facial parts, and whose measure layer comprises the convolutional neural network discriminator and the KNN discriminator;
2) Establishing judgment matrices for the criterion layer and the measure layer, where each matrix element compares the importance of two indexes; in one embodiment of the invention, for the criterion layer, the face skin color, face shape and facial-part proportions are denoted a_1, a_2, a_3, and the judgment matrix is:

A = [ 1        a_12     a_13
      1/a_12   1        a_23
      1/a_13   1/a_23   1   ]
wherein:
a_ij is the importance of index a_i relative to index a_j; a higher value indicates that a_i has a greater influence on the target layer;
3) Computing the maximum eigenvalue λ_1 of the criterion-layer judgment matrix and the maximum eigenvalue λ_2 of the measure-layer judgment matrix, and for each computing the consistency index CI = (λ - r)/(r - 1), where r is the order of the judgment matrix, and the consistency ratio CR = CI/RI, where RI is a correction factor; when CR < 0.1 the judgment matrix is considered credible, otherwise it must be revised;
4) And determining the influence weight of the measure layer on the target result layer by layer according to the final judgment matrix, and taking the influence weight as the empowerment of the discriminator.
And selecting a discriminator with the highest hierarchical weight to judge whether the input image is a face image.
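The AHP weighting steps above can be sketched as follows (a sketch under the usual AHP conventions: the normalized principal eigenvector as the weight vector, the standard random-index table for RI; the example matrix values are hypothetical):

```python
import numpy as np

RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90}  # standard AHP random-index table

def ahp_weights(A):
    # Weights = normalized principal eigenvector of the judgment matrix A;
    # CI = (lambda_max - r) / (r - 1), CR = CI / RI, credible when CR < 0.1.
    r = A.shape[0]
    vals, vecs = np.linalg.eig(A)
    k = int(np.argmax(vals.real))
    lam = float(vals[k].real)
    w = np.abs(vecs[:, k].real)
    w /= w.sum()
    ci = (lam - r) / (r - 1)
    cr = ci / RI[r] if RI[r] > 0 else 0.0
    return w, lam, cr

# Hypothetical judgment matrix over skin color, face shape, part proportions;
# it is perfectly consistent (a_ik = a_ij * a_jk), so lambda_max = r and CR = 0.
A = np.array([[1.0, 2.0, 4.0],
              [0.5, 1.0, 2.0],
              [0.25, 0.5, 1.0]])
w, lam, cr = ahp_weights(A)   # w is approximately [4/7, 2/7, 1/7]
```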
In addition, to achieve the above object, the present invention further provides a face synthesis apparatus based on multiple discriminators, the apparatus comprising:
the image acquisition device is used for acquiring a real face image;
the image processor is used for preprocessing the acquired real face image to obtain a normalized real face image, and taking the normalized real face image as a data set 1; inputting the images in the image set 1 and the image set 2 into a plurality of discriminators respectively;
the face image synthesis device is used for constructing a generator network, generating face images with the generator network and taking them as image set 2; weighting the discriminators by the AHP (Analytic Hierarchy Process), selecting the discriminator with the highest hierarchical weight according to the discriminators' hierarchical weights to judge whether an input image is a face image, adjusting the generator network according to the judgment result, and generating face images with the adjusted optimal generator network.
In addition, to achieve the above object, the present invention also provides a computer readable storage medium, which stores thereon human face image synthesis program instructions, which are executable by one or more processors to implement the steps of the implementation method of multi-discriminator based human face synthesis as described above.
Compared with the prior art, the invention provides a face synthesis method based on multiple discriminators, which has the following advantages:
firstly, the scheme provides a generator network for generating a face image, and the process of generating the face image by using the generator network comprises the following steps: the convolution layer receives the face image gray matrix and performs convolution processing on the face image gray matrix Q, and the flow of the convolution processing is as follows:
h 1c =Conv1(Q c )
h 2c =Conv2(h 1c )
h 3c =Conv3(h 2c )
wherein: conv, conv1, conv2 are the three convolutional layers of the generator network, conv (·) denotes the convolution of the input values; q c An image gray matrix representing a facial component c, the facial component comprising eyes, hair, ears, nose, facial shape; h is 1c ,h 2c ,h 3c A feature map representing a face component c; in a specific embodiment of the present invention, there are 3 residual blocks behind each convolutional layer, so that the residual of the feature map after convolution is 0; combining feature maps of different face components, wherein the combined feature map comprises all face components and does not have the same face components; inputting the combined feature map h into a normalization layer, and optimizing the positions of the face components in the feature map by the normalization layer:
Figure BDA0003233533030000041
wherein: c denotes a face component, c m Representing a face shape in a face component; SP (-) denotes normalizing the input feature map; h is a total of n,x,y The number of channels representing the feature map is n, the width of the feature map is x, and the height of the feature map is y; mu.s n The average value of the feature map on the feature channel dimension is obtained; sigma n The standard deviation of the feature map on the dimension of the feature channel is taken as the standard deviation; alpha is alpha n,x,y (. For a mapping function in the normalization layer, mapping facial components except the facial shape to the corresponding position of the face image; beta is a n,x,y (. For a mapping function in the normalization layer, the facial shape is mapped into the facial shape of the facial image; inputting the output result of the normalization layer to a transposed convolution kernel so as to input the feature mapThe mapping is to an image of size M x N pixels. Compared with the prior art, the scheme extracts the features of different facial components respectively, the facial components form a feature map for face synthesis, the different facial components are mapped to the positions corresponding to the faces through function mapping, and meanwhile, the scheme can set the function mapping relation by self to realize a user-defined face image generation scheme.
Meanwhile, the images in image set 1 and image set 2 are input into a plurality of discriminators, which judge the similarity of the images in image set 2 to the real face images in image set 1 in the dimensions of face skin color, face shape and facial-part proportions, and judge whether the images in image set 2 are face images; the selected discriminators comprise a convolutional neural network discriminator and a KNN discriminator. The method further establishes an AHP hierarchical model whose target layer is whether an image from image set 2 is a face image, whose criterion layer comprises face skin color, face shape and facial-part proportions, and whose measure layer comprises the convolutional neural network discriminator and the KNN discriminator; judgment matrices are established for the criterion layer and the measure layer, where each matrix element compares the importance of two indexes; the maximum eigenvalue λ_1 of the criterion-layer judgment matrix and the maximum eigenvalue λ_2 of the measure-layer judgment matrix are computed, along with the consistency index CI = (λ - r)/(r - 1), where r is the order of the judgment matrix, and the consistency ratio CR = CI/RI, where RI is a correction factor; when CR < 0.1 the judgment matrix is considered credible, otherwise it is revised; the influence weights of the measure layer on the target layer are determined layer by layer from the final judgment matrices and used as the discriminator weights, the discriminator with the highest hierarchical weight is selected to judge whether an input image is a face image, and the parameters of the generator network are updated until all generated images are judged to be face images. Compared with the prior art, which has only a single discriminator and requires a loss function for parameter adjustment, this scheme introduces multiple discriminators that judge the face image along dimensions such as skin color, face shape and facial-part proportions, and uses the established hierarchical analysis model to decide whether a generated image is a face image; the judgment flow is simpler and the computation is straightforward.
Drawings
Fig. 1 is a schematic flow chart of a face synthesis method based on multiple discriminators according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a face synthesis apparatus based on multiple discriminators according to an embodiment of the present invention;
the implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Preprocessing real face images to obtain normalized face images as image set 1; constructing a generator network, generating face images with it, and taking the generated face images as image set 2; inputting the images in image set 1 and image set 2 into a plurality of discriminators for judgment; judging whether an input image is a face image according to the AHP (Analytic Hierarchy Process); and adjusting the generator network according to the judgment result, so that face images are generated by the generator network. Fig. 1 is a schematic diagram of a face synthesis method based on multiple discriminators according to an embodiment of the present invention.
In this embodiment, the method for synthesizing a face based on multiple discriminators includes:
s1, acquiring a real face image, preprocessing the acquired real face image to obtain a normalized real face image, and taking the normalized real face image as a data set 1.
Firstly, acquiring real face images and normalizing them so that all face images have a uniform size of M × N pixels; the normalization step includes stretching and rotating the image; in a specific embodiment of the invention, M is 256 and N is 256;
further, the present invention preprocesses the acquired real face images; the preprocessed images are normalized real face images, and all normalized real face images are taken as image set 1. The preprocessing process comprises:
1) Take the maximum of the three color components of each pixel in the real face image and set it as the gray value of that pixel to obtain the gray-scale image of the real face image; the graying formula is:
Gray(i,j)=max{R(i,j),G(i,j),B(i,j)}
wherein:
(i, j) is a pixel point in the real face image;
r (i, j), G (i, j), B (i, j) are the values of the pixel point (i, j) in the R, G, B three color channels respectively;
gray (i, j) is the Gray value of the pixel point (i, j);
2) For the gray-scale image, the gray levels are linearly stretched by piecewise linear transformation:

G(i,j) = 255 × (Gray(i,j) - MIN_Gray) / (MAX_Gray - MIN_Gray)

wherein:
Gray(i,j) is the original gray value of pixel (i,j);
MIN_Gray is the minimum gray value in the gray-scale image;
MAX_Gray is the maximum gray value in the gray-scale image;
G(i,j) is the gray value of pixel (i,j) after stretching.
And S2, constructing a generator network, generating a face image by using the generator network, and taking the generated face image as an image set 2.
For the real face images in image set 1, labeling the image areas of different facial components, constructing an M × N gray matrix Q, and filling the gray value of each pixel into the gray matrix according to the pixel's position, obtaining a number of face-image gray matrices; in one embodiment of the invention, the facial components include the eyes, hair, ears, nose, etc. of a human face;
further, the present invention constructs a generator network; in a specific embodiment, the generator network consists of three convolution layers, a residual layer and a transposed convolution kernel. The convolution kernels in the convolution layers are 3 × 3 pixels; the first convolution layer has 32 filters, the second 64 and the third 128. The residual layer consists of 9 residual blocks, each with stride 1 and size 3 × 3 pixels. The transposed convolution kernel is 3 × 3 pixels and restores the convolved features to the original M × N pixel image size;
inputting the gray level matrix of the face image into a generator network, wherein the process of generating the face image by using the generator network comprises the following steps:
1) The convolution layers receive the face-image gray matrix Q and convolve it as follows:

h_1c = Conv1(Q_c)
h_2c = Conv2(h_1c)
h_3c = Conv3(h_2c)

wherein:
Conv1, Conv2, Conv3 are the three convolution layers of the generator network, and Conv(·) denotes convolution of the input;
Q_c is the image gray matrix of facial component c, the facial components comprising eyes, hair, ears, nose and face shape;
h_1c, h_2c, h_3c are the feature maps of facial component c;
in a specific embodiment of the invention, 3 residual blocks follow each convolution layer, so that the residual of the convolved feature map is 0;
2) Combining the feature maps of different facial components, so that the combined feature map contains all facial components with no component duplicated;
3) Inputting the combined feature map h into a normalization layer, and optimizing the positions of the face components in the feature map by the normalization layer:
SP(h_{n,x,y}) = α_{n,x,y}(c) · (h_{n,x,y} − μ_n) / σ_n + β_{n,x,y}(c_m)
wherein:
c denotes a face component, and c_m denotes the face shape among the face components;

SP(·) denotes normalization of the input feature map;

h_{n,x,y} is the feature-map value at channel n, width position x and height position y;

μ_n is the mean of the feature map over the feature-channel dimension;

σ_n is the standard deviation of the feature map over the feature-channel dimension;

α_{n,x,y}(·) is a mapping function in the normalization layer that maps the facial components other than the face shape to their corresponding positions in the face image;

β_{n,x,y}(·) is a mapping function in the normalization layer that maps the face shape into the face shape of the face image;
4) The output of the normalization layer is input to a transposed convolution kernel, mapping the feature map to an image of M × N pixels.
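The normalization step above can be sketched concisely. The following is a minimal NumPy sketch assuming the SPADE-style form SP(h) = α · (h − μ)/σ + β; the function name `sp_normalize` and the constant α/β maps are illustrative assumptions, not part of the patent:

```python
import numpy as np

def sp_normalize(h, alpha, beta, eps=1e-5):
    """SPADE-style normalization: normalize h (C, H, W) per channel,
    then modulate with spatially varying maps alpha and beta."""
    mu = h.mean(axis=(1, 2), keepdims=True)     # per-channel mean
    sigma = h.std(axis=(1, 2), keepdims=True)   # per-channel std
    return alpha * (h - mu) / (sigma + eps) + beta

# toy combined feature map: 2 channels, 4 x 4
h = np.arange(32, dtype=float).reshape(2, 4, 4)
alpha = np.ones_like(h)    # illustrative modulation maps (in the patent these
beta = np.zeros_like(h)    # would come from the component/face-shape mappings)
out = sp_normalize(h, alpha, beta)
print(out.mean(axis=(1, 2)))  # per-channel means are ~0 after normalization
```

With α = 1 and β = 0 the call reduces to plain per-channel standardization; real modulation maps would shift the component features toward their positions in the face image.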
Further, the present invention takes all face images generated by the generator network as image set 2.
And S3, inputting the images in the image set 1 and the image set 2 into a plurality of discriminators respectively.
Further, the images in image set 1 and image set 2 are respectively input into a plurality of discriminators; the discriminators judge the similarity between the images in image set 2 and the real face images in image set 1 in each dimension (face skin color, face shape and the proportions of each part of the face), and judge whether the images in image set 2 are face images. In one embodiment of the invention, the selected discriminators comprise a convolutional neural network discriminator and a KNN discriminator;
the process of distinguishing the images in the image set 2 by using the convolutional neural network discriminator is as follows:
1) Inputting all images in image set 1, and respectively extracting the color features and shape features of the different face images and the proportion features of the face components by using 4 convolutional layers; in a specific embodiment of the present invention, the convolution kernels in the convolutional layers are 3 × 3 pixels, and the first, second, third and fourth convolutional layers have 32, 64, 128 and 256 filters respectively;
2) Inputting the generated images in image set 2, and respectively extracting the color feature, shape feature and face-component proportion feature of each generated image by using the 4 convolutional layers; for any generated image, calculating the maximum similarity between each of these features and the corresponding dimension feature in image set 1, and taking the maximum similarities of the color feature, the shape feature and the face-component proportion feature as the degrees to which the generated image meets the standard of a real face image in the dimensions of face skin color, face shape and the proportions of each part of the face;
3) Calculating the similarity between the feature map of each generated image in image set 2 and the feature maps of the real face images in image set 1; for a generated image x, the maximum similarity value over the real face image set Y = {y_1, y_2, ...} is the probability that the generated image x is a face image.
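The maximum-similarity scoring in steps 2) and 3) can be illustrated with plain feature vectors. This is a hedged sketch: cosine similarity and the toy 3-dimensional vectors are assumptions for illustration, since the patent does not fix a particular similarity measure:

```python
import numpy as np

def max_similarity(gen_feat, real_feats):
    """Maximum cosine similarity between a generated image's feature
    vector and the feature vectors of the real face images."""
    gen = gen_feat / np.linalg.norm(gen_feat)
    return max(float(gen @ (r / np.linalg.norm(r))) for r in real_feats)

# toy feature vectors standing in for flattened CNN feature maps
real_feats = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
gen_feat = np.array([0.9, 0.1, 0.0])
score = max_similarity(gen_feat, real_feats)
print(score > 0.99)  # generated features lie close to the first real vector
```

The same routine would be applied separately to the color, shape and proportion features to obtain the per-dimension degrees described above.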
The process of distinguishing the images in the image set 2 by using the KNN discriminator is as follows:
calculating the difference between the pixel matrix of a generated image x in image set 2 and the pixel matrices of all real face images in image set 1:

d_k = ||Z_x − Z_k||_1

wherein:

Z_x is the pixel matrix of the generated image x in image set 2;

Z_k is the pixel matrix of the kth real face image in image set 1;

d_k is the difference between the pixel matrices; the pixel matrices comprise a color-value pixel matrix, a face-shape pixel matrix, a face-component pixel matrix and a face-image pixel matrix, and the difference is computed as the Manhattan distance;

normalizing the N calculated difference values, and taking 1 − d_min both as the degree to which the generated image meets the standard of a real face image in the dimensions of face skin color, face shape and the proportions of each part of the face, and as the probability that the generated image x is a face image, wherein d_min denotes the minimum normalized difference between the generated image and the real images over the different pixel matrices.
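The KNN discriminator's scoring can be sketched as follows, assuming max-normalization of the distances (the patent states only that the N differences are normalized); `knn_score` and the toy 2 × 2 pixel matrices are illustrative:

```python
import numpy as np

def knn_score(gen_pixels, real_pixel_list):
    """Manhattan distances between the generated image's pixel matrix
    and each real image's pixel matrix; distances are normalized and
    1 - d_min is returned (larger = closer to a real face)."""
    dists = np.array([np.abs(gen_pixels - r).sum() for r in real_pixel_list])
    d_norm = dists / dists.max()   # assumed normalization scheme
    return 1.0 - d_norm.min()

gen = np.array([[0.0, 1.0], [2.0, 3.0]])
reals = [np.array([[0.0, 1.0], [2.0, 4.0]]),   # Manhattan distance 1
         np.array([[5.0, 5.0], [5.0, 5.0]])]   # Manhattan distance 14
score = knn_score(gen, reals)
print(score)  # 1 - 1/14
```

In the full method this score would be computed per pixel-matrix type (color values, face shape, face components, whole image) and the minimum normalized distance taken across them.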
And S4, weighting the discriminators according to the AHP (Analytic Hierarchy Process), selecting the discriminator with the highest hierarchical weight according to the hierarchical weights of the discriminators to judge whether the input image is a face image, adjusting the generator network according to the judgment result, and generating face images by using the adjusted optimal generator network.
Furthermore, the invention assigns weights to the discriminators according to the AHP analytic hierarchy process, which proceeds as follows:
1) Establishing an AHP hierarchical model structure, wherein a target layer of the established AHP hierarchical model structure is whether an image of an input image set 2 is a face image, a criterion layer is the skin color of the face, the shape of the face and the proportion of each part of the face, and a measure layer is a convolutional neural network discriminator and a KNN discriminator;
2) Establishing a judgment matrix for the criterion layer and the measure layer, wherein the elements in the matrix are comparisons of the importance of pairs of indexes; in one embodiment of the present invention, for the criterion layer, the face skin color, the face shape and the proportions of each part of the face are denoted a_1, a_2, a_3, and the established judgment matrix is:

A = (a_ij), i, j = 1, 2, 3, with a_ii = 1 and a_ji = 1/a_ij

wherein:

a_ij denotes the importance of index a_i relative to index a_j; a higher value indicates that a_i has a greater degree of influence on the target layer;

3) Respectively calculating the maximum characteristic root λ_1 of the criterion-layer judgment matrix and the maximum characteristic root λ_2 of the measure-layer judgment matrix, and respectively calculating the consistency index CI = (λ − r)/(r − 1), wherein r is the order of the judgment matrix; the consistency index is corrected as CR = CI/RI, wherein RI is a correction factor; when CR < 0.1, the judgment matrix is considered credible, otherwise the judgment matrix needs to be modified;
4) And determining, layer by layer according to the final judgment matrices, the influence weight of the measure layer on the target layer, and taking the influence weights as the weights of the discriminators.
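Steps 1) to 4) above can be sketched with the principal-eigenvector method. The `ahp_weights` helper and the example judgment matrix are assumptions for illustration; the RI table holds the standard random consistency indices:

```python
import numpy as np

RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}  # standard random indices

def ahp_weights(A):
    """Weights from the principal eigenvector of a pairwise judgment
    matrix A, plus the consistency ratio CR from step 3)."""
    vals, vecs = np.linalg.eig(A)
    k = int(np.argmax(vals.real))
    lam = vals.real[k]                  # maximum characteristic root
    w = np.abs(vecs[:, k].real)
    w = w / w.sum()                     # normalized influence weights
    r = A.shape[0]
    CI = (lam - r) / (r - 1)            # consistency index
    CR = CI / RI[r] if RI[r] > 0 else 0.0
    return w, CR

# illustrative criterion-layer matrix: skin color vs. face shape vs. proportions
A = np.array([[1.0,  2.0, 4.0],
              [0.5,  1.0, 2.0],
              [0.25, 0.5, 1.0]])
w, CR = ahp_weights(A)
print(w.round(3), CR < 0.1)  # weights ~[0.571, 0.286, 0.143], consistent
```

The weight vector for the measure layer would be obtained the same way, and the product of the two levels' weights gives each discriminator's final weight.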
Selecting a discriminator with the highest hierarchical weight to judge whether the input image is a human face image;
updating and adjusting parameters in the generator network until all generated images are judged to be face images, and performing face synthesis by using the final generator network.
The following describes embodiments of the present invention through an algorithmic experiment and a test of the proposed processing method. The hardware test environment of the algorithm of the invention is: Intel(R) Core(TM) i7-6700K CPU; the software is Matlab 2018b. The comparison methods are an RNN-based face synthesis method and a GAN-based face synthesis method.
In the algorithm experiment, face images are generated by each algorithm model, and the accuracy of face image generation is used as the evaluation index of algorithm feasibility: the higher the accuracy of face image generation, the higher the effectiveness and feasibility of the algorithm.
According to the experimental results, the accuracy of face image generation is 77.6% for the RNN-based face synthesis method, 84.5% for the GAN-based face synthesis method, and 89.3% for the multi-discriminator method of the present invention.
The invention also provides a face synthesis device based on the multi-discriminator. Fig. 2 is a schematic diagram of an internal structure of a face synthesis apparatus based on multiple classifiers according to an embodiment of the present invention.
In the present embodiment, the multi-discriminator-based face synthesis apparatus 1 includes at least an image acquisition device 11, an image processor 12, a face image synthesis device 13, a communication bus 14, and a network interface 15.
The image capturing device 11 may be a PC (Personal Computer), a terminal device such as a smart phone, a tablet Computer, a portable Computer, or a camera, or may be a server.
Image processor 12 includes at least one type of readable storage medium including flash memory, a hard disk, a multi-media card, a card-type memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, and the like. The image processor 12 may in some embodiments be an internal storage unit of the multi-discriminator based face synthesis apparatus 1, for example a hard disk of the multi-discriminator based face synthesis apparatus 1. The image processor 12 may also be an external storage device of the multi-discriminator based face synthesis apparatus 1 in other embodiments, such as a plug-in hard disk provided on the multi-discriminator based face synthesis apparatus 1, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like. Further, the image processor 12 may also include both an internal storage unit and an external storage device of the multi-discriminator based face synthesis apparatus 1. The image processor 12 can be used not only to store application software installed in the multi-discriminator based face synthesis apparatus 1 and various types of data, but also to temporarily store data that has been output or is to be output.
The face image synthesizing device 13 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor or other data Processing chip in some embodiments, and includes a monitoring Unit for running program codes or Processing data stored in the image processor 12, such as the face image synthesizing program instructions 16.
The communication bus 14 is used to enable connection communication between these components.
The network interface 15 may optionally comprise a standard wired interface, a wireless interface (e.g. WI-FI interface), typically used for establishing a communication connection between the multi-discriminator based face synthesis apparatus 1 and other electronic devices.
Optionally, the multi-discriminator based face synthesis apparatus 1 may further include a user interface, the user interface may include a Display (Display), an input unit such as a Keyboard (Keyboard), and the optional user interface may also include a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the multi-discriminator based face synthesis apparatus 1 and for displaying a visualized user interface.
While Fig. 2 only shows the multi-discriminator based face synthesis apparatus 1 with components 11-15, it will be appreciated by those skilled in the art that the structure shown in Fig. 2 does not constitute a limitation of the multi-discriminator based face synthesis apparatus 1, which may include fewer or more components than shown, combine some components, or arrange the components differently.
In the embodiment of the multi-discriminator based face synthesis apparatus 1 shown in fig. 2, face image synthesis program instructions 16 are stored in the image processor 12; the steps of the face image synthesis device 13 executing the face image synthesis program instructions 16 stored in the image processor 12 are the same as the implementation method of the multi-discriminator-based face synthesis method, and are not described here.
Furthermore, an embodiment of the present invention provides a computer-readable storage medium, on which facial image synthesis program instructions are stored, where the facial image synthesis program instructions are executable by one or more processors to implement the following operations:
acquiring a real face image, preprocessing the acquired real face image to obtain a normalized real face image, and taking the normalized real face image as a data set 1;
constructing a generator network, generating a face image by using the generator network, and taking the generated face image as an image set 2;
inputting the images in the image set 1 and the image set 2 into a plurality of discriminators respectively;
and weighting the discriminators according to the AHP (Analytic Hierarchy Process), selecting the discriminator with the highest hierarchical weight according to the hierarchical weights of the discriminators to judge whether the input image is a face image, adjusting the generator network according to the judgment result, and generating face images by using the adjusted optimal generator network.
It should be noted that the above-mentioned numbers of the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments. And the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element described by the phrase "comprising" does not exclude the presence of other identical elements in a process, apparatus, article, or method which comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (4)

1. A face synthesis method based on multiple discriminators is characterized by comprising the following steps:
s1, acquiring a real face image, preprocessing the acquired real face image to obtain a normalized real face image, and taking the normalized real face image as a data set 1;
s2, constructing a generator network, generating a face image by using the generator network, and taking the generated face image as an image set 2;
s3, respectively inputting the images in the image set 1 and the image set 2 into a plurality of discriminators;
s4, weighting the discriminators according to an AHP analytic hierarchy process, selecting the discriminator with the highest hierarchical weight to judge whether the input image is the face image or not according to the hierarchical weight of the discriminators, adjusting the generator network according to the judgment result, and generating the face image by using the adjusted optimal generator network;
the method for weighting the discriminators according to the AHP analytic hierarchy process and selecting the discriminator with the highest hierarchical weight to judge whether the input image is the face image or not according to the hierarchical weight of the discriminators comprises the following steps:
s41, establishing an AHP hierarchical model structure, wherein a target layer of the AHP hierarchical model structure is whether an input image set 2 image is a face image, a criterion layer is face skin color, face shape and proportions of all parts of the face, and a measure layer is a convolutional neural network discriminator and a KNN discriminator;
s42, establishing a judgment matrix for the criterion layer and the measure layer, wherein the elements in the matrix are comparisons of the importance of pairs of indexes; establishing a judgment matrix for the criterion layer, wherein the face skin color, the face shape and the proportions of each part of the face are respectively a_1, a_2, a_3, and the established judgment matrix is:

A = (a_ij), i, j = 1, 2, 3, with a_ii = 1 and a_ji = 1/a_ij
wherein:
a_ij denotes the importance of index a_i relative to index a_j; a higher value indicates that a_i has a greater degree of influence on the target layer;
s43, respectively calculating the maximum characteristic root λ_1 of the criterion-layer judgment matrix and the maximum characteristic root λ_2 of the measure-layer judgment matrix, and respectively calculating the consistency index CI = (λ − r)/(r − 1), wherein r is the order of the judgment matrix; correcting the consistency index as CR = CI/RI, wherein RI is a correction factor; when CR < 0.1, the judgment matrix is considered credible, otherwise the judgment matrix needs to be modified;
s44, determining the influence weight of the measure layer on the target result layer by layer according to the final judgment matrix, and taking the influence weight as the empowerment of the discriminator;
the process of discriminating the image in the image set 2 by using the convolutional neural network discriminator comprises the following steps:
a1, inputting all images in an image set 1, and respectively extracting color features and shape features of different face images and proportion features of face components by using 4 layers of convolution layers;
a2, inputting the generated images in the image set 2, respectively extracting the color feature, the shape feature and the proportion feature of the face component of each generated image by using 4 layers of convolution layers, calculating the maximum similarity between the color feature, the shape feature and the proportion feature of the face component of each generated image and the corresponding dimension feature in the image set 1 for any generated image, and taking the maximum similarity of the color feature, the shape feature and the proportion feature of the face component as the degree of the generated image meeting the standard of a real face image in the dimensions of face skin color, face shape and proportion of each part of the face;
a3, calculating the similarity between the feature map of each generated image in image set 2 and the feature maps of the real face images in image set 1; for a generated image x, the maximum similarity value over the real face image set Y = {y_1, y_2, ...} is the probability that the generated image x is a face image;
the process of discriminating the images in the image set 2 by using the KNN discriminator comprises the following steps:
b1, calculating the difference between the pixel matrix of the generated image x in image set 2 and the pixel matrices of all real face images in image set 1:

d_k = ||Z_x − Z_k||_1

wherein:

Z_x is the pixel matrix of the generated image x in image set 2;

Z_k is the pixel matrix of the kth real face image in image set 1;

d_k is the difference between the pixel matrices; the pixel matrices comprise a color-value pixel matrix, a face-shape pixel matrix, a face-component pixel matrix and a face-image pixel matrix, and the difference is computed as the Manhattan distance;

b2, normalizing the N calculated difference values, and taking 1 − d_min as the degree to which the generated image meets the standard of a real face image in the dimensions of face skin color, face shape and the proportions of each part of the face, and as the probability that the generated image x is a face image, wherein d_min denotes the minimum normalized difference between the generated image and the real images over the different pixel matrices;
and S5, selecting a discriminator with the highest hierarchical weight to judge whether the input image is a face image.
2. The method for synthesizing a human face based on multiple discriminators as claimed in claim 1, wherein the pre-processing of the acquired real human face image comprises:
acquiring a real face image, and normalizing the acquired real face image so that all face images have a uniform image size, wherein the normalized face image size is M × N pixels; the image normalization comprises stretching and rotating the image;
preprocessing the acquired real face image, wherein the preprocessed real face image is a normalized real face image, all normalized real face images are used as a data set 1, and the image preprocessing process comprises the following steps:
1) Solving the maximum value of the three color components of each pixel in the real face image, and setting that maximum value as the gray value of the pixel point to obtain the gray image of the real face image, wherein the formula of the gray processing is:

Q(x, y) = max{R(x, y), G(x, y), B(x, y)}

wherein:

(x, y) is a pixel point in the real face image;

R(x, y), G(x, y), B(x, y) are respectively the values of pixel point (x, y) in the R, G, B color channels;

Q(x, y) is the gray value of pixel point (x, y);
2) For the gray image, the gray scale of the image is linearly stretched using a piecewise linear transformation, with the formula:

Q'(x, y) = 255 × (Q(x, y) − Q_min) / (Q_max − Q_min)

wherein:

Q(x, y) is the original gray value of pixel point (x, y);

Q_min is the minimum gray value in the gray image;

Q_max is the maximum gray value in the gray image;

Q'(x, y) is the gray value of pixel point (x, y) after gray stretching.
3. The method as claimed in claim 2, wherein the constructing a generator network for generating the face image comprises:
labeling the image regions of the different face components in the real face image, and constructing M × N gray matrices; filling the gray value of each pixel point in the real face image into the gray matrix according to the position of the pixel point to obtain a plurality of face image gray matrices;
inputting the gray level matrix of the face image into a generator network, wherein the process of generating the face image by using the generator network comprises the following steps:
1) The convolutional layers receive the face image gray matrix and perform convolution processing on the face image gray matrix Q, the flow of the convolution processing being:

h_{1c} = Conv1(Q_c)

h_{2c} = Conv2(h_{1c})

h_{3c} = Conv3(h_{2c})

wherein:

Conv1, Conv2, Conv3 are the three convolutional layers of the generator network, and Conv(·) denotes convolution of the input value;

Q_c is the image gray matrix of facial component c, the facial components comprising eyes, hair, ears, nose and face shape;

h_{1c}, h_{2c}, h_{3c} are feature maps of face component c;
2) Combining feature maps of different face components, wherein the combined feature map comprises all face components and does not have the same face components;
3) Inputting the combined feature map h into a normalization layer, the normalization layer optimizing the positions of the face components in the feature map:

SP(h_{n,x,y}) = α_{n,x,y}(c) · (h_{n,x,y} − μ_n) / σ_n + β_{n,x,y}(c_m)

wherein:

c denotes a face component, and c_m denotes the face shape among the face components;

SP(·) denotes normalization of the input feature map;

h_{n,x,y} is the feature-map value at channel n, width position x and height position y;

μ_n is the mean of the feature map over the feature-channel dimension;

σ_n is the standard deviation of the feature map over the feature-channel dimension;

α_{n,x,y}(·) is a mapping function in the normalization layer that maps the facial components other than the face shape to their corresponding positions in the face image;

β_{n,x,y}(·) is a mapping function in the normalization layer that maps the face shape into the face shape of the face image;

4) The output of the normalization layer is input to a transposed convolution kernel, thereby mapping the feature map to an image of M × N pixels.
4. A computer-readable storage medium having face image synthesis program instructions stored thereon, the program instructions being executable by one or more processors to implement the multi-discriminator based face synthesis method of claim 1.
CN202110994564.1A 2021-08-27 2021-08-27 Face synthesis method and device based on multiple discriminators Active CN113658088B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110994564.1A CN113658088B (en) 2021-08-27 2021-08-27 Face synthesis method and device based on multiple discriminators

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110994564.1A CN113658088B (en) 2021-08-27 2021-08-27 Face synthesis method and device based on multiple discriminators

Publications (2)

Publication Number Publication Date
CN113658088A CN113658088A (en) 2021-11-16
CN113658088B true CN113658088B (en) 2022-12-02

Family

ID=78493076

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110994564.1A Active CN113658088B (en) 2021-08-27 2021-08-27 Face synthesis method and device based on multiple discriminators

Country Status (1)

Country Link
CN (1) CN113658088B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112257677A (en) * 2020-11-19 2021-01-22 汪金玲 Method and device for processing deep learning task in big data cluster
CN112561786A (en) * 2020-12-22 2021-03-26 作业帮教育科技(北京)有限公司 Online live broadcast method and device based on image cartoonization and electronic equipment
CN113220114A (en) * 2021-01-22 2021-08-06 华南理工大学 Embedded non-contact elevator key interaction method integrating face recognition

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN106778538A (en) * 2016-11-28 2017-05-31 上海工程技术大学 Intelligent driving behavior evaluation method based on analytic hierarchy process (AHP)
CN109447894A (en) * 2018-08-20 2019-03-08 广州市久邦数码科技有限公司 A kind of image processing method and its system based on data analysis
CN111476717B (en) * 2020-04-07 2023-03-24 西安电子科技大学 Face image super-resolution reconstruction method based on self-attention generation countermeasure network

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN112257677A (en) * 2020-11-19 2021-01-22 汪金玲 Method and device for processing deep learning task in big data cluster
CN112561786A (en) * 2020-12-22 2021-03-26 作业帮教育科技(北京)有限公司 Online live broadcast method and device based on image cartoonization and electronic equipment
CN113220114A (en) * 2021-01-22 2021-08-06 华南理工大学 Embedded non-contact elevator key interaction method integrating face recognition

Non-Patent Citations (2)

Title
"Research on Gaze Estimation Methods Based on Large-scale Synthesis of Human Eye Images"; Zhao Tongtong; Doc88 (道客巴巴); 2020-06-11; pp. 14, 18, 43, 51, 63 *
"Design and Implementation of a University Teacher Employment-Period Assessment System"; Shao Yuchun; China Master's Theses Full-text Database, Information Science and Technology; 2019-09-15 (No. 09); pp. 13, 15-18 *

Also Published As

Publication number Publication date
CN113658088A (en) 2021-11-16

Similar Documents

Publication Publication Date Title
Wang et al. Dynamic selection network for image inpainting
CA2934514C (en) System and method for identifying faces in unconstrained media
KR101183391B1 (en) Image comparison by metric embeddings
WO2022078041A1 (en) Occlusion detection model training method and facial image beautification method
CN108765261B (en) Image transformation method and device, electronic equipment and computer storage medium
Tuzel et al. Global-local face upsampling network
CN108229419A (en) For clustering the method and apparatus of image
WO2020098250A1 (en) Character recognition method, server, and computer readable storage medium
CN111383232B (en) Matting method, matting device, terminal equipment and computer readable storage medium
US11887270B2 (en) Multi-scale transformer for image analysis
CN112085094B (en) Document image reproduction detection method, device, computer equipment and storage medium
CN111339897A (en) Living body identification method, living body identification device, computer equipment and storage medium
CN112200147A (en) Face recognition method and device, computer equipment and storage medium
CN113378812A (en) Digital dial plate identification method based on Mask R-CNN and CRNN
US20120082349A1 (en) Method and apparatus for image generation
CN111985454A (en) Face recognition method, device, equipment and computer readable storage medium
CN110826534A (en) Face key point detection method and system based on local principal component analysis
CN110008922A (en) Image processing method, unit, medium for terminal device
CN116310462B (en) Image clustering method and device based on rank constraint self-expression
Chen et al. Face super resolution based on parent patch prior for VLQ scenarios
CN113658088B (en) Face synthesis method and device based on multiple discriminators
CN114841851A (en) Image generation method, image generation device, electronic equipment and storage medium
Wang et al. A study of convolutional sparse feature learning for human age estimate
CN112818820A (en) Image generation model training method, image generation device and electronic equipment
Amelia Age Estimation on Human Face Image Using Support Vector Regression and Texture-Based Features

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant