CN113436061B

CN113436061B - Face image reconstruction method and system

Info

Publication number: CN113436061B
Application number: CN202110748179.9A
Authority: CN
Inventors: 李琦; 单彩峰; 王卫宁; 胡力娟; 王海滨
Original assignee: Cas Artificial Intelligence Research Qingdao Co ltd
Current assignee: Cas Artificial Intelligence Research Qingdao Co ltd
Priority date: 2021-07-01
Filing date: 2021-07-01
Publication date: 2022-08-09
Anticipated expiration: 2041-07-01
Also published as: CN113436061A

Abstract

The invention provides a face image reconstruction method and a face image reconstruction system, which belong to the technical field of computer vision and are used for acquiring a face image to be processed; acquiring a target attribute feature difference vector corresponding to a face image to be processed; obtaining a face reconstruction image with target attribute characteristics for the face image to be processed and the corresponding target attribute characteristic difference vector by using a face reconstruction model which is constructed in advance; the pre-constructed face reconstruction model is obtained by training a training set, wherein the training set comprises a plurality of real face images with different resolutions and corresponding target attribute feature difference vectors. The invention realizes the multi-attribute face editing function under high-definition pixels; the face feature detail information is reserved, the attribute editing effect is improved, the resolution ratio of image generation is improved, the image is clearer, the more abundant detail features of face attribute editing under the requirement of high resolution ratio are guaranteed, and attribute conversion is successfully realized.

Description

Face image reconstruction method and system

Technical Field

The invention relates to the technical field of computer vision, in particular to a face image reconstruction method and a face image reconstruction system.

Background

The human face attribute editing is an important technical branch of image and video generation in the computer vision field, and aims to generate a vivid human face image with specified characteristic information on the basis of a given human face image and a multi-attribute characteristic vector difference value on the premise of keeping irrelevant information of a human face target unchanged.

At present, most of the conventional face attribute editing technologies use models based on GANs (resist generation Networks — generic adaptive Networks, GANs) to synthesize corresponding attribute edits, and the major defects of the editing technologies are as follows: firstly, the method mainly focuses on low-quality large-area local feature editing (hair color, Liuhai hairstyle and the like), and the model is difficult to achieve stronger detail feature retention and editing capacity required by high-definition face attribute editing on the model, so that noise and distortion exist in the generated image, and the quality and the fidelity of the generated image are reduced; secondly, the attention degree of the whole style editing is low, such as makeup, age and the like, and because the style attribute requires to modify finer local areas (eye circumference, forehead, mouth and the like) and global features (skin texture, color and the like) at the same time, the existing unsupervised model lacks attention and editing to detailed information, so that the editing effect is fuzzy or the editing degree is too small.

In summary, the above two defects result in that the algorithm cannot be optimized, training against the generation network is broken, or the edited face attribute does not have sufficient detail transformation, so that the attribute editing framework can only process the face replacement task at 256 × 256 resolution.

Disclosure of Invention

The invention aims to provide a face image generation method and a face image generation system which can realize a 512 x 512 multi-attribute face editing task under high-definition pixels and acquire a high-definition face image, so as to solve at least one technical problem in the background technology.

In order to achieve the purpose, the invention adopts the following technical scheme:

in one aspect, the present invention provides a face image reconstruction method, including:

acquiring a face image to be processed;

acquiring a target attribute feature difference vector corresponding to a face image to be processed, wherein the difference vector is the difference between a target attribute vector and an original attribute vector;

obtaining a face reconstruction image with target attribute characteristics for the face image to be processed and the corresponding target attribute characteristic difference vector by using a face reconstruction model which is constructed in advance; the pre-constructed face reconstruction model is obtained by training a training set, wherein the training set comprises a plurality of real face images with different resolutions and corresponding target attribute feature difference vectors.

Preferably, the obtaining of the face reconstruction model by training of the training set includes:

acquiring a plurality of real face images with different resolutions;

acquiring a target attribute feature difference vector corresponding to a real face image;

based on a dual spatial domain attention mechanism of a spatial domain product attention mechanism and a spatial domain face segmentation semantic information injection mechanism, a face reconstructor (a structure with generation function in a countermeasure generation network) generates real synthetic images with different resolutions of target attribute features according to input real face images and corresponding target attribute feature difference vectors;

calculating a loss value of a loss function according to real images with different resolutions and corresponding real synthetic images based on a multi-level face discriminator (a structure with discrimination in a countermeasure generation network);

and iteratively adjusting the weights of the face reconstructor and the face discriminator by utilizing a multi-level loss gradient back propagation algorithm until convergence according to the loss value to obtain a face reconstruction model.

Preferably, obtaining real synthetic images with different resolutions of the target attribute features comprises:

and connecting the coded real images with different resolutions and target attribute feature difference vectors by using a residual module and an attention mechanism, injecting the connected real images into a decoder, introducing the spatial domain semantic segmentation facial information into an upper sampling layer of the decoder, and decoding and outputting real synthetic images with different resolutions by using a feature extraction excitation mechanism.

Preferably, the loss values include: the method comprises the steps of generating a multilevel original attribute face reconstruction loss value and a multilevel attribute classification loss value by using a multilevel original attribute countermeasure loss value, a multilevel target attribute countermeasure loss value, a multilevel original attribute face reconstruction loss value, a multilevel target attribute generation original attribute face reconstruction loss value and a multilevel attribute classification loss value.

Preferably, the first and second liquid crystal materials are,

per-layer original attribute penalty-combating value

Comprises the following steps:

per-layer target attribute penalty against value

Comprises the following steps:

loss value L of face reconstruction of each layer of original attribute _pix Comprises the following steps:

L _pix ＝||x ^a -G(x ^a ,a-a)|| ₁ ；

generating original attribute face reconstruction loss value L by each layer of target attribute _cyc Comprises the following steps:

then, the overall face reconstruction loss value L _rec ：

Attribute classification loss value minL for discriminators _clsr Comprises the following steps:

attribute classification loss value minL for discriminators _clsf Comprises the following steps:

wherein, x represents the original face image,

representing the reconstructed image of the face of a person,

representing the sampled image between the real image and the generated image, | · | | luminance ₁ Indicating the calculated absolute distance, x ^a Representing real images with attribute features a, c ^diff Representing a feature difference vector of the object attribute, D ⁱ Denotes the ith discriminator, G denotes the generator, E _x Indicating the calculation of the expected value, D, sampled by the input image _adv (. cndot.) represents the penalty function of the arbiter,

indicating the expected value, λ, of the calculation to generate the image as a sample _gp The parameters of the constraint of the gradient are represented,

represents a gradient operator, | · | | luminance ₂ The expression of the euclidean distance,

the representation calculation is based on the expectation between the input image and the feature difference vector,

representing the generated image with the target attribute, c represents the target attribute vector,

representing the expectation of computing samples of the original attribute image and the original attribute vector,

representing the expectation of computing a sample of the original attribute image and the target attribute vector, D _cls Representing the classifier in the discriminator.

Preferably, iteratively adjusting the weights of the face reconstructor and the face discriminator by using a multilevel gradient back propagation algorithm until convergence comprises:

calculating the total loss value L of the discriminator at each layer _D ：

For each layer output of the generator, a loss value L is calculated _G Comprises the following steps:

calculating total loss value on generator

Comprises the following steps:

wherein λ is _clsr 、λ _clsf 、λ _rec 、λ _g Respectively, the loss value L _clsr 、L _clsf 、L _rec 、L _G The weight of (c);

total loss value L by discriminator _D Total loss value on sum generator

Performing iterative optimization on the minimum target; and continuously updating the weights of the discriminator and the generator by carrying out gradient back propagation on the discriminator and the generator of different layers until convergence.

In a second aspect, the present invention provides a face image reconstruction system, including:

the first acquisition module is used for acquiring a face image to be processed;

the second acquisition module is used for acquiring a target attribute feature difference vector corresponding to the face image to be processed;

the reconstruction module is used for utilizing a pre-constructed human face reconstruction model to obtain a human face reconstruction image with target attribute characteristics for the human face image to be processed and the corresponding target attribute characteristic difference vector; the pre-constructed face reconstruction model is obtained by training a training set, wherein the training set comprises a plurality of real face images with different resolutions and corresponding target attribute feature difference vectors.

In a third aspect, the present invention provides a non-transitory computer readable storage medium for storing computer instructions which, when executed by a processor, implement the instructions of the face image reconstruction method as described above.

In a fourth aspect, the invention provides a computer program (product) comprising a computer program for implementing the method for face image reconstruction as described above when run on one or more processors.

In a fifth aspect, the present invention provides an electronic device, comprising: a processor, a memory, and a computer program; wherein a processor is connected to the memory, the computer program is stored in the memory, and when the electronic device runs, the processor executes the computer program stored in the memory, so as to make the electronic device execute the facial image reconstruction method as described above.

The invention has the beneficial effects that:

the multi-attribute face editing function under high-definition pixels is realized; the face feature detail information is reserved, the attribute editing effect is improved, the resolution ratio of image generation is improved, the image is clearer, the more abundant detail features of face attribute editing under the requirement of high resolution ratio are guaranteed, and attribute conversion is successfully realized.

Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.

Fig. 1 is a flowchart of a method for generating a high-definition face multi-attribute editing countermeasure according to embodiment 3 of the present invention.

Fig. 2 is a flowchart of a method for generating a high-definition face multi-attribute editing countermeasure according to embodiment 4 of the present invention.

Fig. 3 is a flow chart of a connected layer airspace attention mechanism of the high-definition face multi-attribute editing confrontation generation method according to embodiment 4 of the present invention.

Fig. 4 is a flowchart of introducing spatial semantic information between decoders in the high-definition face multi-attribute editing countermeasure generation method according to embodiment 4 of the present invention.

Fig. 5 is a flowchart of transmitting semantic information and output connection of a previous layer decoder to a next layer decoder by using a compression excitation mechanism according to embodiment 4 of the present invention.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below by way of the drawings are illustrative only and are not to be construed as limiting the invention.

It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

As used herein, the singular forms "a", "an", "the" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.

For the purpose of facilitating an understanding of the present invention, the present invention will be further explained by way of specific embodiments with reference to the accompanying drawings, which are not intended to limit the present invention.

It should be understood by those skilled in the art that the drawings are merely schematic representations of embodiments and that the elements shown in the drawings are not necessarily required to practice the invention.

Example 1

An embodiment 1 of the present invention provides a face image reconstruction system, including:

In this embodiment 1, the above facial image reconstruction system is used to implement a facial image reconstruction method, and the method includes:

acquiring a face image to be processed by using a first acquisition module;

acquiring a target attribute feature difference vector corresponding to the face image to be processed by using a second acquisition module;

using a reconstruction module, and obtaining a face reconstruction image with target attribute characteristics for the face image to be processed and the corresponding target attribute characteristic difference vector by using a pre-constructed face reconstruction model; the pre-constructed face reconstruction model is obtained by training a training set, wherein the training set comprises a plurality of real face images with different resolutions and corresponding target attribute feature difference vectors.

In this embodiment 1, the training of the training set to obtain the face reconstruction model includes:

acquiring a plurality of real face images with different resolutions;

a face reconstructor based on a dual spatial domain attention mechanism obtains real synthetic images with different resolutions of target attribute characteristics according to an input real face image and a corresponding target attribute characteristic difference vector;

calculating a loss value of a loss function according to real images with different resolutions and corresponding real synthetic images based on a multi-level face discriminator;

and iteratively adjusting the weights of the face reconstructor and the face discriminator by using a multi-level loss gradient back propagation algorithm until convergence according to the loss value to obtain a face reconstruction model.

In this embodiment 1, obtaining real synthetic images with different resolutions and target attribute features includes: and connecting the coded real images with different resolutions and target attribute feature difference vectors by using a residual module and an attention mechanism, injecting the connected real images into a decoder, introducing the spatial domain semantic segmentation facial information into an upper sampling layer of the decoder, and decoding and outputting real synthetic images with different resolutions by using a feature extraction excitation mechanism.

In this embodiment 1, the loss values include: the method comprises the steps of generating a multilevel original attribute face reconstruction loss value and a multilevel attribute classification loss value by using a multilevel original attribute countermeasure loss value, a multilevel target attribute countermeasure loss value, a multilevel original attribute face reconstruction loss value, a multilevel target attribute generation original attribute face reconstruction loss value and a multilevel attribute classification loss value.

Wherein,

per-layer original attribute penalty-combating value

Comprises the following steps:

per-layer target attribute penalty against value

Comprises the following steps:

L _pix ＝||x ^a -G(x ^a ,a-a)|| ₁ ；

then, the overall face reconstruction loss value L _rec ：

wherein, x represents the original face image,

representing the reconstructed image of the face of a person,

sample map representing between real image and generated imageLike | · | luminance ₁ Indicating the calculated absolute distance, x ^a Representing real images with attribute features a, c ^diff Representing a feature difference vector of the object attribute, D ⁱ Denotes the ith discriminator, G denotes the generator, E _x Indicating the calculation of the expected value, D, sampled by the input image _adv (. cndot.) represents the penalty function of the arbiter,

representing the expectation of computing a sample of the original attribute image and the original attribute vector,

In this embodiment 1, iteratively adjusting the weights of the face reconstructor and the face discriminator by using a multi-level loss gradient back propagation algorithm until convergence includes:

calculating the total loss value L of the discriminator at each layer _D ：

calculating total loss value on generator

Comprises the following steps:

total loss value L by discriminator _D Total loss value on sum generator

Example 2

In embodiment 2 of the present invention, a method for generating a high-definition face multi-attribute editing countermeasure is provided, so as to implement a multi-attribute face editing task under high-definition pixels (512 × 512). The generation method comprises the following steps:

acquiring a plurality of pairs of real face images and target attribute feature difference vectors;

a face generator based on a dual spatial domain attention mechanism obtains a synthetic image according to an input real face image and a corresponding target attribute feature difference vector;

and calculating the loss value of the loss function according to the real images with different resolutions and the corresponding synthetic images based on the multi-level face discriminator.

Iteratively adjusting the weights of the face generator and the face discriminator by utilizing a multilevel loss gradient back propagation algorithm until convergence according to the loss value to obtain a current face generator;

and based on the current face generator, obtaining a face image with target attribute characteristics according to the face image to be processed and the corresponding target attribute characteristic difference vector.

In this embodiment 2, the obtaining of the synthetic image by the face generator based on the dual spatial domain attention mechanism according to the real face image and the target attribute feature difference vector specifically includes:

based on an airspace attention mechanism, splicing the target attribute feature difference vectors along the channel dimension and the multi-level vectors output by the encoder passing through the selective residual error module one by one, and injecting the spliced target attribute feature difference vectors into a decoder;

a decoder based on a semantic segmentation supervision mechanism fuses real face images and corresponding target attribute feature vectors by using a compression excitation mechanism to obtain synthetic images with different resolutions of target attribute features.

In this embodiment 2, the loss value of the face feature encoder of the loss function includes:

the method comprises the steps of obtaining a confrontation loss value of a multi-level original attribute and target attribute face image, a multi-level original attribute face reconstruction loss value, and a multi-level target attribute generation original attribute face reconstruction loss value and a multi-level attribute classification loss value.

In this example 2, the loss value of the loss function is calculated according to the following formula:

per-layer original attribute penalty-combating value

Comprises the following steps:

per-layer target attribute penalty against value

Comprises the following steps:

L _pix ＝||x ^a -G(x ^a ,a-a)|| ₁ ；

then, the overall face reconstruction loss value L _rec ：

wherein, x represents the original face image,

representing the reconstructed image of the face of a person,

representing real images and creaturesSampling image between formed images, | · | | non-woven phosphor ₁ Indicating the calculated absolute distance, x ^a Representing real images with attribute features a, c ^diff Feature difference vector representing target attribute, D ⁱ Denotes the ith discriminator, G denotes the generator, E _x Indicating the calculation of the expected value, D, sampled by the input image _adv (. cndot.) represents the penalty function of the arbiter,

In this embodiment 2, iteratively adjusting the weights of the face generator and the face discriminator by using a loss gradient back propagation algorithm until convergence according to the loss value specifically includes:

calculating the total loss value L of the discriminator at each layer _D ：

calculating total loss value on generator

Comprises the following steps:

total loss value L by discriminator _D Total loss value on sum generator

In this embodiment 2, a model based on GANs (generic adaptive Networks, anti-generation network) is used, and a face attribute generator with a dual spatial domain attention mechanism and a feature compression excitation mechanism and an anti-generation method for multi-discriminator discrimination are introduced to enhance selectivity at a jump connection layer, so as to implement a multi-attribute face editing function under high-definition pixels. Target attribute difference vectors are injected and spliced in a selective jump connection layer of the generator, and a region needing to be changed is enhanced and guided by utilizing a spatial domain attention mechanism and the injection of spatial domain semantic information, so that irrelevant regions are reserved to a greater extent. Meanwhile, the generation capability of the model is enhanced by the application of the multiple discriminators, so that the face attribute editing under the requirement of high resolution has rich detail retention and successful attribute transformation.

Example 3

In embodiment 3 of the present invention, a method for generating a high-definition face multi-attribute editing countermeasure is provided, so as to solve the problems of information loss or unobvious editing effect of face feature details, poor editing effect of the overall style attributes of a face, and limited capability of generating a high-definition picture by using a model, thereby implementing reconstruction of a face image edited by a high-definition (512 × 512) face multi-attribute.

As shown in fig. 1, the high definition face multi-attribute editing of the present invention includes:

step 100: processing and acquiring a certain number of high-definition to low-definition face images with different resolutions and target attribute feature difference vectors;

step 200: based on a generator of a dual spatial domain attention mechanism and a feature extraction mechanism, connecting the coded images with different resolutions and target attribute feature difference vectors by using a residual error module and the attention mechanism, injecting the coded images into a decoder, introducing spatial domain semantic segmentation face information into an upper sampling layer of the decoder, and decoding and outputting synthetic images with different resolutions through a feature extraction excitation mechanism;

step 300: respectively calculating corresponding loss values according to real face images with different resolutions and synthetic images with corresponding sizes based on a multi-level discriminator;

step 400: according to the loss value, a back propagation algorithm is carried out on a plurality of discriminators by using a loss gradient, and the weights of the face generator and the face discriminators are iteratively adjusted until convergence, so that the current face generator is obtained;

step 500: and based on the current face generator, processing the real input image and the target attribute feature difference vector to obtain an edited face image with the target attribute.

In step 200, the generator based on the dual spatial domain attention mechanism and the feature extraction mechanism and in each layer of jump connection, and a reconstructed face image is obtained through the face generator.

In step 300, the face feature encoder loss values of the loss function include: the method comprises the steps of obtaining a confrontation loss value of a multi-level original attribute and target attribute face image, a multi-level original attribute face reconstruction loss value, and a multi-level target attribute generation original attribute face reconstruction loss value and a multi-level attribute classification loss value.

Specifically, the loss value of the loss function is calculated according to the following formula:

in this example 3, the loss value of the loss function is calculated according to the following formula:

per-layer original attribute penalty-combating value

Comprises the following steps:

per-layer target attribute penalty against value

Comprises the following steps:

loss value L of original attribute face reconstruction of each layer _pix Comprises the following steps:

L _pix ＝||x ^a -G(x ^a ,a-a)|| ₁ ；

then, the overall face reconstruction loss value L _rec ：

wherein, x represents the original face image,

representing the reconstructed image of the face of a person,

In this embodiment 3, iteratively adjusting the weights of the face generator and the face discriminator by using a loss gradient back propagation algorithm until convergence according to the loss value specifically includes:

calculating the total loss value L of the discriminator at each layer _D ：

calculating total loss value on generator

Comprises the following steps:

wherein λ is _clsr 、λ _clsf 、λ _rec 、λ _g Respectively, the loss value L _clsr 、L _clsf 、L _rec 、L _G Weight of (2)；

Total loss value L by discriminator _D Total loss value on sum generator

In this embodiment 3, a model based on GANs (generic adaptive Networks, robust generation network) is used, a face attribute generator with a dual spatial domain attention mechanism and a feature compression excitation mechanism, a multi-discriminant decision and a robust generation method of a loss function are introduced through selective enhancement of jump connection, and a multi-attribute face editing function under high definition pixels is implemented. After the skip selective connection layer of the generator is coded, the input image of the residual error module is spliced with the target attribute difference vector, and the spatial domain attention mechanism is utilized to enhance the characteristic information of the target attribute. Meanwhile, the space domain semantic information is injected into the decoder, and a compression excitation mechanism is adopted to further emphasize and fuse the characteristics, so that the up-sampling process of the decoder is completed, the enhanced guidance of the region needing to be changed is realized, and the irrelevant region is reserved to a greater extent.

In order to enable the face reconstruction model based on the GANs to have good capability of detail generation or editing, in this embodiment 3, a multi-discriminator is adopted to discriminate generated images with different resolutions, and this kind of image discrimination from low to high resolution and multiple gradients are reflowed simultaneously, so that the generation capability of the model is enhanced, and the face attribute editing under the requirement of high resolution has rich detail preservation and successful attribute transformation.

In order to supervise the training process of the model, the embodiment adopts the confrontation loss value of the multi-level original attribute and target attribute face image, the multi-level original attribute face reconstruction loss value, the multi-level target attribute generation original attribute face reconstruction loss value and the multi-level attribute classification loss value for constraint. Specifically, the confrontation loss improves the fidelity of the synthesized face image by punishing the difference between the data distribution of the synthesized face and the data distribution of the real face; attribute classification loss causes the composite image to have the characteristics of target attribute editing; reconstruction loss is mainly aimed at target irrelevant areas in the input image, and irrelevant area change in the image is reduced, so that the model has the capability of generating the similar original image.

Example 4

As shown in fig. 2 and fig. 3, in this embodiment 4, a high-definition face multi-attribute editing countermeasure generating method is provided, in which an original attribute high-definition face image and a target attribute difference vector are respectively input to a face attribute editing generator based on a double attention mechanism, and a face image with a target attribute is synthesized. The method specifically comprises the following steps:

step S1: an original attribute high-definition face image is input into a decoder, as shown in fig. 3, the decoder is spliced with a target attribute difference vector after passing through a residual error module, meanwhile, an attention feature map is obtained based on a spatial domain attention mechanism as shown in fig. 4, and the attention feature map is injected into decoders of different layers through jumping connection.

Step S12: as shown in fig. 5, semantic segmentation maps with different resolution sizes are injected into the decoder, and the semantic information and the output connection of the decoder in the previous layer are simultaneously transmitted to the decoder in the next layer by using a compression excitation mechanism.

Step S13: as shown in fig. 3, the multi-discriminator discriminates the images with different resolutions output by each layer, and further constrains and optimizes the generator to synthesize a face image with target attributes.

In this embodiment 3, the face contrast loss, the attribute classification loss, and the reconstruction loss are calculated for the real and reconstructed face images, and the weight values of the face attribute editing generator and the plurality of discriminators are iteratively adjusted by using the loss gradient back propagation algorithm until convergence. The method specifically comprises the following steps:

step S21: and inputting the obtained target attribute face image and the real face image into a loss function together based on the obtained target attribute face image, and calculating the value of the loss function. The loss function is divided into three parts: face confrontation loss value, face attribute classification loss value and face reconstruction loss value.

Step S22: and inputting the generated image into the generator again based on the target attribute face image obtained in the step, generating an original attribute face image and calculating a face reconstruction loss value.

Step S23: based on the calculated face reconstruction loss value, the face confrontation loss value and the face attribute classification loss value, the weights of the face generator and the discriminator are iteratively adjusted by utilizing a gradient back propagation algorithm until convergence. And editing the target attribute by using the obtained face generator to obtain a high-definition face image with corresponding attribute.

In summary, in the face image reconstruction method provided in this embodiment 4, first, a difference vector between a real face image and a target attribute feature is obtained; then, a face reconstructor based on a double spatial domain attention mechanism and a compression excitation mechanism obtains a synthetic image according to each pair of real face images and the target attribute feature vector; based on multiple human face discriminators with multiple resolutions, calculating loss values of image loss functions according to each real human face image and the corresponding synthetic image; iteratively adjusting the weights of the face reconstructor and the face discriminators by using a loss gradient back propagation algorithm until convergence according to the loss value to obtain a current face reconstructor; and finally, based on the current face reconstructor, obtaining a face image with target attribute characteristics according to the face image to be processed and the corresponding target age characteristic vector. In this embodiment 3, a multi-attribute face editing function under high definition pixels is implemented by the invention of a dual attention mechanism, feature compression extraction, and a multi-discriminator confrontation network structure. The problems that information is lost or the editing effect is not obvious, the whole style attribute editing effect of the human face is poor and the capability of generating high-definition pictures by the model is limited in the human face feature details of the traditional model are solved, so that the human face attribute editing under the requirement of high resolution has richer detail retention and successful attribute transformation.

Example 5

Embodiment 5 of the present invention provides a non-transitory computer-readable storage medium, which is used for storing computer instructions, and when the computer instructions are executed by a processor, the instructions implement the face image reconstruction method as described above, where the method includes:

acquiring a face image to be processed;

acquiring a target attribute feature difference vector corresponding to a face image to be processed;

EXAMPLE 6

Embodiment 6 of the present invention provides a computer program (product) including a computer program, when running on one or more processors, for implementing the face image reconstruction method as described above, the method including:

acquiring a face image to be processed;

Example 7

Embodiment 7 of the present invention provides an electronic device, including: a processor, a memory, and a computer program; wherein a processor is connected with the memory, the computer program is stored in the memory, when the electronic device runs, the processor executes the computer program stored in the memory, so as to make the electronic device execute the face image reconstruction method, the method comprises:

acquiring a face image to be processed;

In summary, the face image reconstruction method and system according to the embodiments of the present invention use a model based on GANs (generic adaptive Networks, anti-generation Networks), enhance selectivity at a jump connection layer, and implement a multi-attribute face editing function under high definition pixels by introducing a face attribute generator with a dual spatial attention mechanism and a feature compression excitation mechanism, and a countervailing generation method with multi-discriminator discrimination. Target attribute difference vectors are injected and spliced in a selective jump connection layer of the generator, and a region needing to be changed is enhanced and guided by utilizing a spatial domain attention mechanism and the injection of spatial domain semantic information, so that irrelevant regions are reserved to a greater extent. Meanwhile, the generation capability of the model is enhanced by the application of the multiple discriminators, so that the face attribute editing under the requirement of high resolution has rich detail retention and successful attribute transformation.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Although the embodiments of the present invention have been described with reference to the accompanying drawings, it is not intended to limit the scope of the present invention, and it should be understood by those skilled in the art that various modifications and variations can be made without inventive efforts based on the technical solutions disclosed in the present invention.

Claims

1. A face image reconstruction method is characterized by comprising the following steps:

acquiring a face image to be processed;

processing the face image to be processed and the corresponding target attribute feature difference vector by using a face reconstruction model which is constructed in advance to obtain a face reconstruction image with target attribute features; the pre-constructed face reconstruction model is obtained by training a training set, wherein the training set comprises a plurality of real face images with different resolutions and corresponding target attribute feature difference vectors;

the face reconstruction model obtained by training of the training set comprises the following steps:

acquiring a plurality of real face images with different resolutions;

iteratively adjusting the weights of the face reconstructor and the face discriminator by utilizing a multi-level loss gradient back propagation algorithm until convergence according to the loss value to obtain a face reconstruction model;

obtaining real synthetic images with different resolutions of target attribute features, comprising:

connecting the coded real images with different resolutions and target attribute feature difference vectors by using a residual module and an attention mechanism, injecting the connected real images into a decoder, introducing space domain semantic segmentation facial information into an upper sampling layer of the decoder, and decoding and outputting real synthetic images with different resolutions through a feature extraction excitation mechanism;

the loss values include: the method comprises the steps of generating a multilevel original attribute face reconstruction loss value and a multilevel attribute classification loss value by using a multilevel original attribute countermeasure loss value, a multilevel target attribute countermeasure loss value, a multilevel original attribute face reconstruction loss value and a multilevel attribute classification loss value;

per-layer original attribute counter-loss value

Comprises the following steps:

per-layer target attribute penalty against value

Comprises the following steps:

L _pix ＝||x ^a -G(x ^a ,a-a)|| ₁ ；

then, the overall face reconstruction loss value L _rec ：

wherein, x represents the original face image,

representing the reconstructed image of the face of a person,

2. The method of claim 1, wherein iteratively adjusting weights of the face reconstructor and the face discriminator by using a multi-level gradient back propagation algorithm until convergence comprises:

calculating the total loss value L of the discriminator at each layer _D ：

calculating total loss value on generator

Comprises the following steps:

total loss value L by discriminator _D Total loss value on sum generator

3. A face image reconstruction system, comprising:

the reconstruction module is used for processing the face image to be processed and the corresponding target attribute feature difference vector by utilizing a face reconstruction model which is constructed in advance to obtain a face reconstruction image with target attribute features; the pre-constructed face reconstruction model is obtained by training a training set, wherein the training set comprises a plurality of real face images with different resolutions and corresponding target attribute feature difference vectors;

acquiring a plurality of real face images with different resolutions;

per-layer original attribute penalty-combating value

Comprises the following steps:

per-layer target attribute penalty against value

Comprises the following steps:

L _pix ＝||x ^a -G(x ^a ,a-a)|| ₁ ；

then, the overall face reconstruction loss value L _rec ：

wherein, x represents the original face image,

representing the reconstructed image of the face of a person,

4. A non-transitory computer-readable storage medium storing computer instructions which, when executed by a processor, implement the facial image reconstruction method of any one of claims 1-2.

5. A computer program for implementing a method for reconstructing a face image according to any one of claims 1-2 when run on one or more processors.

6. An electronic device, comprising: a processor, a memory, and a computer program; wherein a processor is connected to the memory, a computer program being stored in the memory, the processor executing the computer program stored in the memory when the electronic device is running, to cause the electronic device to perform the face image reconstruction method according to any one of claims 1-2.