CN115082292A - Face multi-attribute editing method based on a global attribute editing direction

Face multi-attribute editing method based on a global attribute editing direction

Info

Publication number
CN115082292A
Application number
CN202210628783.2A
Authority
CN (China)
Prior art keywords
attribute, editing, representing, global, network
Legal status
Pending
Other languages
Chinese (zh)
Inventors
Xu Xuemiao (徐雪妙), Zeng Ruihua (曾瑞华), Xu Yangyang (徐洋洋)
Original and current assignee
South China University of Technology (SCUT)
Events
Application filed by South China University of Technology (SCUT), with priority to CN202210628783.2A; published as CN115082292A.

Classifications

    • G06T 3/04 (Physics; Computing; Image data processing or generation): context-preserving geometric image transformations in the plane of the image, e.g. by using an importance map
    • G06N 3/08 (Physics; Computing; Computing arrangements based on biological models): neural networks; learning methods
    • G06V 10/764 (Physics; Computing; Image or video recognition or understanding): arrangements using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V 10/82 (Physics; Computing; Image or video recognition or understanding): arrangements using pattern recognition or machine learning, using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a face multi-attribute editing method based on a global attribute editing direction, comprising the following steps: 1) acquire a dataset attribute association graph, an attribute semantic embedding set, and a scale factor; 2) construct a global attribute editing network whose inputs are the dataset attribute association graph, the attribute semantic embedding set, and the scale factor, and whose output is a global attribute editing direction; 3) design three objective loss functions to optimize the constructed network, and save the optimized network as a model; 4) in the model testing stage, perform multi-attribute editing on a face through a user-defined scale factor and the saved model. The method edits face attributes along a global attribute editing direction, avoids the repeated single-attribute edits that other methods require, and produces a more natural and plausible multi-attribute editing effect with better preservation of facial appearance.

Description

Face multi-attribute editing method based on a global attribute editing direction
Technical Field
The invention relates to the technical field of latent-space attribute editing along editing directions, and in particular to a face multi-attribute editing method based on a global attribute editing direction, which edits multiple attributes of a given real face image to obtain an edited face image with a strong attribute editing effect and well-preserved facial appearance.
Background
Face attribute editing covers both coarse-grained edits, such as face aging and face conversion, and fine-grained edits, such as modifying facial expression or hair color. Face editing therefore plays an important role in daily life and practical applications, and research on it has attracted wide attention from academia and industry in recent years.
Most existing work focuses on single-attribute face editing; research on multi-attribute face editing remains scarce. Although existing single-attribute methods can reach a multi-attribute result by applying several single-attribute edits in sequence, repeatedly editing individual attributes may push the edited face beyond the editing-space boundary in the latent space, leaving the multi-attribute result with severe artifacts or ghost faces. At the same time, some single-attribute methods change unrelated attributes when the face is edited repeatedly, so that identity information is lost, or conflicts between the edited attributes prevent the target attributes from being edited plausibly.
Disclosure of Invention
The invention aims to overcome the defects and shortcomings of the prior art by providing a face multi-attribute editing method based on a global attribute editing direction. The method learns a global attribute editing direction through a global attribute editing network, removes the need of single-attribute methods to edit one attribute at a time when multiple attributes must change, and at the same time achieves a better multi-attribute editing effect and better preservation of facial appearance.
To achieve this purpose, the technical scheme provided by the invention is as follows. The face multi-attribute editing method based on a global attribute editing direction comprises the following steps:
1) acquiring a dataset attribute association graph, an attribute semantic embedding set, and a scale factor;
compute the dataset attribute association graph from the attribute labels of every face in the CelebA-HQ face dataset; feed a set of attribute description texts into a pre-trained CLIP text encoder to obtain the attribute semantic embedding set; select an original face image and a target face image from the CelebA-HQ face dataset, pass both through a pre-trained latent encoder to obtain an original latent code and a target latent code, pass each latent code through a pre-trained latent attribute classifier to obtain an original attribute score and a target attribute score, and compute the difference between the two scores to obtain the scale factor;
2) model construction
Construct a global attribute editing network whose inputs are the dataset attribute association graph, the attribute semantic embedding set, and the scale factor, and whose output is the global attribute editing direction. The network consists of an improved graph convolutional network and an improved fully connected network. The improved graph convolutional network takes the dataset attribute association graph and the attribute semantic embedding set as input and outputs an initial attribute editing direction, i.e., an initialized editing direction for every attribute. The improved fully connected network takes the initial attribute editing direction and the scale factor as input and outputs the global attribute editing direction; the scale factor adjusts the editing strength of each initialized direction, yielding a more accurate global attribute editing direction;
3) model optimization
To edit the target attributes well while retaining identity information, three objective loss functions are designed to optimize the constructed global attribute editing network. After optimization, the optimal global attribute editing network is obtained, comprising an optimal improved graph convolutional network and an optimal improved fully connected network;
4) face attribute editing
In the model testing stage, the global attribute editing direction output by the optimal global attribute editing network is applied to a given face image in the latent space, yielding a multi-attribute edited face image with a strong editing effect and preserved identity information.
Further, step 1) comprises the following steps:
a. Obtaining the dataset attribute association graph
The CelebA-HQ face dataset contains 30000 face images in total, each labeled with 40 attributes; a label value of 1 for an attribute means the face image has that attribute, and -1 means it does not. For a single face image, the image attribute association graph A_i is essentially an adjacency matrix. First initialize A_i to all zeros; if the labels on the j-th and k-th attributes of the face image are both 1, set the corresponding positions of A_i to 1, i.e., change A_i[j][k] and A_i[k][j] to 1. Traverse all combinations of j and k, then set the main diagonal of the adjacency matrix to 1. The resulting adjacency matrix is the image attribute association graph A_i. The dataset attribute association graph \bar{A} is obtained by summing the 30000 image attribute association graphs and normalizing:

A = \sum_{i=1}^{n} A_i

\bar{A} = D^{-1} A

where A_i is the attribute association graph of one face image; n is the number of face images in the CelebA-HQ face dataset, namely 30000; A is the matrix obtained by summing the attribute association graphs of the 30000 images; D^{-1} is the inverse of the degree matrix of the summed matrix A; and \bar{A} is the normalized matrix, i.e., the dataset attribute association graph to be finally obtained;
b. Attribute semantic embedding set
First initialize a set of attribute description texts, each an English string describing the corresponding attribute. Passing these strings through a pre-trained CLIP text encoder produces the corresponding attribute semantic embeddings, which together form the attribute semantic embedding set:

E = [e_1, e_2, \ldots, e_M]^T

where E is the attribute semantic embedding set, a vector of length M; M is the number of attributes and also the length of the vector; T denotes transposition; and e_1, e_2, ..., e_M are the semantic embeddings of the first, second, ..., M-th attributes;
c. Scale factor
Randomly select two face images from the CelebA-HQ face dataset as the original image I_o and the target image I_t, and pass them through the pre-trained latent encoder to obtain the original latent code w_o and the target latent code w_t, the mappings of I_o and I_t in the latent space. Passing w_o and w_t through the pre-trained latent attribute classifier yields the original attribute score S_o and the target attribute score S_t:

S_o = C(w_o) = [s_1^o, s_2^o, \ldots, s_M^o]^T

S_t = C(w_t) = [s_1^t, s_2^t, \ldots, s_M^t]^T

where w_o and w_t are the original and target latent codes; C is the pre-trained latent attribute classifier; S_o and S_t are vectors of length M, with M the number of attributes (and the length of the attribute score vectors) and T the transpose; and s_i^o and s_i^t denote the values of S_o and S_t at the i-th position.
Having obtained S_o and S_t, the scale factor can be calculated as:

\alpha_i = \begin{cases} s_i^t - s_i^o, & (s_i^t - 0.5)(s_i^o - 0.5) < 0 \\ 0, & \text{otherwise} \end{cases}

\text{Alpha} = [\alpha_1, \alpha_2, \ldots, \alpha_M]^T

where s_i^t and s_i^o are the values of S_t and S_o at the i-th position; Alpha is the scale factor; \alpha_i is its value at the i-th position; M is the number of attributes; and T denotes the transpose of the vector.
Further, the step 2) comprises the following steps:
Construct the global attribute editing network, whose inputs are the dataset attribute association graph, the attribute semantic embedding set, and the scale factor, and whose output is the global attribute editing direction. The network consists of an improved graph convolutional network and an improved fully connected network, as follows:
a. Improved graph convolutional network
Construct an improved graph convolutional network comprising two network modules, denoted block1 and block2. The input of block1 is the dataset attribute association graph and the attribute semantic embedding set, and its output is an intermediate variable; the input of block2 is the intermediate variable output by block1 together with the dataset attribute association graph, and its output is the initial attribute editing direction. The whole process is expressed as:

N_{init} = \sigma(\bar{A} \, \sigma(\bar{A} E W_1) \, W_2)

where N_init is the initial attribute editing direction output by the improved graph convolutional network; \bar{A} is the dataset attribute association graph; E is the attribute semantic embedding set; W_1 and W_2 are the weight parameters to be learned in the two blocks of the improved graph convolutional network; juxtaposition denotes matrix (dot) multiplication; and \sigma(\cdot) is a nonlinear activation function;
b. Improved fully connected network
Construct an improved fully connected network comprising two fully connected layers, denoted Linear1 and Linear2. The input of Linear1 is the initial attribute editing direction scaled by the scale factor, with a Leaky-ReLU activation function; the output of Linear1 is the input of Linear2, which has no activation function and finally outputs the global attribute editing direction. The whole process is expressed as:

N_g = F(N_{init} \times \text{Alpha})

where N_g is the global attribute editing direction output by the improved fully connected network; F is the improved fully connected network; N_init is the initial attribute editing direction, with dimensions [40, 512]; Alpha is the scale factor, with dimensions [40, 1]; and \times is the product operation, which multiplies each 512-dimensional vector in N_init by the corresponding value in Alpha.
Further, in step 3), to edit the target attributes well while retaining identity information, three objective loss functions are designed to optimize the constructed global attribute editing network; after optimization, the optimal global attribute editing network is obtained, comprising an optimal improved graph convolutional network and an optimal improved fully connected network. The optimization uses a gradient descent algorithm to adjust the weight parameters of the constructed network so that the values of the three objective functions become as small as possible. The three objective loss functions are as follows:
a. Multi-attribute editing loss
Given the original latent code w_o and the target latent code w_t, the latent attribute classifier yields the original attribute score S_o and the target attribute score S_t. Adding the obtained global attribute editing direction to w_o gives the edited latent code, and passing the edited latent code through the pre-trained latent attribute classifier gives the edited attribute score:

S_e = C(w_o + N_g)

where S_e is the edited attribute score, C is the latent attribute classifier, w_o is the original latent code, and N_g is the global attribute editing direction output by the global attribute editing network.
To push the edited attribute score S_e as close as possible to the target attribute score S_t, the multi-attribute editing loss is designed as a cross-entropy over the attributes selected for editing:

L_{mae} = -\sum_{i=1}^{M} \mathbb{1}[(s_i^t - 0.5)(s_i^o - 0.5) < 0] \left( s_i^t \log s_i^e + (1 - s_i^t) \log(1 - s_i^e) \right)

where L_mae is the multi-attribute editing loss; s_i^t, s_i^e, and s_i^o are the values of the target, edited, and original attribute scores at position i; \mathbb{1}[\cdot] is the indicator function; and log is the logarithm;
b. Multi-attribute retention loss
To keep the attributes that are not edited unchanged, the multi-attribute retention loss is designed:

L_{map} = \sum_{i=1}^{M} \mathbb{1}[(s_i^t - 0.5)(s_i^o - 0.5) > 0] \, \| s_i^e - s_i^o \|_2

where L_map is the multi-attribute retention loss; s_i^t, s_i^e, and s_i^o are the values of the target, edited, and original attribute scores at position i; \mathbb{1}[\cdot] is the indicator function; and \|\cdot\|_2 is the l_2-norm;
c. Spatial preservation loss
To prevent the original latent code from changing excessively during editing, the spatial preservation loss is designed:

L_{sp} = \| N_g \|_2

where L_sp is the spatial preservation loss, N_g is the global attribute editing direction, and \|\cdot\|_2 is the l_2-norm.
Further, in step 4), in the model testing stage, the global attribute editing direction output by the optimal global attribute editing network is applied to a given face image in the latent space to obtain a multi-attribute edited face image with a strong editing effect and preserved identity information, comprising the following steps:
4.1) In the testing stage, pass the dataset attribute association graph and the attribute semantic embedding set through the optimal improved graph convolutional network to obtain the initial attribute editing direction:

N_{init} = M_{GCN}(\bar{A}, E)

where N_init is the initial attribute editing direction, M_GCN is the optimal improved graph convolutional network, \bar{A} is the dataset attribute association graph, and E is the attribute semantic embedding set;
4.2) Input the user-defined scale factor and the initial attribute editing direction into the optimal improved fully connected network, which outputs the global attribute editing direction:

N_g = M_F(N_{init} \times \text{Alpha}_{test})

where N_g is the global attribute editing direction, M_F is the optimal improved fully connected network, N_init is the initial attribute editing direction, and Alpha_test is the user-defined scale factor;
4.3) For a given original image, pass it through the pre-trained latent encoder to obtain the original latent code, apply the global attribute editing direction to the original latent code to obtain the edited latent code, and finally send the edited latent code to the pre-trained decoder, which outputs the final multi-attribute edited face image:

I_e = G(w_o + N_g)

where I_e is the multi-attribute edited face image, G is the pre-trained decoder, w_o is the original latent code, and N_g is the global attribute editing direction.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. By finding a global attribute editing direction with a deep learning network, the invention removes the need of existing single-attribute methods to edit face attributes many times when performing multi-attribute editing, making editing simpler.
2. Compared with other attribute editing work, the invention has shorter inference time and edits multiple attributes of one face faster.
3. Compared with most other attribute editing work, the invention can edit more attributes on a face.
4. The invention produces a more plausible and natural effect when editing multiple attributes of a face, and preserves facial appearance well.
Drawings
FIG. 1 is a logic flow diagram of the method of the present invention.
FIG. 2 illustrates the construction of the dataset attribute association graph.
FIG. 3 is a schematic diagram of the scale factor computation.
FIG. 4 is a schematic diagram of the improved graph convolutional network constructed by the invention.
FIG. 5 is a schematic diagram of the improved fully connected network constructed by the invention.
FIG. 6 is a schematic diagram of the face multi-attribute editing process of the invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
As shown in FIG. 1, the present embodiment provides a face multi-attribute editing method based on a global attribute editing direction, which comprises the following steps:
1) Acquire the dataset attribute association graph, the attribute semantic embedding set, and the scale factor.
The process of obtaining the dataset attribute association graph is shown in FIG. 2 and described as follows. First download the CelebA-HQ dataset from the Internet; it contains 30000 faces, each image with resolution 1024 × 1024, as in the leftmost picture of FIG. 2. Each picture is labeled with 40 attributes, namely the standard CelebA attribute set: 5 o'clock shadow, arched eyebrows, attractive, bags under eyes, bald, bangs, big lips, big nose, black hair, blond hair, blurry, brown hair, bushy eyebrows, chubby, double chin, eyeglasses, goatee, gray hair, heavy makeup, high cheekbones, male, mouth slightly open, mustache, narrow eyes, no beard, oval face, pale skin, pointy nose, receding hairline, rosy cheeks, sideburns, smiling, straight hair, wavy hair, wearing earrings, wearing hat, wearing lipstick, wearing necklace, wearing necktie, young. The label value of each attribute is 1 or -1: a value of 1 means the face image has the attribute, and -1 means it does not. For a given face image, the image attribute association graph A_i is computed by initializing an all-zero adjacency matrix A_i of dimensions [40, 40]; if the labels on the j-th and k-th attributes of the face image are both 1, the corresponding positions of the matrix are set to 1, i.e., A_i[j][k] and A_i[k][j] become 1. All combinations of j and k are traversed, with j and k in the range [0, 39], and the value of the main diagonal of A_i is set to 1, i.e., A_i[j][j] = 1. The resulting adjacency matrix is the image attribute association graph A_i, and each image attribute association graph is a symmetric matrix. The middle part of FIG. 2 represents the attribute association graphs of the 30000 face images, i.e., 30000 matrices of dimensions [40, 40]. The dataset attribute association graph \bar{A} is obtained by summing and normalizing the 30000 image attribute association graphs:

A = \sum_{i=1}^{n} A_i

\bar{A} = D^{-1} A

where A_i is the attribute association graph of one face image; n is the number of face images in the CelebA-HQ face dataset, namely 30000; A is the matrix obtained by summing the attribute association graphs of the 30000 images, with dimensions [40, 40]; D^{-1} is the inverse of the degree matrix of the summed matrix A, where the degree matrix is obtained by summing each row of A, placing the row sums on the main diagonal, and setting all off-diagonal values to 0; and \bar{A} is the normalized matrix, each value of which lies in [0, 1], i.e., the dataset attribute association graph finally obtained.
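As a concrete illustration of this computation, the following is a minimal sketch, assuming the 40 labels per image are available as a NumPy array of values in {-1, +1}; all variable names are illustrative:

```python
import numpy as np

def image_association_graph(labels: np.ndarray) -> np.ndarray:
    """Adjacency matrix A_i for one image from its 40 labels in {-1, +1}."""
    present = labels == 1                            # attributes this face has
    A_i = np.outer(present, present).astype(float)   # A_i[j][k] = 1 iff j and k co-occur
    np.fill_diagonal(A_i, 1.0)                       # main diagonal set to 1
    return A_i                                       # symmetric, shape [40, 40]

def dataset_association_graph(all_labels: np.ndarray) -> np.ndarray:
    """Sum the 30000 per-image graphs and normalize: A_bar = D^{-1} A."""
    A = sum(image_association_graph(l) for l in all_labels)
    D_inv = np.diag(1.0 / A.sum(axis=1))             # inverse of the degree matrix
    return D_inv @ A                                 # every entry falls in [0, 1]
```

With this row normalization, each row of the result sums to 1, which is consistent with every value lying in [0, 1] as stated above.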
the specific process for obtaining attribute semantic embedding is as follows: first, a set of 40 attribute description texts is initialized, each attribute description text is an english character string describing a corresponding attribute, for example, the description of the smile attribute is 'smiling'. The 40 English character strings are embedded with 40 corresponding attribute semantemes by a pre-trained CLIP text encoder, and the CLIP text encoder can be directly downloaded from the Internet. Each attribute description text outputs a semantic embedding, the dimension of each semantic embedding is [1,512], the 40 attribute semantic embeddings are attribute semantic embedding sets, the dimension is [40,512], and the semantic embeddings can be expressed as:
E=[e 1 ,e 2 ,...,e M ] T
in the formula, E represents attribute semantic embedded set, E is a vector with the length of M, M represents the number of attributes and the value of 40, and also represents the length of the vector, T represents transposition of the vector, E 1 Semantic Embedded representation representing the first Attribute, e 2 Semantic Embedded representation representing the second Attribute, e M A semantic embedded representation representing an Mth attribute;
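For illustration, a sketch of this step using the public OpenAI CLIP package; the ViT-B/32 text encoder outputs 512-dimensional embeddings, matching the [40, 512] set described above. The prompt strings shown are illustrative, not the patent's exact descriptions:

```python
import clip   # https://github.com/openai/CLIP
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)       # pre-trained CLIP text encoder

# one short English description per attribute (illustrative subset of the 40)
descriptions = ["smiling", "blond hair", "eyeglasses", "young"]

with torch.no_grad():
    tokens = clip.tokenize(descriptions).to(device)   # tokenized prompts
    E = model.encode_text(tokens).float()             # [len(descriptions), 512] embeddings
```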
the specific process for obtaining the scale factor is shown in fig. 3, and the specific process is described as follows: randomly selecting two face images in CelebA-HQ face data set as original imagesI o And a target image I t The original image I is processed o And a target image I t The original hidden variable w is obtained by the pre-trained hidden variable encoder o And a target hidden variable w t ,w o And w t Respectively represent the original image I o And a target image I t Mapping in hidden space, w o And w t Has a dimension of [1,18,512 ]](ii) a Implicit variable w o And w t The method comprises the steps of pre-training a hidden variable attribute classifier, wherein an activation function of the last layer of the hidden variable attribute classifier is a sigmoid (·) function, and obtaining an original attribute score S o And a target attribute score S t Expressed as:
Figure BDA0003678980040000111
Figure BDA0003678980040000112
in the formula, w o And w t Respectively representing original hidden variables and target hidden variables, C representing a pre-trained hidden variable attribute classifier, S o Represents the original attribute score, S t Representing a target attribute score, S o And S t Is a vector of length M, with dimensions [1,40 ]]The value at each position is in the range of [0,1 ]]M denotes the number of attributes, and also the length of the attribute score vector, and has a value of 40, T denotes the transpose of the vector,
Figure BDA0003678980040000113
represents the original attribute score S o The value at the first position is,
Figure BDA0003678980040000114
represents the original attribute score S o The value at the second position is such that,
Figure BDA0003678980040000115
represents the original attribute score S o The value at the M-th position,
Figure BDA0003678980040000121
representing a target attribute score S t The value at the first position is,
Figure BDA0003678980040000122
representing a target attribute score S t The value at the second position is such that,
Figure BDA0003678980040000123
representing a target attribute score S t A value at the Mth position;
obtaining the original attribute score S o And a target attribute score S t We can then calculate a scaling factor, expressed as:
Figure BDA0003678980040000124
Alpha=[α 12 ,...,α M ] T
in the formula (I), the compound is shown in the specification,
Figure BDA0003678980040000125
representing a target attribute score S t The value at the i-th position,
Figure BDA0003678980040000126
represents the original attribute score S o The value at the ith position, Alpha, represents a scale factor with dimensions [1,40 ]],α i Representing the value of the scale factor at the i-th position, alpha 1 Representing the value of the scale factor at a first position, alpha 2 Representing the value of the scale factor at the second position, α M Denotes the value of the scale factor at the mth position, M denotes the number of attributes, and T denotes the transpose of the vector. Conditions therein
Figure BDA0003678980040000127
Means if
Figure BDA0003678980040000128
And
Figure BDA0003678980040000129
one greater than 0.5 and the other less than 0.5, since the two values represent scores of the property, the actual representation is if the original hidden variable and the target hidden variable have one property and the other does not have the property.
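A minimal sketch of the scale-factor computation as written above, in PyTorch; S_o and S_t are the [1, 40] score tensors, and the function name is illustrative:

```python
import torch

def scale_factor(S_o: torch.Tensor, S_t: torch.Tensor) -> torch.Tensor:
    """Alpha[i] = S_t[i] - S_o[i] where the two scores lie on opposite sides
    of 0.5 (i.e., the attribute must flip), and 0 elsewhere."""
    flips = (S_t - 0.5) * (S_o - 0.5) < 0          # attributes selected for editing
    return torch.where(flips, S_t - S_o, torch.zeros_like(S_t))
```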
2) Model construction.
Construct the global attribute editing network, whose inputs are the dataset attribute association graph, the attribute semantic embedding set, and the scale factor, and whose output is the global attribute editing direction. The network consists of an improved graph convolutional network and an improved fully connected network, as follows:
a. Improved graph convolutional network
Construct an improved graph convolutional network, as shown in FIG. 4. It comprises two network modules, denoted block1 and block2. The input of block1 is the dataset attribute association graph and the attribute semantic embedding set, and its output is an intermediate variable; the input of block2 is the intermediate variable output by block1 together with the dataset attribute association graph, and its output is the initial attribute editing direction. The whole process can be simplified and expressed as:

N_{init} = \sigma(\bar{A} \, \sigma(\bar{A} E W_1) \, W_2)

where N_init is the initial attribute editing direction output by the improved graph convolutional network, with dimensions [40, 512]; \bar{A} is the dataset attribute association graph, with dimensions [40, 40]; E is the attribute semantic embedding set, with dimensions [40, 512]; W_1 and W_2 are the weight parameters to be learned in the two blocks, each with dimensions [512, 512]; juxtaposition denotes matrix (dot) multiplication; and \sigma(\cdot) is a nonlinear activation function, specifically the Leaky-ReLU function.
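A sketch of the two-block module under the reading above, in PyTorch; splitting the learnable weights into one matrix per block and the exact placement of the activations are assumptions:

```python
import torch
import torch.nn as nn

class ImprovedGCN(nn.Module):
    """block1 / block2: each multiplies by A_bar, mixes features, and activates."""
    def __init__(self, dim: int = 512):
        super().__init__()
        self.W1 = nn.Parameter(torch.randn(dim, dim) * 0.01)  # [512, 512]
        self.W2 = nn.Parameter(torch.randn(dim, dim) * 0.01)  # [512, 512]
        self.act = nn.LeakyReLU(0.2)

    def forward(self, A_bar: torch.Tensor, E: torch.Tensor) -> torch.Tensor:
        h = self.act(A_bar @ E @ self.W1)       # block1: intermediate variable
        return self.act(A_bar @ h @ self.W2)    # block2: N_init, shape [40, 512]
```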
b. Improved fully connected network
Construct an improved fully connected network, as shown in FIG. 5. It comprises two fully connected layers, denoted Linear1 and Linear2. The input of Linear1 is the initial attribute editing direction scaled by the scale factor, with a Leaky-ReLU activation function; the output of Linear1 is the input of Linear2, which has no activation function and finally outputs the global attribute editing direction. The whole process can be expressed as:

N_g = F(N_{init} \times \text{Alpha})

where N_g is the global attribute editing direction output by the improved fully connected network, with dimensions [1, 512]; F is the improved fully connected network; N_init is the initial attribute editing direction, with dimensions [40, 512]; Alpha is the scale factor, with dimensions [40, 1]; and \times is the product operation, which multiplies each 512-dimensional vector in N_init by the corresponding value in Alpha, equivalent to scaling every value in each 512-dimensional vector by a certain factor.
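A sketch of this module under the same assumptions; how the scaled [40, 512] tensor is reduced to a [1, 512] direction is not spelled out above, so flattening it before Linear1 is one plausible reading, and the layer widths are assumptions:

```python
import torch
import torch.nn as nn

class ImprovedFC(nn.Module):
    """Linear1 (Leaky-ReLU) -> Linear2 (no activation)."""
    def __init__(self, num_attrs: int = 40, dim: int = 512):
        super().__init__()
        self.linear1 = nn.Linear(num_attrs * dim, dim)  # reduction layout assumed
        self.linear2 = nn.Linear(dim, dim)
        self.act = nn.LeakyReLU(0.2)

    def forward(self, N_init: torch.Tensor, Alpha: torch.Tensor) -> torch.Tensor:
        scaled = N_init * Alpha.reshape(-1, 1)       # scale each 512-d row by alpha_i
        h = self.act(self.linear1(scaled.reshape(1, -1)))
        return self.linear2(h)                       # N_g, shape [1, 512]
```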
3) Model optimization.
To edit the target attributes well while retaining identity information, three objective loss functions are designed to optimize the constructed global attribute editing network; after optimization, the optimal global attribute editing network is obtained, comprising an optimal improved graph convolutional network and an optimal improved fully connected network. The optimization uses a gradient descent algorithm to adjust the weight parameters of the constructed network so that the values of the three objective functions become as small as possible. The three objective loss functions are as follows:
the multi-attribute editing loss is calculated as follows: existing original hidden variable w o And a target hidden variable w t Through the hidden variable attribute classifier, the original attribute score S can be obtained o And a target attribute score S t Adding the obtained global property editing direction to the original hidden variable w o The dimension of the global property edit direction is [1,512]]Original hidden variableHas a dimension of [1,18,512 ]]The specific way of adding here is to copy the last dimension vector 18 times in the global property editing direction to obtain the dimension [1,18,512 ]]The global attribute editing direction is added to the original hidden variable one by one, the edited hidden variable can be obtained, the edited hidden variable passes through a pre-trained hidden variable attribute classifier, and the attribute score of the edited hidden variable can be obtained, wherein the process can be expressed as follows:
S e =C(w o +N g )
in the formula, S e Representing the edited attribute score with a dimension of [1, 40%]C denotes a hidden variable attribute classifier, w o Representing original hidden variables, N g A global property edit direction representing an improved fully-connected network output;
to make the edited attribute score S e As close to the target attribute score S as possible t We have designed multi-attribute edit loss, which is specifically expressed as follows:
Figure BDA0003678980040000141
in the formula, L mae Indicating the loss of the multi-property editing,
Figure BDA0003678980040000142
represents the value of the target attribute score at position i,
Figure BDA0003678980040000143
a value representing the edited attribute score at position i,
Figure BDA0003678980040000144
representing the value of the original attribute score at the position i, and log representing logarithmic operation;
the multi-attribute retention penalty is calculated as follows: in order to make the property not edited not change, we design the multi-property retention loss, which is specifically expressed as follows:
Figure BDA0003678980040000151
in the formula, L map Indicating that the multi-attribute retention loss is,
Figure BDA0003678980040000152
represents the value of the target attribute score at position i,
Figure BDA0003678980040000153
a value representing the edited attribute score at position i,
Figure BDA0003678980040000154
the value of the original attribute score at the i position, | | · | | non-woven phosphor 2 Is represented by 2 -a norm;
the spatial retention penalty is calculated as follows: in order to prevent the original hidden variables from being changed excessively during editing, a space conservation loss is designed, and the space conservation loss is specifically expressed as follows:
L sp =||N g || 2
in the formula, L sp Represents the space conservation loss, N g Representing global property editing direction, | · | | non-woven phosphor 2 Is represented by 2 -a norm.
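The three objectives, as written above, in one sketch; score tensors have shape [1, 40], N_g has shape [1, 512], and the indicator masks follow the scale-factor condition:

```python
import torch

def objectives(S_o, S_t, S_e, N_g, eps: float = 1e-8):
    edited = (S_t - 0.5) * (S_o - 0.5) < 0          # attributes being edited
    bce = -(S_t * torch.log(S_e + eps) + (1 - S_t) * torch.log(1 - S_e + eps))
    L_mae = bce[edited].sum()                       # multi-attribute editing loss
    L_map = torch.norm((S_e - S_o)[~edited], p=2)   # multi-attribute retention loss
    L_sp = torch.norm(N_g, p=2)                     # spatial preservation loss
    return L_mae, L_map, L_sp
```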
The improved graph convolutional network and the improved fully connected network are optimized for ten epochs with a gradient descent algorithm, and the resulting networks are saved as the optimal improved graph convolutional network and the optimal improved fully connected network.
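Putting the pieces together, a hypothetical training loop; the optimizer choice, learning rate, equal loss weighting, and the pair loader are all assumptions, while C, A_bar, and E stand for the frozen latent attribute classifier, the dataset attribute association graph, and the embedding set from the sketches above:

```python
import torch

gcn, fc = ImprovedGCN(), ImprovedFC()
params = list(gcn.parameters()) + list(fc.parameters())
optimizer = torch.optim.Adam(params, lr=1e-4)       # "gradient descent"; Adam assumed

for epoch in range(10):                             # ten epochs of optimization
    for w_o, w_t in pair_loader:                    # random pairs of latent codes (assumed)
        S_o, S_t = C(w_o), C(w_t)                   # frozen latent attribute classifier
        Alpha = scale_factor(S_o, S_t)
        N_g = fc(gcn(A_bar, E), Alpha.reshape(-1, 1))
        S_e = C(w_o + N_g.reshape(1, 1, 512))       # broadcast over the 18 style layers
        L_mae, L_map, L_sp = objectives(S_o, S_t, S_e, N_g)
        loss = L_mae + L_map + L_sp                 # loss weights not given; equal here
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```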
4) Face attribute editing.
In the model testing stage, the global attribute editing direction output by the optimal global attribute editing network is applied to a given face image in the latent space to obtain a multi-attribute edited face image with a strong editing effect and preserved identity information, comprising the following steps:
4.1) In the testing stage, pass the dataset attribute association graph and the attribute semantic embedding set through the optimal improved graph convolutional network to obtain the initial attribute editing direction:

N_{init} = M_{GCN}(\bar{A}, E)

where N_init is the initial attribute editing direction, M_GCN is the optimal improved graph convolutional network, \bar{A} is the dataset attribute association graph, and E is the attribute semantic embedding set;
4.2) Input the user-defined scale factor and the initial attribute editing direction into the optimal improved fully connected network, which outputs the global attribute editing direction:

N_g = M_F(N_{init} \times \text{Alpha}_{test})

where N_g is the global attribute editing direction, M_F is the optimal improved fully connected network, N_init is the initial attribute editing direction, and Alpha_test is the user-defined scale factor;
4.3) After the global attribute editing direction is obtained, the flow for generating the edited face is shown in FIG. 6. For a given original image, pass it through the pre-trained latent encoder to obtain the original latent code, apply the global attribute editing direction to the original latent code to obtain the edited latent code, and finally send the edited latent code to the pre-trained decoder, which outputs the final multi-attribute edited face image:

I_e = G(w_o + N_g)

where I_e is the multi-attribute edited face image, G is the pre-trained decoder, w_o is the original latent code, and N_g is the global attribute editing direction.
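A sketch of the full test-time pipeline; encoder and decoder stand for the pre-trained latent encoder and decoder described above (e.g., a GAN-inversion encoder and its generator), and all handles and names are illustrative:

```python
import torch

@torch.no_grad()
def edit_face(I_o, encoder, decoder, gcn, fc, A_bar, E, alpha_test):
    """Encode, shift by the global attribute editing direction, decode."""
    w_o = encoder(I_o)                       # original latent code, [1, 18, 512]
    N_init = gcn(A_bar, E)                   # optimal improved GCN, [40, 512]
    N_g = fc(N_init, alpha_test)             # user-defined scale factors, [1, 512]
    w_e = w_o + N_g.reshape(1, 1, 512)       # broadcast to all 18 style layers
    return decoder(w_e)                      # multi-attribute edited face image
```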
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (5)

1. A face multi-attribute editing method based on a global attribute editing direction, characterized by comprising the following steps:
1) acquiring a dataset attribute association graph, an attribute semantic embedding set, and a scale factor:
computing the dataset attribute association graph from the attribute labels of every face in the CelebA-HQ face dataset; feeding a set of attribute description texts into a pre-trained CLIP text encoder to obtain the attribute semantic embedding set; selecting an original face image and a target face image from the CelebA-HQ face dataset, passing both through a pre-trained latent encoder to obtain an original latent code and a target latent code, passing each latent code through a pre-trained latent attribute classifier to obtain an original attribute score and a target attribute score, and computing the difference between the two scores to obtain the scale factor;
2) model construction:
constructing a global attribute editing network whose inputs are the dataset attribute association graph, the attribute semantic embedding set, and the scale factor, and whose output is the global attribute editing direction, wherein the global attribute editing network consists of an improved graph convolutional network and an improved fully connected network; the improved graph convolutional network takes the dataset attribute association graph and the attribute semantic embedding set as input and outputs an initial attribute editing direction, i.e., an initialized editing direction for every attribute; the improved fully connected network takes the initial attribute editing direction and the scale factor as input and outputs the global attribute editing direction, the scale factor adjusting the editing strength of each initialized direction to yield a more accurate global attribute editing direction;
3) model optimization:
designing, in order to edit the target attributes well while retaining identity information, three objective loss functions to optimize the constructed global attribute editing network, and obtaining after optimization the optimal global attribute editing network, comprising an optimal improved graph convolutional network and an optimal improved fully connected network;
4) face property editing
In the model testing stage, the global attribute editing direction obtained by the output of the optimal global attribute editing network acts on a given face image in a hidden space, and the multi-attribute edited face image which has an excellent attribute editing effect and can keep identity information is obtained.
2. The face multi-attribute editing method based on a global attribute editing direction of claim 1, wherein step 1) comprises the following steps:
a. obtaining the dataset attribute association graph:
the CelebA-HQ face dataset contains 30000 face images in total, each labeled with 40 attributes; a label value of 1 for an attribute means the face image has that attribute, and -1 means it does not; for a single face image, the image attribute association graph A_i is essentially an adjacency matrix; first initialize A_i to all zeros; if the labels on the j-th and k-th attributes of the face image are both 1, set the corresponding positions of A_i to 1, i.e., change A_i[j][k] and A_i[k][j] to 1; traverse all combinations of j and k, then set the main diagonal of the adjacency matrix to 1; the resulting adjacency matrix is the image attribute association graph A_i; the dataset attribute association graph \bar{A} is obtained by summing the 30000 image attribute association graphs and normalizing:

A = \sum_{i=1}^{n} A_i

\bar{A} = D^{-1} A

where A_i is the attribute association graph of one face image; n is the number of face images in the CelebA-HQ face dataset, namely 30000; A is the matrix obtained by summing the attribute association graphs of the 30000 images; D^{-1} is the inverse of the degree matrix of the summed matrix A; and \bar{A} is the normalized matrix, i.e., the dataset attribute association graph to be finally obtained;
b. attribute semantic embedding set:
first initialize a set of attribute description texts, each an English string describing the corresponding attribute; passing these strings through a pre-trained CLIP text encoder produces the corresponding attribute semantic embeddings, which together form the attribute semantic embedding set:

E = [e_1, e_2, \ldots, e_M]^T

where E is the attribute semantic embedding set, a vector of length M; M is the number of attributes and also the length of the vector; T denotes transposition; and e_1, e_2, ..., e_M are the semantic embeddings of the first, second, ..., M-th attributes;
c. scale factor:
randomly select two face images from the CelebA-HQ face dataset as the original image I_o and the target image I_t, and pass them through the pre-trained latent encoder to obtain the original latent code w_o and the target latent code w_t, the mappings of I_o and I_t in the latent space; passing w_o and w_t through the pre-trained latent attribute classifier yields the original attribute score S_o and the target attribute score S_t:

S_o = C(w_o) = [s_1^o, s_2^o, \ldots, s_M^o]^T

S_t = C(w_t) = [s_1^t, s_2^t, \ldots, s_M^t]^T

where w_o and w_t are the original and target latent codes; C is the pre-trained latent attribute classifier; S_o and S_t are vectors of length M, with M the number of attributes (and the length of the attribute score vectors) and T the transpose; and s_i^o and s_i^t denote the values of S_o and S_t at the i-th position;
having obtained S_o and S_t, the scale factor can be calculated as:

\alpha_i = \begin{cases} s_i^t - s_i^o, & (s_i^t - 0.5)(s_i^o - 0.5) < 0 \\ 0, & \text{otherwise} \end{cases}

\text{Alpha} = [\alpha_1, \alpha_2, \ldots, \alpha_M]^T

where s_i^t and s_i^o are the values of S_t and S_o at the i-th position; Alpha is the scale factor; \alpha_i is its value at the i-th position; M is the number of attributes; and T denotes the transpose of the vector.
3. The face multi-attribute editing method based on a global attribute editing direction of claim 1, wherein step 2) comprises the following steps:
constructing the global attribute editing network, whose inputs are the dataset attribute association graph, the attribute semantic embedding set, and the scale factor, and whose output is the global attribute editing direction, the network consisting of an improved graph convolutional network and an improved fully connected network, as follows:
a. improved graph convolutional network:
constructing an improved graph convolutional network comprising two network modules, denoted block1 and block2; the input of block1 is the dataset attribute association graph and the attribute semantic embedding set, and its output is an intermediate variable; the input of block2 is the intermediate variable output by block1 together with the dataset attribute association graph, and its output is the initial attribute editing direction; the whole process is expressed as:

N_{init} = \sigma(\bar{A} \, \sigma(\bar{A} E W_1) \, W_2)

where N_init is the initial attribute editing direction output by the improved graph convolutional network; \bar{A} is the dataset attribute association graph; E is the attribute semantic embedding set; W_1 and W_2 are the weight parameters to be learned in the two blocks; juxtaposition denotes matrix (dot) multiplication; and \sigma(\cdot) is a nonlinear activation function;
b. improved fully connected network:
constructing an improved fully connected network comprising two fully connected layers, denoted Linear1 and Linear2; the input of Linear1 is the initial attribute editing direction scaled by the scale factor, with a Leaky-ReLU activation function; the output of Linear1 is the input of Linear2, which has no activation function and finally outputs the global attribute editing direction; the whole process is expressed as:

N_g = F(N_{init} \times \text{Alpha})

where N_g is the global attribute editing direction output by the improved fully connected network; F is the improved fully connected network; N_init is the initial attribute editing direction, with dimensions [40, 512]; Alpha is the scale factor, with dimensions [40, 1]; and \times is the product operation, which multiplies each 512-dimensional vector in N_init by the corresponding value in Alpha.
4. The face multi-attribute editing method based on a global attribute editing direction of claim 1, wherein in step 3), to edit the target attributes well while retaining identity information, three objective loss functions are designed to optimize the constructed global attribute editing network; after optimization, the optimal global attribute editing network is obtained, comprising an optimal improved graph convolutional network and an optimal improved fully connected network; the optimization uses a gradient descent algorithm to adjust the weight parameters of the constructed network so that the values of the three objective functions become as small as possible; the three objective loss functions are as follows:
a. multi-attribute edit loss
Existing original hidden variable w o And a target hidden variable w t Through the hidden variable attribute classifier, the original attribute score S can be obtained o And a target attribute score S t Adding the obtained global property editing direction to the original hidden variable w o The edited hidden variable can be obtained, and then the attribute score of the edited hidden variable can be obtained by passing the edited hidden variable through a pre-trained hidden variable attribute classifier, wherein the process is represented as follows:
S e =C(w o +N g )
in the formula, S e Representing edited attribute scores, C representing hidden variable attribute classifier, w o Representing original hidden variables, N g The global attribute editing direction output by the global attribute editing network is represented;
To make the edited attribute score S_e as close as possible to the target attribute score S_t, the multi-attribute editing loss is designed, specifically expressed as:

L_mae = -Σ_i 1[s_t^(i) ≠ s_o^(i)] · ( s_t^(i) · log s_e^(i) + (1 - s_t^(i)) · log(1 - s_e^(i)) )

where L_mae denotes the multi-attribute editing loss, s_t^(i) denotes the value of the target attribute score at position i, s_e^(i) denotes the value of the edited attribute score at position i, s_o^(i) denotes the value of the original attribute score at position i, and log denotes the logarithm operation;
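A minimal PyTorch sketch of this editing loss, assuming the binary cross-entropy form above restricted to the attributes whose target score differs from the original score; the function name and the eps stabilizer are illustrative assumptions.

    import torch

    def multi_attribute_edit_loss(s_t, s_e, s_o, eps=1e-8):
        # s_t, s_e, s_o: [40] target / edited / original attribute scores in (0, 1)
        edit_mask = (s_t != s_o).float()  # 1 where the attribute is being edited
        bce = -(s_t * torch.log(s_e + eps) + (1.0 - s_t) * torch.log(1.0 - s_e + eps))
        return (edit_mask * bce).sum()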
b. multi-attribute retention loss
To ensure that the attributes not being edited do not change, the multi-attribute retention loss is designed, specifically expressed as:

L_map = Σ_i 1[s_t^(i) = s_o^(i)] · ||s_e^(i) - s_o^(i)||_2

where L_map denotes the multi-attribute retention loss, s_t^(i) denotes the value of the target attribute score at position i, s_e^(i) denotes the value of the edited attribute score at position i, s_o^(i) denotes the value of the original attribute score at position i, and ||·||_2 denotes the ℓ2-norm;
c. spatial retention loss
To prevent the original hidden variable from being changed excessively during editing, the spatial retention loss is designed, specifically expressed as:

L_sp = ||N_g||_2

where L_sp denotes the spatial retention loss, N_g denotes the global attribute editing direction, and ||·||_2 denotes the ℓ2-norm.
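Sketches of the two retention terms follow, under the same assumptions as the editing-loss sketch; the coefficients combining the three losses into one objective are hyperparameters not specified in the claim.

    import torch

    def multi_attribute_retention_loss(s_t, s_e, s_o):
        # l2 penalty on attributes that are NOT edited (target == original)
        keep_mask = (s_t == s_o).float()
        return torch.norm(keep_mask * (s_e - s_o), p=2)

    def spatial_retention_loss(n_g):
        # penalizes large editing directions so the original hidden
        # variable is not changed excessively
        return torch.norm(n_g, p=2)

    # total objective (lambda_map, lambda_sp are assumed hyperparameters):
    # loss = L_mae + lambda_map * L_map + lambda_sp * L_sp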
5. The method for editing human face multiple attributes based on global attribute editing direction according to claim 1, wherein: in step 4), in the model testing stage, the global attribute editing direction output by the optimal global attribute editing network is applied to a given face image in the hidden space, obtaining a multi-attribute-edited face image that achieves a strong attribute editing effect while preserving identity information; this comprises the following steps:
4.1) In the testing stage, the dataset attribute association graph and the attribute semantic embedding set are passed through the optimal improved graph convolutional neural network to obtain the initial attribute editing direction, expressed as:

N_init = M_GCN(Â, E)

where N_init denotes the initial attribute editing direction, M_GCN denotes the optimal improved graph convolutional neural network, Â denotes the dataset attribute association graph, and E denotes the attribute semantic embedding set;
4.2) The user-defined scale factor and the initial attribute editing direction are input into the optimal improved fully-connected network, which outputs the global attribute editing direction, expressed as:

N_g = M_F(N_init × Alpha_test)

where N_g denotes the global attribute editing direction, M_F denotes the optimal improved fully-connected network, N_init denotes the initial attribute editing direction, and Alpha_test denotes the user-defined scale factor;
4.3) For a given original image, the original image is passed through the pre-trained hidden-variable encoder to obtain the original hidden variable; the global attribute editing direction is then applied to the original hidden variable to obtain the edited hidden variable; finally, the edited hidden variable is fed into the pre-trained decoder, which outputs the final multi-attribute-edited face image. This process is expressed as:

I_e = G(w_o + N_g)

where I_e denotes the multi-attribute-edited face image, G denotes the pre-trained decoder, w_o denotes the original hidden variable, and N_g denotes the global attribute editing direction.
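Putting the test stage together, a minimal inference sketch under the assumptions above might read as follows; encoder and decoder stand for the pre-trained hidden-variable encoder and decoder, and summing the [40, 512] direction over the attribute axis before adding it to w_o is an assumption, since the claim does not spell out how N_g is applied to the hidden variable.

    import torch

    @torch.no_grad()
    def edit_face(image, encoder, gcn, fc, decoder, A_hat, E, alpha_test):
        n_init = gcn(A_hat, E)        # 4.1) initial editing direction, [40, 512]
        n_g = fc(n_init, alpha_test)  # 4.2) global editing direction, [40, 512]
        w_o = encoder(image)          # original hidden variable
        w_e = w_o + n_g.sum(dim=0)    # 4.3) apply direction in hidden space (assumed sum)
        return decoder(w_e)           # multi-attribute edited face image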
CN202210628783.2A 2022-06-06 2022-06-06 Human face multi-attribute editing method based on global attribute editing direction Pending CN115082292A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210628783.2A CN115082292A (en) 2022-06-06 2022-06-06 Human face multi-attribute editing method based on global attribute editing direction


Publications (1)

Publication Number Publication Date
CN115082292A true CN115082292A (en) 2022-09-20

Family

ID=83249604

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210628783.2A Pending CN115082292A (en) 2022-06-06 2022-06-06 Human face multi-attribute editing method based on global attribute editing direction

Country Status (1)

Country Link
CN (1) CN115082292A (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140160134A1 (en) * 2011-08-31 2014-06-12 Timur Nuruahitovich Bekmambetov Visualization of a natural language text
US20130108175A1 (en) * 2011-10-28 2013-05-02 Raymond William Ptucha Image Recomposition From Face Detection And Facial Features
CN111368662A (en) * 2020-02-25 2020-07-03 华南理工大学 Method, device, storage medium and equipment for editing attribute of face image
CN111914617A (en) * 2020-06-10 2020-11-10 华南理工大学 Face attribute editing method based on balanced stack type generation countermeasure network
CN111932444A (en) * 2020-07-16 2020-11-13 中国石油大学(华东) Face attribute editing method based on generation countermeasure network and information processing terminal
CN112734873A (en) * 2020-12-31 2021-04-30 北京深尚科技有限公司 Image attribute editing method, device, equipment and medium for resisting generation network
CN113963409A (en) * 2021-10-25 2022-01-21 百果园技术(新加坡)有限公司 Training of face attribute editing model and face attribute editing method
CN114240736A (en) * 2021-12-06 2022-03-25 中国科学院沈阳自动化研究所 Method for simultaneously generating and editing any human face attribute based on VAE and cGAN

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
随海亮; 马军山; 李丽莹: "Research on facial expression synthesis based on generative adversarial networks and FACS", 软件导刊 (Software Guide), no. 06, 15 June 2020 (2020-06-15) *
黄韬; 贾西平; 林智勇; 马震远: "Text-guided person image editing method based on generative adversarial networks", 广东技术师范大学学报 (Journal of Guangdong Polytechnic Normal University), no. 03, 25 June 2020 (2020-06-25) *

Similar Documents

Publication Publication Date Title
CN109447906B (en) Picture synthesis method based on generation countermeasure network
CN111814706B (en) Face recognition and attribute classification method based on multitask convolutional neural network
Hou et al. Improving variational autoencoder with deep feature consistent and generative adversarial training
CN107798349B (en) Transfer learning method based on depth sparse self-coding machine
CN109472024A (en) A kind of file classification method based on bidirectional circulating attention neural network
CN108108677A (en) One kind is based on improved CNN facial expression recognizing methods
CN108830287A (en) The Chinese image, semantic of Inception network integration multilayer GRU based on residual error connection describes method
CN111127146B (en) Information recommendation method and system based on convolutional neural network and noise reduction self-encoder
CN109815826A (en) The generation method and device of face character model
CN110188794B (en) Deep learning model training method, device, equipment and storage medium
Xu et al. (Retracted) Method of generating face image based on text description of generating adversarial network
CN117521672A (en) Method for generating continuous pictures by long text based on diffusion model
CN115393933A (en) Video face emotion recognition method based on frame attention mechanism
CN110276396A (en) Picture based on object conspicuousness and cross-module state fusion feature describes generation method
Qu et al. Perceptual-DualGAN: perceptual losses for image to image translation with generative adversarial nets
Da et al. Brain CT image classification with deep neural networks
CN117522697A (en) Face image generation method, face image generation system and model training method
CN116543289B (en) Image description method based on encoder-decoder and Bi-LSTM attention model
Sawant et al. Text to image generation using GAN
CN115082292A (en) Human face multi-attribute editing method based on global attribute editing direction
CN116311472A (en) Micro-expression recognition method and device based on multi-level graph convolution network
Fang et al. Facial makeup transfer with GAN for different aging faces
CN116503499A (en) Sketch drawing generation method and system based on cyclic generation countermeasure network
Yauri-Lozano et al. Generative Adversarial Networks for text-to-face synthesis & generation: A quantitative–qualitative analysis of Natural Language Processing encoders for Spanish
CN113947520A (en) Method for realizing face makeup conversion based on generation of confrontation network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination