CN115082292A - Human face multi-attribute editing method based on global attribute editing direction - Google Patents
- Publication number
- CN115082292A (application CN202210628783.2A)
- Authority
- CN
- China
- Prior art keywords: attribute, editing, representing, global, network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
The invention discloses a face multi-attribute editing method based on a global attribute editing direction, comprising the following steps: 1) acquire the data set attribute association graph, the attribute semantic embedding set, and the scale factor; 2) construct a global attribute editing network whose inputs are the data set attribute association graph, the attribute semantic embedding set, and the scale factor, and whose output is the global attribute editing direction; 3) design three target loss functions to optimize the constructed network and save the optimized network as a model; 4) in the model testing stage, perform multi-attribute editing on a face using a user-defined scale factor and the saved model. The method edits face attributes along a single global attribute editing direction, avoiding the repeated single-attribute edits that other methods require, and produces more natural and reasonable multi-attribute editing results with better preservation of facial appearance characteristics.
Description
Technical Field
The invention relates to the technical field of editing hidden-space attributes along editing directions, and in particular to a face multi-attribute editing method based on a global attribute editing direction, which edits multiple attributes of a given real face image to obtain an edited face image with an excellent attribute editing effect and well-preserved facial appearance characteristics.
Background
Face attribute editing covers both coarse-grained edits, such as face aging and face transformation, and fine-grained edits, such as modifying facial expression or hair color. The face editing task therefore plays an important role in daily life and practical applications, and research on it has attracted wide attention from academia and industry in recent years.
Most existing work focuses on single-attribute face editing; research on multi-attribute face editing is scarce. Although existing single-attribute editing methods can reach a multi-attribute result by applying several single-attribute edits in sequence, editing a face attribute repeatedly in this way may push the edited face beyond the editing-space boundary in the hidden space, so that the multi-attribute result suffers from severe artifacts or ghost faces. Meanwhile, some single-attribute editing methods change irrelevant attributes when the face is edited repeatedly, losing the identity information of the face, or fail to edit the intended attributes reasonably because of conflicts between the edited attributes.
Disclosure of Invention
The invention aims to overcome the defects and shortcomings of the prior art by providing a face multi-attribute editing method based on a global attribute editing direction. The method learns the global attribute editing direction with a global attribute editing network, removing the need of single-attribute editing methods to apply one edit per attribute when editing multiple attributes, and at the same time achieves a better multi-attribute editing effect and better preservation of facial appearance characteristics.
In order to achieve the purpose, the technical scheme provided by the invention is as follows: the face multi-attribute editing method based on the global attribute editing direction comprises the following steps:
1) acquiring a data set attribute association diagram, an attribute semantic embedded set and a scale factor;
calculating a data set attribute association diagram according to attribute labels of each face in the CelebA-HQ face data set; inputting a set of required attribute description texts into a pre-trained CLIP text encoder, and outputting to obtain an attribute semantic embedded set; selecting an original face image and a target face image from CelebA-HQ face data set, subjecting the original face image and the target face image to a pre-trained hidden variable encoder to obtain an original hidden variable and a target hidden variable, subjecting the original hidden variable and the target hidden variable to a pre-trained hidden variable attribute classifier respectively to obtain an original attribute score and a target attribute score, and calculating the difference between the original attribute score and the target attribute score to obtain a scale factor;
2) model construction
Constructing a global attribute editing network whose inputs are the data set attribute association graph, the attribute semantic embedding set, and the scale factor, and whose output is the global attribute editing direction. The network consists of an improved graph convolutional neural network and an improved fully connected network. The improved graph convolutional neural network takes the data set attribute association graph and the attribute semantic embedding set as input and outputs the initial attribute editing direction, i.e. an initialized editing direction for every attribute. The improved fully connected network takes the initial attribute editing direction and the scale factor as input and outputs the global attribute editing direction; the scale factor adjusts the editing strength of each initialized attribute editing direction, yielding a more accurate global attribute editing direction;
3) model optimization
In order to better edit the attributes and retain the identity information, three target loss functions are designed to optimize the constructed global attribute editing network, and after optimization, an optimal global attribute editing network is obtained and comprises an optimal improved graph convolution neural network and an optimal improved full-connection network;
4) face property editing
In the model testing stage, the global attribute editing direction output by the optimal global attribute editing network is acted on a given face image in a hidden space, and the multi-attribute edited face image which is excellent in attribute editing effect and can keep identity information is obtained.
Further, the step 1) comprises the following steps:
a. obtaining dataset attribute association graphs
The CelebA-HQ face data set comprises 30000 face images in total, and each face image is labeled with 40 attributes; a label value of 1 on an attribute indicates that the face image has that attribute, and -1 indicates that it does not. For one face image, the attribute association graph A_i, which is essentially an adjacency matrix, is obtained as follows. First, A_i is initialized to all zeros; if the labels on the j-th and k-th attributes of the face image are both 1, the corresponding positions of A_i are set to 1, i.e. A_i[j][k] and A_i[k][j] become 1. All combinations of j and k are traversed, and the main diagonal of the adjacency matrix is set to 1. The resulting adjacency matrix is the attribute association graph A_i of the face image. The data set attribute association graph Â is obtained by summing and normalizing the 30000 per-image attribute association graphs:

Â = D^{-1}·A, where A = Σ_{i=1}^{n} A_i

where A_i denotes the attribute association graph of one face image, n denotes the number of face images in the CelebA-HQ face data set (30000), A denotes the matrix obtained by summing the attribute association graphs of the 30000 images, D^{-1} denotes the inverse of the degree matrix of the summed matrix A, and Â denotes the normalized matrix, i.e. the final data set attribute association graph;
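The construction above can be sketched in numpy; the function names are ours, not the patent's, and a tiny 3-attribute example stands in for the 40 CelebA-HQ attributes:

```python
import numpy as np

def image_attribute_graph(labels):
    """Per-image attribute association graph A_i from a vector of labels in {-1, 1}.

    A_i[j][k] = A_i[k][j] = 1 when attributes j and k are both present;
    the main diagonal is set to 1 for every attribute.
    """
    labels = np.asarray(labels)
    M = len(labels)
    A_i = np.zeros((M, M))
    present = np.flatnonzero(labels == 1)
    for j in present:
        for k in present:
            A_i[j, k] = 1.0          # covers both j != k and j == k
    np.fill_diagonal(A_i, 1.0)       # diagonal is 1 even for absent attributes
    return A_i

def dataset_attribute_graph(label_matrix):
    """Sum the per-image graphs, then normalize with the inverse degree matrix: D^-1 A."""
    A = sum(image_attribute_graph(l) for l in label_matrix)
    D_inv = np.diag(1.0 / A.sum(axis=1))
    return D_inv @ A
```

With the inverse-degree normalization, every row of the resulting graph sums to 1, so each entry can be read as a co-occurrence weight.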
b. attribute semantic embedding set
First, a set of required attribute description texts is initialized; each attribute description text is an English character string describing the corresponding attribute. Passing the English character strings through a pre-trained CLIP text encoder yields the corresponding attribute semantic embeddings, which together form the attribute semantic embedding set, expressed as:

E = [e_1, e_2, ..., e_M]^T

where E denotes the attribute semantic embedding set, M denotes the number of attributes, T denotes the transposition of the vector, e_1 denotes the semantic embedding of the first attribute, e_2 that of the second attribute, and e_M that of the M-th attribute;
c. scaling factor
Two face images are randomly selected from the CelebA-HQ face data set as the original image I_o and the target image I_t. Passing I_o and I_t through the pre-trained hidden variable encoder gives the original hidden variable w_o and the target hidden variable w_t, which are the mappings of I_o and I_t in the hidden space. Passing w_o and w_t through the pre-trained hidden variable attribute classifier gives the original attribute score S_o and the target attribute score S_t, expressed as:

S_o = C(w_o) = [s_1^o, s_2^o, ..., s_M^o]^T
S_t = C(w_t) = [s_1^t, s_2^t, ..., s_M^t]^T

where w_o and w_t denote the original and target hidden variables, C denotes the pre-trained hidden variable attribute classifier, S_o denotes the original attribute score and S_t the target attribute score; S_o and S_t are vectors of length M, where M denotes the number of attributes and also the length of the attribute score vector, T denotes the transpose of the vector, and s_i^o and s_i^t denote the values of S_o and S_t at the i-th position.

After obtaining the original attribute score S_o and the target attribute score S_t, the scale factor can be calculated, expressed as:

α_i = s_i^t - s_i^o, i = 1, ..., M
Alpha = [α_1, α_2, ..., α_M]^T

where s_i^t denotes the value of the target attribute score at the i-th position, s_i^o the value of the original attribute score at the i-th position, Alpha denotes the scale factor, α_i the value of the scale factor at the i-th position, M the number of attributes, and T the transpose of the vector.
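A minimal sketch of the score-and-difference computation: the fixed random linear layer below is a hypothetical stand-in for the pre-trained hidden variable attribute classifier C, used only to make the example runnable.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical stand-in for the pre-trained classifier C: a fixed linear map
# followed by sigmoid, producing M = 40 attribute scores in [0, 1].
rng = np.random.default_rng(0)
W_cls = rng.standard_normal((512, 40)) * 0.01

def attribute_scores(w):
    """C(w): map a (flattened) hidden variable to 40 scores in [0, 1]."""
    return sigmoid(w @ W_cls)

def scale_factor(s_o, s_t):
    """alpha_i = s_i^t - s_i^o: signed per-attribute editing strength."""
    return s_t - s_o
```

A positive alpha_i asks the edit to strengthen attribute i, a negative one to weaken it, and a value near zero leaves it untouched.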
Further, the step 2) comprises the following steps:
constructing a global attribute editing network, wherein the input of the network is a data set attribute association diagram, an attribute semantic embedded set and a scale factor, the output is a global attribute editing direction, and the global attribute editing network consists of an improved graph convolution neural network and an improved full-connection network, and is specifically represented as follows:
a. improved graph convolution neural network
Construct an improved graph convolutional neural network comprising two network modules, denoted block1 and block2. The input of block1 is the obtained data set attribute association graph and the attribute semantic embedding set, and its output is an intermediate variable; the input of block2 is the intermediate variable output by block1 together with the data set attribute association graph, and its output is the initial attribute editing direction. The whole process is represented as:

N_init = σ(Â · σ(Â · E · W_1) · W_2)

where N_init denotes the initial attribute editing direction output by the improved graph convolutional neural network, Â denotes the data set attribute association graph, E denotes the attribute semantic embedding set, W_1 and W_2 denote the weight parameters to be learned in the improved graph convolutional neural network, · denotes the matrix product, and σ(·) denotes a nonlinear activation function;
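Under this two-block reading, the forward pass can be sketched as follows; the Leaky-ReLU activation and the per-block weight split are assumptions, since the patent only names a generic nonlinearity σ(·):

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gcn_block(A_hat, X, W):
    """One graph-convolution block: sigma(A_hat @ X @ W)."""
    return leaky_relu(A_hat @ X @ W)

def initial_edit_directions(A_hat, E, W1, W2):
    """block1 then block2: association graph + semantic embeddings
    -> one initial editing direction per attribute (rows of the output)."""
    H = gcn_block(A_hat, E, W1)      # block1: intermediate variable
    return gcn_block(A_hat, H, W2)   # block2: initial attribute editing directions
```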
b. improved fully connected network
Construct an improved fully connected network comprising two fully connected layers, denoted Linear1 and Linear2. The input of Linear1 is the initial attribute editing direction scaled by the scale factor, and its activation function is the Leaky-ReLU function; the output of Linear1 is the input of Linear2, which has no activation function and outputs the final global attribute editing direction. The whole process is represented as:

N_g = F(N_init × Alpha)

where N_g denotes the global attribute editing direction output by the improved fully connected network, F denotes the improved fully connected network, N_init denotes the initial attribute editing direction with dimension [40,512], Alpha denotes the scale factor with dimension [40,1], and × denotes the product operation; specifically, each 512-dimensional vector in N_init is multiplied by the corresponding value in Alpha.
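The product N_init × Alpha described above is a row-wise broadcast; a minimal sketch (function name is ours):

```python
import numpy as np

def scale_directions(N_init, alpha):
    """Multiply each attribute's direction (a row of N_init, shape [M, 512])
    by that attribute's scalar in alpha (shape [M] or [M, 1])."""
    alpha = np.reshape(alpha, (-1, 1))   # [M, 1] broadcasts over the 512 dims
    return N_init * alpha
```

Rows whose scale factor is zero are suppressed entirely, which is how unedited attributes contribute nothing to the global direction.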
Further, in step 3), in order to edit the attributes well while retaining identity information, three objective loss functions are designed to optimize the constructed global attribute editing network; after optimization, the optimal global attribute editing network is obtained, comprising the optimal improved graph convolutional neural network and the optimal improved fully connected network. The optimization is a gradient descent algorithm applied to the weight parameters of the constructed network, driving the values of the three objective functions as low as possible. The three objective loss functions are expressed as follows:
a. multi-attribute edit loss
Given the original hidden variable w_o and the target hidden variable w_t, the hidden variable attribute classifier yields the original attribute score S_o and the target attribute score S_t. Adding the obtained global attribute editing direction to the original hidden variable w_o gives the edited hidden variable, and passing the edited hidden variable through the pre-trained hidden variable attribute classifier gives the edited attribute score:

S_e = C(w_o + N_g)

where S_e denotes the edited attribute score, C the hidden variable attribute classifier, w_o the original hidden variable, and N_g the global attribute editing direction output by the global attribute editing network;

To make the edited attribute score S_e as close as possible to the target attribute score S_t, the multi-attribute editing loss is designed. With cond_i denoting the condition that one of s_i^t and s_i^o is greater than 0.5 and the other is less than 0.5 (i.e. attribute i is being edited), it is expressed as:

L_mae = -Σ_{i=1}^{M} 1[cond_i] · ( s_i^t · log s_i^e + (1 - s_i^t) · log(1 - s_i^e) )

where L_mae denotes the multi-attribute editing loss, s_i^t the value of the target attribute score at position i, s_i^e the value of the edited attribute score at position i, s_i^o the value of the original attribute score at position i, and log the logarithm operation;
b. multi-attribute retention loss
In order to ensure that attributes that are not edited do not change, the multi-attribute retention loss is designed. With cond_i as above, it is expressed as:

L_map = Σ_{i=1}^{M} 1[not cond_i] · || s_i^e - s_i^o ||_2

where L_map denotes the multi-attribute retention loss, s_i^t the value of the target attribute score at position i, s_i^e the value of the edited attribute score at position i, s_i^o the value of the original attribute score at position i, and ||·||_2 the l_2-norm;
c. space conservation loss
In order to prevent the original hidden variables from being excessively changed during editing, the space conservation loss is designed, and is specifically expressed as follows:
L_sp = ||N_g||_2

where L_sp denotes the space conservation loss, N_g the global attribute editing direction, and ||·||_2 the l_2-norm.
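Reading the first two losses as masked by whether an attribute is being edited (scores straddling 0.5), the three losses can be sketched in numpy; the masks and function names are our reconstruction, since the patent only lists the symbols each loss uses:

```python
import numpy as np

def multi_attribute_edit_loss(s_t, s_e, s_o, eps=1e-8):
    """Cross-entropy pulling edited scores toward the targets, on the
    attributes being edited (target and original scores disagree at 0.5)."""
    edited = (s_t > 0.5) != (s_o > 0.5)
    bce = -(s_t * np.log(s_e + eps) + (1 - s_t) * np.log(1 - s_e + eps))
    return float(np.sum(bce[edited]))

def multi_attribute_retention_loss(s_t, s_e, s_o):
    """l2 penalty keeping the unedited attributes at their original scores."""
    kept = (s_t > 0.5) == (s_o > 0.5)
    return float(np.linalg.norm((s_e - s_o)[kept], ord=2))

def space_conservation_loss(N_g):
    """L_sp = ||N_g||_2: keeps the editing direction small in hidden space."""
    return float(np.linalg.norm(np.ravel(N_g), ord=2))
```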
Further, in step 4), in the model testing stage, the global attribute editing direction output by the optimal global attribute editing network is applied to a given face image in the hidden space to obtain a multi-attribute edited face image which has an excellent attribute editing effect and can maintain identity information, including the following steps:
4.1) in the testing stage, the data set attribute association graph and the attribute semantic embedding set are passed through the optimal improved graph convolutional neural network to obtain the initial attribute editing direction, expressed as:

N_init = M_GCN(Â, E)

where N_init denotes the initial attribute editing direction, M_GCN the optimal improved graph convolutional neural network, Â the data set attribute association graph, and E the attribute semantic embedding set;
4.2) inputting the scale factor customized by the user and the initial attribute editing direction into the optimal improved fully-connected network, and outputting to obtain the global attribute editing direction, wherein the process is represented as follows:
N_g = M_F(N_init × Alpha_test)

where N_g denotes the global attribute editing direction, M_F the optimal improved fully connected network, N_init the initial attribute editing direction, and Alpha_test the user-defined scale factor;
4.3) for a given original image, the original image is subjected to a pre-trained hidden variable encoder to obtain an original hidden variable, then the global attribute editing direction is acted on the original hidden variable to obtain an edited hidden variable, finally the edited hidden variable is sent to a pre-trained decoder, and a final multi-attribute edited face image is obtained through output, wherein the process is represented as follows:
I e =G(w o +N g )
where I_e denotes the multi-attribute edited face image, G a pre-trained decoder, w_o the original hidden variable, and N_g the global attribute editing direction.
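The test-stage steps 4.1) to 4.3) chain together as follows; `gcn` and `fc` below are placeholders standing in for the optimal trained networks, and the flat 1-D hidden variable is a simplification of the [1,18,512] layout described in the embodiment:

```python
import numpy as np

def edit_face(w_o, A_hat, E, alpha_test, gcn, fc):
    """Test-stage pipeline: initial directions -> user-scaled global direction
    -> shifted hidden variable, ready for the pre-trained decoder G."""
    N_init = gcn(A_hat, E)                    # 4.1) initial directions, one row per attribute
    N_g = fc(N_init * alpha_test[:, None])    # 4.2) scale per attribute, then fully connected net
    return w_o + N_g                          # 4.3) edited hidden variable; decode with G(...)
```

Because the user chooses alpha_test freely, the same stored model edits any subset of attributes in one pass.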
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. By learning a single global attribute editing direction with a deep network, the invention removes the need of existing single-attribute editing methods to edit the face repeatedly when performing multi-attribute editing, making editing simpler.
2. Compared with other attribute editing methods, the inference time is shorter, so multi-attribute editing of a single face is faster.
3. Compared with most other attribute editing methods, more attributes of the face can be edited.
4. When editing a group of face attributes, the method generates a more reasonable and natural editing effect and preserves facial appearance characteristics well.
Drawings
FIG. 1 is a logic flow diagram of the method of the present invention.
FIG. 2 is a graph of the correlation of the attributes of the data sets obtained by the present invention.
FIG. 3 is a schematic diagram of the scale factor obtained by the present invention.
FIG. 4 is a schematic diagram of an improved graph-convolution neural network constructed in accordance with the present invention.
Fig. 5 is a schematic diagram of an improved fully-connected network constructed in accordance with the present invention.
Fig. 6 is a schematic view of a multi-attribute editing process of a human face according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
As shown in fig. 1, the present embodiment provides a human face multi-attribute editing method based on a global attribute editing direction, which includes the following steps:
1) and acquiring a data set attribute association diagram, an attribute semantic embedded set and a scale factor.
The process of obtaining the attribute association graph of the data set is shown in fig. 2 and is described as follows: first, the CelebA-HQ data set is downloaded from the Internet; it comprises 30000 faces, each face image at a resolution of 1024 × 1024, as shown in the leftmost picture in fig. 2. Each picture is labeled with the 40 standard CelebA attributes, namely ["5 o'clock shadow", "arched eyebrows", "attractive", "bags under eyes", "bald", "bangs", "big lips", "big nose", "black hair", "blond hair", "blurry", "brown hair", "bushy eyebrows", "chubby", "double chin", "eyeglasses", "goatee", "gray hair", "heavy makeup", "high cheekbones", "male", "mouth slightly open", "mustache", "narrow eyes", "no beard", "oval face", "pale skin", "pointy nose", "receding hairline", "rosy cheeks", "sideburns", "smiling", "straight hair", "wavy hair", "wearing earrings", "wearing hat", "wearing lipstick", "wearing necklace", "wearing necktie", "young"]. The label value of each attribute is 1 or -1: a value of 1 means the face image has the attribute, and -1 means it does not. For a given face image, the attribute association graph A_i is computed by initializing an all-zero adjacency matrix A_i of dimension [40,40]; if the labels on the j-th and k-th attributes of the face image are both 1, the corresponding positions in A_i are set to 1, i.e. A_i[j][k] and A_i[k][j] become 1. All combinations of j and k are traversed, with j and k ranging over [0,39], and the main diagonal of A_i is set to 1, i.e. A_i[j][j] = 1. The resulting adjacency matrix is the face image attribute association graph A_i, and each per-image attribute association graph is a symmetric matrix. The middle part of fig. 2 represents the attribute association graphs of the 30000 face images, i.e. 30000 matrices of dimension [40,40]. The data set attribute association graph Â is obtained by summing and normalizing the 30000 per-image graphs:

Â = D^{-1}·A, where A = Σ_{i=1}^{n} A_i

where A_i denotes the attribute association graph of the i-th face image, n denotes the number of face images in the CelebA-HQ face data set (30000), A denotes the matrix obtained by summing the attribute association graphs of the 30000 images, with dimension [40,40], and D^{-1} denotes the inverse of the degree matrix of the summed matrix A; the degree matrix is derived by summing each row of A and placing the results on the main diagonal, with all off-diagonal values set to 0. Â denotes the matrix obtained by normalizing the summed matrix, with each value in the range [0,1]; Â is the final data set attribute association graph;
the specific process for obtaining attribute semantic embedding is as follows: first, a set of 40 attribute description texts is initialized, each attribute description text is an english character string describing a corresponding attribute, for example, the description of the smile attribute is 'smiling'. The 40 English character strings are embedded with 40 corresponding attribute semantemes by a pre-trained CLIP text encoder, and the CLIP text encoder can be directly downloaded from the Internet. Each attribute description text outputs a semantic embedding, the dimension of each semantic embedding is [1,512], the 40 attribute semantic embeddings are attribute semantic embedding sets, the dimension is [40,512], and the semantic embeddings can be expressed as:
E=[e 1 ,e 2 ,...,e M ] T
in the formula, E represents attribute semantic embedded set, E is a vector with the length of M, M represents the number of attributes and the value of 40, and also represents the length of the vector, T represents transposition of the vector, E 1 Semantic Embedded representation representing the first Attribute, e 2 Semantic Embedded representation representing the second Attribute, e M A semantic embedded representation representing an Mth attribute;
the specific process for obtaining the scale factor is shown in fig. 3 and is described as follows: two face images are randomly selected from the CelebA-HQ face data set as the original image I_o and the target image I_t. The original image I_o and the target image I_t are passed through a pre-trained hidden variable encoder to obtain the original hidden variable w_o and the target hidden variable w_t, where w_o and w_t respectively represent the mappings of I_o and I_t in the hidden space and have dimension [1,18,512]. The hidden variables w_o and w_t are then passed through a pre-trained hidden variable attribute classifier, whose last layer uses the sigmoid(·) activation function, to obtain the original attribute score S_o and the target attribute score S_t, expressed as:
S_o = C(w_o)
S_t = C(w_t)
where w_o and w_t respectively represent the original and target hidden variables; C represents the pre-trained hidden variable attribute classifier; S_o represents the original attribute score and S_t the target attribute score; S_o and S_t are vectors of length M with dimension [1,40], and the value at each position lies in the range [0,1]; M represents the number of attributes, with value 40, and also the length of the attribute score vector; T represents the transpose of the vector; S_o^1, S_o^2, and S_o^M represent the values of the original attribute score at the first, second, and M-th positions; and S_t^1, S_t^2, and S_t^M represent the values of the target attribute score at the first, second, and M-th positions;
after obtaining the original attribute score S_o and the target attribute score S_t, the scale factor can be calculated, expressed as:
Alpha = [α_1, α_2, ..., α_M]^T
α_i = S_t^i − S_o^i, if (S_t^i − 0.5)(S_o^i − 0.5) < 0; α_i = 0, otherwise
where S_t^i represents the value of the target attribute score at the i-th position and S_o^i the value of the original attribute score at the i-th position; Alpha represents the scale factor, with dimension [1,40]; α_i represents the value of the scale factor at the i-th position, α_1 the value at the first position, α_2 the value at the second position, and α_M the value at the M-th position; M represents the number of attributes; and T represents the transpose of the vector. The condition (S_t^i − 0.5)(S_o^i − 0.5) < 0 means that one of S_t^i and S_o^i is greater than 0.5 and the other is less than 0.5; since the two values are scores of the attribute, this actually indicates that one of the original and target hidden variables possesses the attribute while the other does not.
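The scale-factor computation described above can be sketched in a few lines of NumPy. The code below is a minimal illustration only: the encoder and 40-way classifier are replaced by pre-computed score vectors, and the piecewise zeroing rule is an assumption inferred from the condition in the description (the patent's formula image is not reproduced on this page):

```python
import numpy as np

def scale_factor(s_o: np.ndarray, s_t: np.ndarray) -> np.ndarray:
    """Compute Alpha from original/target attribute scores in [0, 1].

    A position contributes only when the two scores lie on opposite
    sides of 0.5, i.e. exactly one image possesses the attribute.
    """
    flipped = (s_t - 0.5) * (s_o - 0.5) < 0   # attribute differs between images
    return np.where(flipped, s_t - s_o, 0.0)

# Toy example with M = 4 attributes instead of 40.
s_o = np.array([0.9, 0.2, 0.6, 0.4])          # original attribute scores
s_t = np.array([0.1, 0.3, 0.7, 0.8])          # target attribute scores
alpha = scale_factor(s_o, s_t)
print(alpha)                                   # only positions 0 and 3 are non-zero
```

Positions where both scores fall on the same side of 0.5 are zeroed, so the later editing direction leaves those attributes untouched.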
2) Model construction.
A global attribute editing network is constructed: the input of the network is the data set attribute association graph, the attribute semantic embedding set, and the scale factor, and the output is the global attribute editing direction; the global attribute editing network consists of an improved graph convolution neural network and an improved fully-connected network, specifically represented as follows:
a. improved graph convolution neural network
An improved graph convolution neural network is constructed, as shown in fig. 4. The improved graph convolution neural network comprises two network modules, denoted block1 and block2. The input of block1 is the obtained data set attribute association graph and the attribute semantic embedding set, and its output is an intermediate variable; the input of block2 is the intermediate variable output by block1 together with the data set attribute association graph, and its output is the initial attribute editing direction. The whole process can be simplified and expressed as:
N_init = σ(Â · E · W)
where N_init represents the initial attribute editing direction output by the improved graph convolution neural network, with dimension [40,512]; Â represents the data set attribute association graph, with dimension [40,40]; E represents the attribute semantic embedding set, with dimension [40,512]; W represents the weight parameters to be learned in the improved graph convolution neural network, with dimension [512,512]; · represents the vector dot product operation; and σ(·) represents a nonlinear activation function, specifically the Leaky-ReLU function.
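A propagation step of this form can be sketched with NumPy. The sketch below stacks two such blocks, as the block1/block2 description suggests; the random weights and the exact two-block composition are assumptions, since the patent text only gives the simplified one-line form:

```python
import numpy as np

def leaky_relu(x: np.ndarray, slope: float = 0.01) -> np.ndarray:
    return np.where(x > 0, x, slope * x)

def gcn_block(a_hat: np.ndarray, h: np.ndarray, w: np.ndarray) -> np.ndarray:
    """One graph-convolution block: sigma(A_hat @ H @ W)."""
    return leaky_relu(a_hat @ h @ w)

rng = np.random.default_rng(0)
M, D = 40, 512                        # number of attributes, embedding width
a_hat = np.eye(M)                     # stand-in normalized association graph
e = rng.standard_normal((M, D))       # stand-in attribute semantic embeddings
w1 = rng.standard_normal((D, D)) / np.sqrt(D)
w2 = rng.standard_normal((D, D)) / np.sqrt(D)

h = gcn_block(a_hat, e, w1)           # block1: intermediate variable
n_init = gcn_block(a_hat, h, w2)      # block2: initial editing directions
print(n_init.shape)                   # (40, 512)
```

Each row of `n_init` is the initialized editing direction for one attribute, mixed with its neighbors according to the association graph.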
b. Improved fully connected network
An improved fully-connected network is constructed, as shown in fig. 5. The improved fully-connected network includes two fully-connected layers, denoted Linear1 and Linear2. The input of Linear1 is the initial attribute editing direction and the scale factor, and its activation function is the Leaky-ReLU function; the output of Linear1 is the input of Linear2, which has no activation function and finally outputs the global attribute editing direction. The whole process can be represented as:
N_g = F(N_init × Alpha)
where N_g represents the global attribute editing direction output by the improved fully-connected network, with dimension [1,512]; F represents the improved fully-connected network; N_init represents the initial attribute editing direction, with dimension [40,512]; Alpha represents the scale factor, with dimension [40,1]; and × represents the product operation: multiplying each 512-dimensional vector in N_init by the corresponding value in Alpha is equivalent to scaling every value in that 512-dimensional vector by a certain factor.
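The row-wise scaling followed by the two linear layers can be sketched as follows. The layer widths and the reduction from the 40 scaled rows down to a single [1,512] vector are assumptions; the patent only fixes the input shape [40,512], the scale-factor shape [40,1], and the output shape [1,512]:

```python
import numpy as np

def leaky_relu(x: np.ndarray, slope: float = 0.01) -> np.ndarray:
    return np.where(x > 0, x, slope * x)

rng = np.random.default_rng(1)
M, D = 40, 512
n_init = rng.standard_normal((M, D))      # initial editing directions
alpha = rng.uniform(-1, 1, (M, 1))        # per-attribute editing strengths

scaled = alpha * n_init                   # row-wise scaling, still (40, 512)
x = scaled.sum(axis=0, keepdims=True)     # aggregate to (1, 512); an assumed reduction

w1 = rng.standard_normal((D, D)) / np.sqrt(D)
w2 = rng.standard_normal((D, D)) / np.sqrt(D)
h = leaky_relu(x @ w1)                    # Linear1 with Leaky-ReLU activation
n_g = h @ w2                              # Linear2, no activation
print(n_g.shape)                          # (1, 512)
```

Attributes with a zero scale factor contribute nothing to the aggregate, so only the attributes selected for editing shape the final direction.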
3) Model optimization.
In order to better edit the attributes while retaining identity information, three target loss functions are designed to optimize the constructed global attribute editing network; after optimization, the optimal global attribute editing network is obtained, comprising the optimal improved graph convolution neural network and the optimal improved fully-connected network. The optimization specifically uses a gradient descent algorithm: we optimize the weight parameters in the constructed global attribute editing network by gradient descent so that the values of the three objective functions become as small as possible. The three target loss functions are expressed as follows:
the multi-attribute editing loss is calculated as follows: given the original hidden variable w_o and the target hidden variable w_t, the original attribute score S_o and the target attribute score S_t can be obtained through the hidden variable attribute classifier. The obtained global attribute editing direction is added to the original hidden variable w_o; since the dimension of the global attribute editing direction is [1,512] and the dimension of the original hidden variable is [1,18,512], the addition is performed by copying the last-dimension vector of the global attribute editing direction 18 times to obtain dimension [1,18,512] and then adding it to the original hidden variable element-wise. This yields the edited hidden variable, which is passed through the pre-trained hidden variable attribute classifier to obtain the edited attribute score; the process can be expressed as follows:
S_e = C(w_o + N_g)
where S_e represents the edited attribute score, with dimension [1,40]; C represents the hidden variable attribute classifier; w_o represents the original hidden variable; and N_g represents the global attribute editing direction output by the improved fully-connected network;
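The replication step described above, copying the [1,512] direction across the 18 style layers before adding it to the hidden variable, can be sketched as follows (random arrays stand in for the encoder output and the learned direction):

```python
import numpy as np

rng = np.random.default_rng(2)
w_o = rng.standard_normal((1, 18, 512))    # original hidden variable
n_g = rng.standard_normal((1, 512))        # global attribute editing direction

# Copy the [1, 512] direction 18 times so it matches w_o, then add element-wise.
n_g_tiled = np.repeat(n_g[:, np.newaxis, :], 18, axis=1)   # (1, 18, 512)
w_edit = w_o + n_g_tiled

# NumPy broadcasting gives the same result without the explicit copy.
assert np.allclose(w_edit, w_o + n_g[:, np.newaxis, :])
print(w_edit.shape)                        # (1, 18, 512)
```

Applying the same direction to all 18 layers is what makes the edit "global" over the hidden representation rather than layer-specific.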
to make the edited attribute score S_e as close as possible to the target attribute score S_t, we design the multi-attribute editing loss, specifically expressed as follows:
L_mae = −Σ_{i=1}^{M} 1(S_t^i ≠ S_o^i) · [S_t^i · log S_e^i + (1 − S_t^i) · log(1 − S_e^i)]
where L_mae represents the multi-attribute editing loss; S_t^i represents the value of the target attribute score at position i; S_e^i represents the value of the edited attribute score at position i; S_o^i represents the value of the original attribute score at position i; the indicator 1(S_t^i ≠ S_o^i) restricts the sum to the attributes being edited; and log represents the logarithm operation;
the multi-attribute retention loss is calculated as follows: in order to keep the attributes that are not edited unchanged, we design the multi-attribute retention loss, specifically expressed as follows:
L_map = Σ_{i=1}^{M} 1(S_t^i = S_o^i) · ||S_e^i − S_o^i||_2
where L_map represents the multi-attribute retention loss; S_t^i represents the value of the target attribute score at position i; S_e^i represents the value of the edited attribute score at position i; S_o^i represents the value of the original attribute score at position i; the indicator 1(S_t^i = S_o^i) restricts the sum to the attributes that are not edited; and ||·||_2 represents the L2 norm;
the spatial retention loss is calculated as follows: in order to prevent the original hidden variable from being changed excessively during editing, a spatial retention loss is designed, specifically expressed as follows:
L_sp = ||N_g||_2
where L_sp represents the spatial retention loss; N_g represents the global attribute editing direction; and ||·||_2 represents the L2 norm.
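The three losses can be sketched together in NumPy. Because the patent's formula images are not reproduced on this page, the indicator-based masking and the cross-entropy form below are assumptions reconstructed from the surrounding prose, not a verbatim transcription:

```python
import numpy as np

def edit_losses(s_o, s_t, s_e, n_g, eps=1e-8):
    """Sketch of the multi-attribute editing, retention, and spatial losses."""
    edited = (s_t - 0.5) * (s_o - 0.5) < 0    # attributes being flipped
    kept = ~edited                            # attributes to leave unchanged
    # Editing loss: binary cross-entropy toward the target, on edited attributes.
    bce = -(s_t * np.log(s_e + eps) + (1 - s_t) * np.log(1 - s_e + eps))
    l_mae = np.sum(bce[edited])
    # Retention loss: edited score should stay near the original elsewhere
    # (per-scalar L2 norm reduces to an absolute value).
    l_map = np.sum(np.abs(s_e - s_o)[kept])
    # Spatial loss: keep the global editing direction short.
    l_sp = np.linalg.norm(n_g)
    return l_mae, l_map, l_sp

# Toy scores over 3 attributes; only the first attribute is edited.
s_o = np.array([0.9, 0.2, 0.6])
s_t = np.array([0.1, 0.3, 0.7])
s_e = np.array([0.2, 0.25, 0.65])
l_mae, l_map, l_sp = edit_losses(s_o, s_t, s_e, np.array([0.3, -0.4]))
print(l_mae, l_map, l_sp)
```

A weighted sum of these three terms would then be minimized by gradient descent, as the optimization step describes.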
We optimize the improved graph convolution neural network and the improved fully-connected network for ten epochs using the gradient descent algorithm, and save the resulting improved graph convolution neural network as the optimal improved graph convolution neural network and the resulting improved fully-connected network as the optimal improved fully-connected network.
4) Face attribute editing.
In the model testing stage, the global attribute editing direction output by the optimal global attribute editing network is applied, in the hidden space, to a given face image, so as to obtain a multi-attribute edited face image that has an excellent attribute editing effect and retains identity information. This includes the following steps:
4.1) In the testing stage, the data set attribute association graph and the attribute semantic embedding set are passed through the optimal improved graph convolution neural network to obtain the initial attribute editing direction, which can be expressed as follows:
N_init = M_GCN(Â, E)
where N_init represents the initial attribute editing direction; M_GCN represents the optimal improved graph convolution neural network; Â represents the data set attribute association graph; and E represents the attribute semantic embedding set;
4.2) The user-defined scale factor and the initial attribute editing direction are input into the optimal improved fully-connected network, which outputs the global attribute editing direction; the process can be expressed as follows:
N_g = M_F(N_init × Alpha_test)
where N_g represents the global attribute editing direction; M_F represents the optimal improved fully-connected network; N_init represents the initial attribute editing direction; and Alpha_test represents the user-defined scale factor;
4.3) After the global attribute editing direction is obtained, the flow of generating the edited face is shown in fig. 6: for a given original image, the original image is passed through the pre-trained hidden variable encoder to obtain the original hidden variable; the global attribute editing direction is then applied to the original hidden variable to obtain the edited hidden variable; finally, the edited hidden variable is sent to a pre-trained decoder, which outputs the final multi-attribute edited face image. The flow can be represented as:
I_e = G(w_o + N_g)
where I_e represents the multi-attribute edited face image; G represents the pre-trained decoder; w_o represents the original hidden variable; and N_g represents the global attribute editing direction.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.
Claims (5)
1. A face multi-attribute editing method based on the global attribute editing direction, characterized by comprising the following steps:
1) acquiring a data set attribute association diagram, an attribute semantic embedded set and a scale factor;
calculating a data set attribute association diagram according to attribute labels of each face in the CelebA-HQ face data set; inputting a set of required attribute description texts into a pre-trained CLIP text encoder, and outputting to obtain an attribute semantic embedded set; selecting an original face image and a target face image from CelebA-HQ face data set, subjecting the original face image and the target face image to a pre-trained hidden variable encoder to obtain an original hidden variable and a target hidden variable, subjecting the original hidden variable and the target hidden variable to a pre-trained hidden variable attribute classifier respectively to obtain an original attribute score and a target attribute score, and calculating the difference between the original attribute score and the target attribute score to obtain a scale factor;
2) model construction
A global attribute editing network is constructed: the input of the network is the data set attribute association graph, the attribute semantic embedding set, and the scale factor, and the output is the global attribute editing direction; the global attribute editing network consists of an improved graph convolution neural network and an improved fully-connected network; the input of the improved graph convolution neural network is the data set attribute association graph and the attribute semantic embedding set, and its output is the initial attribute editing direction, with the aim of obtaining an initialized editing direction for each attribute; the input of the improved fully-connected network is the initial attribute editing direction and the scale factor, and its output is the global attribute editing direction, with the aim of adjusting the editing strength of each initialized attribute editing direction by the scale factor, thereby obtaining a more accurate global attribute editing direction;
3) model optimization
In order to better edit the attributes and retain the identity information, three target loss functions are designed to optimize the constructed global attribute editing network, and after optimization, an optimal global attribute editing network is obtained and comprises an optimal improved graph convolution neural network and an optimal improved full-connection network;
4) face property editing
In the model testing stage, the global attribute editing direction obtained by the output of the optimal global attribute editing network acts on a given face image in a hidden space, and the multi-attribute edited face image which has an excellent attribute editing effect and can keep identity information is obtained.
2. The method for editing human face multiple attributes based on global attribute editing direction as claimed in claim 1, wherein the step 1) comprises the following steps:
a. obtaining dataset attribute association graphs
The CelebA-HQ face data set comprises 30000 face images in total, and each face image has labels for 40 attributes; a label value of 1 for an attribute indicates that the face image possesses the attribute, and a label value of -1 indicates that it does not. For one face image, the face image attribute association graph A_i is obtained as follows. A_i is essentially an adjacency matrix: first, the face image attribute association graph A_i is initialized to all zeros; if the labels of the face image on the j-th attribute and the k-th attribute are both 1, the corresponding positions of the face image attribute association graph are set to 1, i.e. the values of A_i[j][k] and A_i[k][j] are changed to 1; all combinations of j and k are traversed, and the values on the main diagonal of the adjacency matrix are set to 1. The adjacency matrix obtained by this calculation is the face image attribute association graph A_i. The data set attribute association graph Â is obtained by summing and normalizing the 30000 face image attribute association graphs, expressed as follows:
A = Σ_{i=1}^{n} A_i
Â = D^{-1} · A
where A_i represents the attribute association graph of one face image; n represents the number of face images in the CelebA-HQ face data set, with value 30000; A represents the matrix obtained by summing the attribute association graphs of the 30000 images; D^{-1} represents the inverse of the degree matrix of the summed matrix A; and Â represents the matrix obtained by normalizing the summed matrix, i.e. the data set attribute association graph to be finally obtained;
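The construction of Â described above can be sketched directly from ±1 label vectors; the toy label matrix below stands in for the 30000-image CelebA-HQ annotations:

```python
import numpy as np

def dataset_association_graph(labels: np.ndarray) -> np.ndarray:
    """Build the normalized attribute association graph A_hat.

    labels: (n_images, M) array of +1 / -1 attribute labels.
    """
    n, m = labels.shape
    a_sum = np.zeros((m, m))
    for lab in labels:
        pos = lab == 1
        a_i = np.outer(pos, pos).astype(float)   # 1 where both attributes co-occur
        np.fill_diagonal(a_i, 1.0)               # main diagonal set to 1
        a_sum += a_i
    d_inv = np.diag(1.0 / a_sum.sum(axis=1))     # inverse degree matrix
    return d_inv @ a_sum                         # row-normalized association graph

# Toy data set: 3 images, 4 attributes.
labels = np.array([[ 1,  1, -1, -1],
                   [ 1, -1,  1, -1],
                   [-1,  1,  1, -1]])
a_hat = dataset_association_graph(labels)
print(a_hat.shape)                               # (4, 4)
print(np.allclose(a_hat.sum(axis=1), 1.0))       # each row sums to 1: True
```

Row normalization by the inverse degree matrix turns raw co-occurrence counts into the weighting used when the graph convolution mixes attribute embeddings.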
b. attribute semantic embedded collections
First, a set of the required attribute description texts is initialized, each attribute description text being an English character string that describes the corresponding attribute; the English character strings are passed through a pre-trained CLIP text encoder to obtain the corresponding attribute semantic embeddings, which together form the attribute semantic embedding set, expressed as follows:
E = [e_1, e_2, ..., e_M]^T
wherein E represents the attribute semantic embedding set, a vector of length M; M represents the number of attributes and also the length of the vector; T represents the transpose of the vector; e_1 represents the semantic embedding of the first attribute, e_2 the semantic embedding of the second attribute, and e_M the semantic embedding of the M-th attribute;
c. scaling factor
Two face images are randomly selected from the CelebA-HQ face data set as the original image I_o and the target image I_t; the original image I_o and the target image I_t are passed through a pre-trained hidden variable encoder to obtain the original hidden variable w_o and the target hidden variable w_t, where w_o and w_t respectively represent the mappings of I_o and I_t in the hidden space; the hidden variables w_o and w_t are passed through a pre-trained hidden variable attribute classifier to obtain the original attribute score S_o and the target attribute score S_t, expressed as:
S_o = C(w_o)
S_t = C(w_t)
where w_o and w_t respectively represent the original and target hidden variables; C represents the pre-trained hidden variable attribute classifier; S_o represents the original attribute score and S_t the target attribute score; S_o and S_t are vectors of length M; M represents the number of attributes and also the length of the attribute score vector; T represents the transpose of the vector; S_o^1, S_o^2, and S_o^M represent the values of the original attribute score at the first, second, and M-th positions; and S_t^1, S_t^2, and S_t^M represent the values of the target attribute score at the first, second, and M-th positions;
after obtaining the original attribute score S_o and the target attribute score S_t, the scale factor can be calculated, expressed as:
Alpha = [α_1, α_2, ..., α_M]^T
α_i = S_t^i − S_o^i, if (S_t^i − 0.5)(S_o^i − 0.5) < 0; α_i = 0, otherwise
where S_t^i represents the value of the target attribute score at the i-th position and S_o^i the value of the original attribute score at the i-th position; Alpha represents the scale factor; α_i represents the value of the scale factor at the i-th position, α_1 the value at the first position, α_2 the value at the second position, and α_M the value at the M-th position; M represents the number of attributes; and T represents the transpose of the vector.
3. The method for editing human face multiple attributes based on global attribute editing direction as claimed in claim 1, wherein the step 2) comprises the following steps:
a global attribute editing network is constructed: the input of the network is the data set attribute association graph, the attribute semantic embedding set, and the scale factor, and the output is the global attribute editing direction; the global attribute editing network consists of an improved graph convolution neural network and an improved fully-connected network, specifically represented as follows:
a. improved graph convolution neural network
An improved graph convolution neural network is constructed, comprising two network modules denoted block1 and block2; the input of block1 is the obtained data set attribute association graph and the attribute semantic embedding set, and its output is an intermediate variable; the input of block2 is the intermediate variable output by block1 together with the data set attribute association graph, and its output is the initial attribute editing direction; the whole process is represented as:
N_init = σ(Â · E · W)
where N_init represents the initial attribute editing direction output by the improved graph convolution neural network; Â represents the data set attribute association graph; E represents the attribute semantic embedding set; W represents the weight parameters to be learned in the improved graph convolution neural network; · represents the vector dot product operation; and σ(·) represents a nonlinear activation function;
b. improved fully connected network
An improved fully-connected network is constructed, comprising two fully-connected layers denoted Linear1 and Linear2; the input of Linear1 is the initial attribute editing direction and the scale factor, and its activation function is the Leaky-ReLU function; the output of Linear1 is the input of Linear2, which has no activation function and finally outputs the global attribute editing direction; the whole process is represented as:
N_g = F(N_init × Alpha)
where N_g represents the global attribute editing direction output by the improved fully-connected network; F represents the improved fully-connected network; N_init represents the initial attribute editing direction, with dimension [40,512]; Alpha represents the scale factor, with dimension [40,1]; and × represents the product operation, specifically multiplying each 512-dimensional vector in N_init by the corresponding value in Alpha.
4. The method for editing human face multiple attributes based on the global attribute editing direction according to claim 1, wherein: in step 3), in order to better edit the attributes while retaining identity information, three target loss functions are designed to optimize the constructed global attribute editing network; after optimization, the optimal global attribute editing network is obtained, comprising the optimal improved graph convolution neural network and the optimal improved fully-connected network; the optimization specifically uses a gradient descent algorithm, whose aim is to optimize the weight parameters in the constructed global attribute editing network so that the values of the three objective functions become as small as possible; the three target loss functions are expressed as follows:
a. multi-attribute edit loss
Given the original hidden variable w_o and the target hidden variable w_t, the original attribute score S_o and the target attribute score S_t can be obtained through the hidden variable attribute classifier; the obtained global attribute editing direction is added to the original hidden variable w_o to obtain the edited hidden variable, which is then passed through the pre-trained hidden variable attribute classifier to obtain the edited attribute score; the process is represented as follows:
S_e = C(w_o + N_g)
where S_e represents the edited attribute score; C represents the hidden variable attribute classifier; w_o represents the original hidden variable; and N_g represents the global attribute editing direction output by the global attribute editing network;
to make the edited attribute score S_e as close as possible to the target attribute score S_t, the multi-attribute editing loss is designed, specifically expressed as follows:
L_mae = −Σ_{i=1}^{M} 1(S_t^i ≠ S_o^i) · [S_t^i · log S_e^i + (1 − S_t^i) · log(1 − S_e^i)]
where L_mae represents the multi-attribute editing loss; S_t^i represents the value of the target attribute score at position i; S_e^i represents the value of the edited attribute score at position i; S_o^i represents the value of the original attribute score at position i; the indicator 1(S_t^i ≠ S_o^i) restricts the sum to the attributes being edited; and log represents the logarithm operation;
b. multi-attribute retention loss
In order to keep the attributes that are not edited unchanged, the multi-attribute retention loss is designed, specifically expressed as follows:
L_map = Σ_{i=1}^{M} 1(S_t^i = S_o^i) · ||S_e^i − S_o^i||_2
where L_map represents the multi-attribute retention loss; S_t^i represents the value of the target attribute score at position i; S_e^i represents the value of the edited attribute score at position i; S_o^i represents the value of the original attribute score at position i; the indicator 1(S_t^i = S_o^i) restricts the sum to the attributes that are not edited; and ||·||_2 represents the L2 norm;
c. space retention loss
In order to prevent the original hidden variable from being changed excessively during editing, the spatial retention loss is designed, specifically expressed as follows:
L_sp = ||N_g||_2
where L_sp represents the spatial retention loss; N_g represents the global attribute editing direction; and ||·||_2 represents the L2 norm.
5. The method for editing human face multiple attributes based on the global attribute editing direction according to claim 1, wherein: in step 4), in the model testing stage, the global attribute editing direction output by the optimal global attribute editing network is applied, in the hidden space, to a given face image to obtain a multi-attribute edited face image that has an excellent attribute editing effect and retains identity information, comprising the following steps:
4.1) In the testing stage, the data set attribute association graph and the attribute semantic embedding set are passed through the optimal improved graph convolution neural network to obtain the initial attribute editing direction, expressed as follows:
N_init = M_GCN(Â, E)
where N_init represents the initial attribute editing direction; M_GCN represents the optimal improved graph convolution neural network; Â represents the data set attribute association graph; and E represents the attribute semantic embedding set;
4.2) The user-defined scale factor and the initial attribute editing direction are input into the optimal improved fully-connected network, which outputs the global attribute editing direction; the process is represented as follows:
N_g = M_F(N_init × Alpha_test)
where N_g represents the global attribute editing direction; M_F represents the optimal improved fully-connected network; N_init represents the initial attribute editing direction; and Alpha_test represents the user-defined scale factor;
4.3) For a given original image, the original image is passed through the pre-trained hidden variable encoder to obtain the original hidden variable; the global attribute editing direction is then applied to the original hidden variable to obtain the edited hidden variable; finally, the edited hidden variable is sent to a pre-trained decoder, which outputs the final multi-attribute edited face image; the process is represented as:
I_e = G(w_o + N_g)
where I_e represents the multi-attribute edited face image; G represents the pre-trained decoder; w_o represents the original hidden variable; and N_g represents the global attribute editing direction.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210628783.2A | 2022-06-06 | 2022-06-06 | Human face multi-attribute editing method based on global attribute editing direction |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN115082292A (en) | 2022-09-20 |
Family
ID=83249604
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210628783.2A Pending CN115082292A (en) | 2022-06-06 | 2022-06-06 | Human face multi-attribute editing method based on global attribute editing direction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115082292A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130108175A1 (en) * | 2011-10-28 | 2013-05-02 | Raymond William Ptucha | Image Recomposition From Face Detection And Facial Features |
US20140160134A1 (en) * | 2011-08-31 | 2014-06-12 | Timur Nuruahitovich Bekmambetov | Visualization of a natural language text |
CN111368662A (en) * | 2020-02-25 | 2020-07-03 | 华南理工大学 | Method, device, storage medium and equipment for editing attribute of face image |
CN111914617A (en) * | 2020-06-10 | 2020-11-10 | 华南理工大学 | Face attribute editing method based on balanced stack type generation countermeasure network |
CN111932444A (en) * | 2020-07-16 | 2020-11-13 | 中国石油大学(华东) | Face attribute editing method based on generation countermeasure network and information processing terminal |
CN112734873A (en) * | 2020-12-31 | 2021-04-30 | 北京深尚科技有限公司 | Image attribute editing method, device, equipment and medium for resisting generation network |
CN113963409A (en) * | 2021-10-25 | 2022-01-21 | 百果园技术(新加坡)有限公司 | Training of face attribute editing model and face attribute editing method |
CN114240736A (en) * | 2021-12-06 | 2022-03-25 | 中国科学院沈阳自动化研究所 | Method for simultaneously generating and editing any human face attribute based on VAE and cGAN |
-
2022
- 2022-06-06 CN CN202210628783.2A patent/CN115082292A/en active Pending
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140160134A1 (en) * | 2011-08-31 | 2014-06-12 | Timur Nuruahitovich Bekmambetov | Visualization of a natural language text |
US20130108175A1 (en) * | 2011-10-28 | 2013-05-02 | Raymond William Ptucha | Image Recomposition From Face Detection And Facial Features |
CN111368662A (en) * | 2020-02-25 | 2020-07-03 | 华南理工大学 | Method, device, storage medium and equipment for editing attribute of face image |
CN111914617A (en) * | 2020-06-10 | 2020-11-10 | 华南理工大学 | Face attribute editing method based on balanced stack type generation countermeasure network |
CN111932444A (en) * | 2020-07-16 | 2020-11-13 | 中国石油大学(华东) | Face attribute editing method based on generation countermeasure network and information processing terminal |
CN112734873A (en) * | 2020-12-31 | 2021-04-30 | 北京深尚科技有限公司 | Image attribute editing method, device, equipment and medium for resisting generation network |
CN113963409A (en) * | 2021-10-25 | 2022-01-21 | 百果园技术(新加坡)有限公司 | Training of face attribute editing model and face attribute editing method |
CN114240736A (en) * | 2021-12-06 | 2022-03-25 | 中国科学院沈阳自动化研究所 | Method for simultaneously generating and editing any human face attribute based on VAE and cGAN |
Non-Patent Citations (2)
Title |
---|
随海亮;马军山;李丽莹;: "基于生成对抗网络与FACS的面部表情合成研究", 软件导刊, no. 06, 15 June 2020 (2020-06-15) * |
黄韬;贾西平;林智勇;马震远;: "基于生成对抗网络的文本引导人物图像编辑方法", 广东技术师范大学学报, no. 03, 25 June 2020 (2020-06-25) * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109447906B (en) | Picture synthesis method based on generation countermeasure network | |
CN111814706B (en) | Face recognition and attribute classification method based on multitask convolutional neural network | |
Hou et al. | Improving variational autoencoder with deep feature consistent and generative adversarial training | |
CN107798349B (en) | Transfer learning method based on a deep sparse autoencoder | |
CN109472024A (en) | Text classification method based on a bidirectional recurrent attention neural network | |
CN108108677A (en) | Facial expression recognition method based on an improved CNN | |
CN108830287A (en) | Chinese image semantic description method based on a residual-connected Inception network integrated with multilayer GRUs | |
CN111127146B (en) | Information recommendation method and system based on a convolutional neural network and a denoising autoencoder | |
CN109815826A (en) | Method and device for generating a face attribute model | |
CN110188794B (en) | Deep learning model training method, device, equipment and storage medium | |
Xu et al. | (Retracted) Method of generating face image based on text description of generating adversarial network | |
CN117521672A (en) | Method for generating sequential images from long text based on a diffusion model | |
CN115393933A (en) | Video face emotion recognition method based on frame attention mechanism | |
CN110276396A (en) | Image caption generation method based on object saliency and cross-modal fusion features | |
Qu et al. | Perceptual-DualGAN: perceptual losses for image to image translation with generative adversarial nets | |
Da et al. | Brain CT image classification with deep neural networks | |
CN117522697A (en) | Face image generation method, face image generation system and model training method | |
CN116543289B (en) | Image description method based on encoder-decoder and Bi-LSTM attention model | |
Sawant et al. | Text to image generation using GAN | |
CN115082292A (en) | Human face multi-attribute editing method based on global attribute editing direction | |
CN116311472A (en) | Micro-expression recognition method and device based on multi-level graph convolution network | |
Fang et al. | Facial makeup transfer with GAN for different aging faces | |
CN116503499A (en) | Sketch generation method and system based on a cycle-consistent generative adversarial network | |
Yauri-Lozano et al. | Generative Adversarial Networks for text-to-face synthesis & generation: A quantitative–qualitative analysis of Natural Language Processing encoders for Spanish | |
CN113947520A (en) | Method for face makeup transfer based on a generative adversarial network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |