AU2019430859A1 - Generative adversarial mechanism and attention mechanism-based standard face generation method - Google Patents

Generative adversarial mechanism and attention mechanism-based standard face generation method

Info

Publication number
AU2019430859A1
Authority
AU
Australia
Prior art keywords
image
face
network
model
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
AU2019430859A
Other versions
AU2019430859B2 (en)
Inventor
Chunwen PAN
Weilin Wu
Wei Xie
Xiaoyuan Yu
Langwen ZHANG
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Publication of AU2019430859A1 publication Critical patent/AU2019430859A1/en
Application granted granted Critical
Publication of AU2019430859B2 publication Critical patent/AU2019430859B2/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A generative adversarial mechanism and attention mechanism-based standard face generation method, comprising: a dataset design step, constructing, according to database-related annotation data, face code having a plurality of non-limiting factors for a face image, and taking the code and the face image as inputs of a model; a model design and training step, using a generative adversarial mechanism and an attention mechanism to design a corresponding network structure, and using the constructed data pair to perform model training, so as to obtain a network model weight; and a model prediction step, predicting the acquired face image by means of the model. The present invention applies deep learning network technology to standard face generation to generate a colour, front-facing, and standard face image under normal light illumination. The method using a deep learning network is capable of obtaining an accurate standard face photograph, reducing the difficulty of matching with data in a single-sample database, and laying a solid foundation for subsequent face feature extraction and single-sample facial recognition.

Description

A Standard Face Generation Method Based on a Generative Adversarial Mechanism and an Attention Mechanism
Technical Field

The invention relates to the technical field of deep learning applications, in particular to a standard face generation method based on a generative adversarial mechanism and an attention mechanism.
Technical Background

In recent years, video surveillance has been popularized in large and medium cities across the country, has been widely used in the construction of social security prevention and control systems, and has become a powerful technical means for public security agencies to investigate and solve cases. Especially in mass incidents, major cases and robberies, the evidential clues obtained from surveillance video play a key role in the rapid solving of cases. At present, domestic public security agencies mainly use surveillance video to find after-crime clues and evidence, locking in a suspect's identity by comparing the face information of key suspects with personal information in the public security bureau's database. However, the face information of a suspect in surveillance video is affected by many restrictive factors, such as expression interference, posture interference or shooting illumination interference. Since most of the face images in the public security bureau's database are single-sample ID photos, when face images affected by the above-mentioned restrictive factors are subjected to recognition processing, the success rate is greatly restricted, and missed detections and wrong detections easily occur.
In recent years, the field of artificial intelligence has been listed among the country's key development priorities. This indicates that the combination of artificial intelligence and related industries is an inevitable trend of the country's development towards intelligence, and it is of great significance in promoting industries towards intelligence and automation. The central task in the field of artificial intelligence is to design corresponding deep learning network models for different industry tasks. With the increase of computer computing power, the difficulty of network training has been greatly reduced, and the accuracy of network prediction has continuously improved. The basic characteristics of deep learning networks are strong model fitting ability, a large amount of information and high precision, which can meet the different needs of different industries. For the face recognition problem with multiple non-limiting factors, the key issue is how to generate a standard front-facing face image that meets the needs of subsequent face image feature extraction and recognition. It is therefore urgent to design a corresponding and reasonable deep learning network framework, use high-performance computer processing capabilities to train the network, generate more standard front-facing face images, improve the accuracy of face matching and reduce the occurrence of false detections during face recognition.
Summary of the Invention

The object of the present invention is to overcome the above-mentioned shortcomings in the prior art by providing a standard face generation method based on a generative adversarial mechanism and an attention mechanism, using a deep learning network framework to design the related models, thereby obtaining a more standard front-facing face image and laying a solid foundation for subsequent face feature extraction and single-sample face recognition.
The object of the present invention may be achieved by adopting the following technical solutions:
A standard face generation method based on a generative adversarial mechanism and an attention mechanism. The generation method comprises data set design steps, model design and training steps, and model prediction steps. The data set design steps are mainly based on the current mainstream RaFD data set and IAIR face data set: according to the relevant annotated data of the database, a face code with various non-limiting factors is constructed for each face image, comprising face expression factors, face posture factors and shooting illumination factors, and the code and face image are used as inputs to the model. The model design and training steps mainly use the related principles of the generative adversarial mechanism and the attention mechanism to design the corresponding network structure, and use the constructed data pairs for model training to obtain the network model weights. The model prediction step mainly predicts the result after model processing is performed on a face image acquired in reality.
Specifically, the operation steps are as follows:
S1. Data construction: collecting face data from the RaFD face data set and the IAIR face data set, constructing a face code with multiple non-limiting factors for each face image, then classifying the face data; wherein the non-limiting factors comprise face expression factors, face posture factors and shooting illumination factors; an encoded face image forms an information unit U = {L_n, E_e, A_m}, comprising an 8-bit illumination code L_n, an 8-bit expression code E_e and a 19-bit posture code A_m;
S2. Model establishment: a network model based on the generative adversarial mechanism and the attention mechanism is established; the network model comprises three sub-networks, namely an image generator sub-network for generating a standard face, a model discriminator sub-network for discriminating the generated results, and an image restoration sub-network for restoring the generated results; first, the image generator sub-network and the attention mechanism are used to generate a standard face from an input face image; then, the model discriminator sub-network is used to discriminate the generated image; finally, the image restoration sub-network is constructed to restore the generated image, and the restoration result is compared with the input image to optimize the constraints of the network model;
S3. Model training: using the image units generated in step S1, taking images with multiple non-limiting factors as inputs to optimize the outputs of the image generator sub-network, the model discriminator sub-network and the image restoration sub-network against the labelled similarities, so as to achieve convergence of the network model based on the generative adversarial mechanism and the attention mechanism;
S4. Model prediction: extracting a face from an actual image as the input of the model, and finally obtaining a more standard front face image output by controlling a unified information unit.
Further, in step S1, the face information in the face data set is correspondingly encoded and divided into two types: non-limiting factor face images and standard front natural face images.
The process of step S1 is as follows:
S11. Face information coding. For the different face data in the data set, a face code with multiple non-limiting factors is constructed for each face image, wherein the non-limiting factors comprise, but are not limited to, face expression factors, face posture factors and shooting illumination factors.
The specific rules for coding face images are as follows:
A) the face expression factors are divided into eight situations, namely happy, angry, sad, contemptuous, disappointed, scared, surprised and natural; a face expression is encoded as E_e = (E_1, E_2, ..., E_8), where E_j represents the j-th expression, j = 1, 2, ..., 8, each component takes a value in [0,1], and E_e = (0,0,...,1) means a natural expression;
B) the face illumination factors are divided into eight situations, based on front illumination, left illumination, right illumination and combinations of these three, namely front illumination, left illumination, right illumination, front-left illumination, front-right illumination, left-right illumination, no illumination, and full illumination; the illumination information of the face is encoded as L_n = (L_1, L_2, ..., L_8), where L_n represents the n-th illumination situation, n = 1, 2, ..., 8, each component takes a value in [0,1], and L_n = (0,0,...,1) represents front illumination image information;
C) the face posture factors are divided into 19 situations, comprising 9 poses of the left face at 10° intervals, 9 poses of the right face at 10° intervals, and the front face posture, that is left 90°, left 80°, left 70°, left 60°, left 50°, left 40°, left 30°, left 20°, left 10°, front face, right 10°, right 20°, right 30°, right 40°, right 50°, right 60°, right 70°, right 80°, right 90°; the posture information of the face is encoded as A_m = (A_1, A_2, ..., A_19), where A_m represents the m-th face pose, m = 1, 2, ..., 19, each component takes a value in [0,1], and A_m = (0,0,...,1) represents front posture information. Finally, the face information code is integrated into the unified information code U = {L_n, E_e, A_m}, which is a 35-bit one-dimensional code.
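As a concrete illustration of this coding scheme, the following is a minimal sketch; the patent fixes only the bit widths (8 + 8 + 19 = 35) and the one-hot convention, so the slot orderings (standard situation in the last slot, matching (0,0,...,1) above) and the helper names are assumptions.

```python
import numpy as np

# Slot orderings are assumptions; each list places the "standard" category
# last so that its one-hot code is (0, 0, ..., 1), as described above.
EXPRESSIONS = ["happy", "angry", "sad", "contemptuous",
               "disappointed", "scared", "surprised", "natural"]
ILLUMINATIONS = ["left", "right", "front-left", "front-right",
                 "left-right", "none", "full", "front"]
POSES = ["left %d" % d for d in range(90, 0, -10)] + \
        ["right %d" % d for d in range(10, 100, 10)] + \
        ["front"]                               # 9 + 9 + 1 = 19 poses

def one_hot(index, length):
    """Return a one-hot vector with a 1 at the given index."""
    v = np.zeros(length, dtype=np.float32)
    v[index] = 1.0
    return v

def unified_code(illumination, expression, pose):
    """Concatenate the 8-bit illumination code L_n, the 8-bit expression
    code E_e and the 19-bit posture code A_m into one 35-bit code U."""
    L = one_hot(ILLUMINATIONS.index(illumination), 8)
    E = one_hot(EXPRESSIONS.index(expression), 8)
    A = one_hot(POSES.index(pose), 19)
    return np.concatenate([L, E, A])            # shape (35,)

# Example: a face lit from the left, scared, turned 30° to the right.
u = unified_code("left", "scared", "right 30")
```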
S12. Classifying face data: classifying the encoded face data into non-limiting factor face images and standard front natural clear face images, specifically:
face images with the unified code information U_0 = (L_n = (0,0,...,1), E_e = (0,0,...,1), A_m = (0,0,...,1)) are taken as the standard frontal natural clear face images and used as target images of the model; the remaining face images are taken as the non-limiting factor face images and used as input images of the model.
Further, in step S2, assume that the input image is Y and its corresponding original unified information code is U_Y; the generated standard face image is I_o, which corresponds to the unified information code U_0; and the corresponding standard face image in the database is I, whose unified information code is likewise U_0.
In the image generator sub-network, the inputs are the image Y and the unified information code U_0. The invention designs two codec networks Gc and Gf, which generate a colour information mask C and an attention mask F in combination with the attention mechanism; the standard face is then generated through the following synthesis mechanism:

C = Gc(Y, U_0), F = Gf(Y, U_0)
I_o = (1 − F) ⊙ C + F ⊙ Y

wherein ⊙ represents element-wise multiplication of matrices.
Therefore, the codec network Gc mainly focuses on the colour information and texture information of the face, and the codec network Gf mainly focuses on the areas of the face that need to be changed;
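For illustration, a minimal TensorFlow sketch of this synthesis step; Gc and Gf are assumed to be two-input Keras models, and the tensor shapes are assumptions:

```python
import tensorflow as tf

# A sketch of the attention-based synthesis I_o = (1 - F) ⊙ C + F ⊙ Y,
# assuming Gc and Gf are Keras models taking (image, code) inputs.
def synthesize(Gc, Gf, Y, U0):
    C = Gc([Y, U0])              # colour information mask, (batch, H, W, 3)
    F = Gf([Y, U0])              # attention mask, (batch, H, W, 1), in [0, 1]
    I_o = (1.0 - F) * C + F * Y  # element-wise products, F broadcast over channels
    return I_o
```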
In the model discriminator sub-network, the input is the image I_o generated by the image generator sub-network. Similarly, the invention designs two deep convolution networks, an image discrimination sub-network D_I and an information code discrimination sub-network D_U, to respectively distinguish the difference between the generated standard face image I_o and the corresponding standard face image I in the database, and the difference between the unified information code corresponding to the generated standard face image I_o and the unified information code U_0 corresponding to the standard face image I in the database;
In the image restoration sub-network, the inputs are the generated standard face image I_o and the original unified information code U_Y corresponding to the input image Y. The restoration sub-network is consistent with the image generator sub-network, and its restoration result is Y'. By comparing the restoration result Y' with the input image Y of the overall network, loop optimization of the network result is achieved.
Further, the processing flow of the network model based on the generative adversarial mechanism and the attention mechanism is as follows:
First, the input image Y and the unified information code U_0 corresponding to the standard face image I are input into the image generator sub-network, which incorporates the attention mechanism, to generate the standard face image I_o;
Then, in order to distinguish between real images and generated images, the generated standard face image I_o and the corresponding standard face image I in the database (that is, the real image) are sent to the image discrimination sub-network D_I in the model discriminator sub-network for discrimination; at the same time, the unified information code corresponding to the generated standard face image I_o and the unified information code U_0 corresponding to the standard face image I in the database are sent to the information code discrimination sub-network D_U in the model discriminator sub-network for discrimination; through continuous loop optimization, the image generator sub-network and the model discriminator sub-network progress together;
Finally, in order to loop-optimize the network model, the present invention designs an image restoration sub-network: the generated standard face image I_o is restored according to the original unified information code U_Y corresponding to the original input image Y, and the restoration result is compared with the input image Y. The entire network realizes the convergence of the overall network model by continuously optimizing the corresponding loss function, finally realizing the removal of non-limiting environmental factors from the face image.
Further, in step S3, the model training achieves the convergence of the model by optimizing a loss function, wherein the design process of the loss function is specifically as follows:
1) optimizing the difference between the generated standard face image I_o and the corresponding standard face image I in the database by discrimination: an image loss function is set as

L_I = (1/(H×W))·‖D_I(I_o) − D_I(I)‖²

where H and W are the height and width of the output face image, and D_I(I_o) and D_I(I) are the evaluation results of the images I_o and I by the image discrimination sub-network; then, considering the effectiveness of a gradient loss, a gradient-based penalty is added to the image loss function, which may improve the efficiency of convergence and the quality of image generation; that is, the image loss function is designed as

L_I = (1/(H×W))·‖D_I(I_o) − D_I(I)‖² + λ_I·(1/(H×W))·‖∇D_I(I_o) − 1‖²

where ∇(·) represents a gradient operation and λ_I is the weight of the penalty;
2) optimizing the difference of the conditional unified information code: a conditional expression loss function is set, that is, distinguishing the difference between the generated standard face image I_o and the corresponding standard face image I in the database, each corresponding to the unified information code U_0; the conditional expression loss function is therefore first designed as

L_U = (1/N)·‖D_U(I_o) − U_0‖²

where N is the length of the output unified information code; then, a mapping relationship between the input image Y and its corresponding original unified information code U_Y is added to the conditional expression loss function, which may improve the discriminating ability of the discriminator; therefore, the conditional expression loss function is designed as

L_U = (1/N)·(‖D_U(I_o) − U_0‖² + ‖D_U(Y) − U_Y‖²)

where U_Y is the original unified information code corresponding to the input image Y, U_0 is the unified information code corresponding to the standard face image I, and D_U(I_o) and D_U(Y) are the discrimination results of the information code discrimination sub-network on the images I_o and Y respectively;
3) optimizing the difference between the result of the image restoration sub-network and the original input image: the image I_o produced by the generator is restored with the original unified information code U_Y and then compared with the original input image Y; therefore, the restoration loss function is designed as

L_r = (1/(h×w))·‖G(G(Y, U_0), U_Y) − Y‖₁

where h and w represent the height and width of the image and G represents the image generator sub-network.
Therefore, the loss function of the entire network model is: L = L_I + L_U + L_r.
By optimizing the loss function, the convergence of the network model is achieved, and a generator structure and weights for generating standard faces are obtained.
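As a concrete illustration, a hedged TensorFlow sketch of the three loss terms; G, D_I and D_U are assumed two-input Keras models, the reading of ∇D_I(I_o) as the gradient of the discriminator output with respect to the generated image follows the formula above, and λ_I = 10 is an assumed value, not one fixed by the invention:

```python
import tensorflow as tf

# A sketch of L = L_I + L_U + L_r as defined above, under the assumptions
# in the lead-in; images are (batch, H, W, 3), codes are (batch, 35).
def total_loss(G, D_I, D_U, Y, I, U0, UY, lambda_I=10.0):
    _, H, W, _ = I.shape
    N = float(U0.shape[-1])                 # length of the unified code (35)
    I_o = G([Y, U0])                        # generated standard face

    # Image loss with the gradient-based penalty on D_I at the generated image.
    with tf.GradientTape() as tape:
        tape.watch(I_o)
        d_fake = D_I(I_o)
    grads = tape.gradient(d_fake, I_o)
    L_image = (tf.reduce_sum(tf.square(d_fake - D_I(I))) +
               lambda_I * tf.reduce_sum(tf.square(grads - 1.0))) / (H * W)

    # Conditional code loss: D_U should recover U0 from I_o and UY from Y.
    L_code = (tf.reduce_sum(tf.square(D_U(I_o) - U0)) +
              tf.reduce_sum(tf.square(D_U(Y) - UY))) / N

    # Restoration (cycle) loss: regenerate Y from I_o using the original code UY.
    L_restore = tf.reduce_sum(tf.abs(G([I_o, UY]) - Y)) / (H * W)

    return L_image + L_code + L_restore
```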
Further, for the generation of the actual face image in step S4: first, a face positioning method based on the face HOG image is used to obtain the face image in the actual image; then, the generator trained by the model and a manually set unified information code are used to realize rapid standard face generation for the face in the actual image. In addition, it is foreseeable that by setting different unified information codes it is possible to change other structures of the face, for example controlling other expressions or further changing the face posture.
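For illustration, a sketch of the face-extraction step, assuming dlib's HOG-based frontal face detector stands in for the face-HOG positioning method; the 128×128 crop size is an assumption:

```python
import cv2
import dlib

# dlib's frontal face detector is a HOG + linear-SVM detector, used here
# as a stand-in for the face-HOG positioning method described above.
detector = dlib.get_frontal_face_detector()

def extract_faces(image_path, size=128):
    img = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)
    faces = []
    for rect in detector(img, 1):            # 1 = upsample the image once
        top, left = max(rect.top(), 0), max(rect.left(), 0)
        crop = img[top:rect.bottom(), left:rect.right()]
        faces.append(cv2.resize(crop, (size, size)))
    return faces
```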
Compared with the prior art, the present invention has the following advantages and effects:
The present invention applies deep learning network technology to the standard face generation task to generate colour, front-facing standard face images under normal illumination; using the deep learning network method, accurate standard front face photos may be obtained, the difficulty of matching with data in a single-sample database is reduced, and a solid foundation is laid for subsequent face feature extraction and single-sample face recognition.
Brief Description of the Figures

Figure 1 is a flowchart of a model training and a model application in an embodiment of the present invention;
Figure 2 is a flowchart of a data construction of a database in an embodiment of the present invention;
Figure 3 is an overall design diagram of a network model in an embodiment of the present invention;
Figure 4 is a specific structure diagram of an image generation network in an embodiment of the present invention;
Figure 5 is a specific structure diagram of an image discrimination network in an embodiment of the present invention.
Description

In order to better clarify the objectives, technical solutions, and advantages of the embodiments of the present invention, the technical solutions of the embodiments of the present invention will be described clearly and completely in conjunction with the accompanying figures of the embodiments of the present invention. Obviously, the described embodiments are part of the embodiments of the present invention, rather than all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
Embodiments
This embodiment discloses a standard face generation method based on a generative adversarial mechanism and an attention mechanism, which mainly involves the following technologies: 1) design of training data: using existing data sets to design unified information codes; 2) design of the network model structure: taking the generative adversarial network framework and the loop optimization network method as the basic network structure; 3) standard face generation method: adding an attention mechanism to the generator to constrain the accuracy of standard face generation.
This embodiment is based on the TensorFlow framework and the PyCharm development environment. The TensorFlow framework is a development framework based on the Python language, which can build a reasonable deep learning network conveniently and quickly, and has good cross-platform interaction capabilities. TensorFlow provides interfaces for many packaged functions and various image processing functions in the deep learning architecture, including image processing functions related to OpenCV. The TensorFlow framework can also use the GPU to train and verify the model, which improves the efficiency of calculation.
The PyCharm development environment under the Windows platform or Linux platform serves as the integrated development environment (IDE), and is currently one of the first choices for deep learning network design and development. PyCharm provides users with project templates, design tools, testing and debugging tools, and an interface to directly call remote servers.
The present embodiment discloses a standard face generation method based on a generative adversarial mechanism and an attention mechanism. The main process comprises two stages, model training and model application.
In the model training stage: first, the existing face data sets are processed, and a data set that meets the model training requirements is generated by designing a unified information code mechanism; then, a cloud server with high computing power is used to train the network model, optimizing the loss function and adjusting the network model parameters until the network model converges, so as to obtain the generator structure and weights for generating standard faces.
In the model application stage: first, the HOG face image processing method is used to extract the face from the actual picture to obtain the actual face image; then, the trained network model is called, using a face image with non-limiting factors and the designed unified information code as inputs, to perform standard face generation, finally obtaining a colour, front-facing face image.
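Putting the stage together, a sketch assuming a trained two-input `generator` and the illustrative `extract_faces` / `unified_code` helpers sketched earlier; the preprocessing details are assumptions:

```python
import numpy as np

# Application-stage sketch: detect faces, then generate standard faces
# under the designed unified information code U_0.
U0 = unified_code("front", "natural", "front")      # designed standard code
for face in extract_faces("surveillance_frame.jpg"):
    x = face[np.newaxis].astype("float32") / 255.0  # (1, 128, 128, 3)
    standard_face = generator([x, U0[np.newaxis]])  # colour, front-facing output
```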
Figure 1 is a flowchart of a standard face generation method based on a generative adversarial mechanism and an attention mechanism disclosed in this embodiment. Specific steps are as follows:
Step 1. Since current face databases mainly focus on recognition tasks, there is no face image database with the unified information code required by the present invention. Therefore, it is necessary to integrate existing databases to construct a suitable database.
Figure 2 shows a construction process of a face image and a unified information code in the database.
Step 2. Figure 3 is an illustrative diagram of the overall architecture of the network model. The entire model framework mainly comprises three sub-networks, which correspond to an image generator sub-network for generating standard faces, a model discriminator sub-network for discriminating the generated results, and an image restoration sub-network for restoring the generated results, wherein parameters are shared between the image generator sub-network and the image restoration sub-network, and the image generator sub-network mainly combines the attention mechanism to generate face images. Figure 4 shows the specific network structure of the image generator sub-network, and Figure 5 shows the specific network structure of the model discriminator sub-network.
The main parameters are as follows:
1) The image generator sub-network has the same parameters as the image restoration sub-network, and each comprises two generators, namely the colour information generator and the attention mask generator, specifically as follows:
The colour information generator comprises 8 convolution layers and 7 deconvolution layers. The convolution kernel size of all convolution layers is 5 and the step size is 1, finally generating a 3-channel colour information image;
The attention mask generator comprises 8 convolution layers and 7 deconvolution layers. The convolution kernel size of all convolution layers is 5 and the step size is 1, finally generating a 1-channel attention mask.
2) The model discriminator sub-network comprises two parts, namely an information code discrimination sub-network and an image discrimination sub-network, specifically as follows: the information code discrimination sub-network comprises 6 convolution layers and 1 fully connected layer, the convolution kernel size of each convolution layer is 5 and the step size is 1, finally generating a one-dimensional unified information code of length N; the image discrimination sub-network comprises 6 convolution layers, with a convolution kernel size of 5 and a step size of 1.
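A hedged Keras sketch of the attention mask generator just described (8 convolution layers, 7 deconvolution layers, kernel size 5, stride 1, 1-channel output); the 64-channel width and the sigmoid output activation are assumptions, and the concatenation of the input image with the broadcast 35-bit code is assumed to happen before this stack:

```python
import tensorflow as tf
from tensorflow.keras import layers

def attention_mask_generator():
    """Sketch of Gf: 8 conv + 7 deconv layers, kernel 5, stride 1, ending
    in a 1-channel mask. The colour information generator Gc is analogous
    with a 3-channel output."""
    model = tf.keras.Sequential(name="Gf")
    for _ in range(8):                       # 8 convolution layers
        model.add(layers.Conv2D(64, kernel_size=5, strides=1,
                                padding="same", activation="relu"))
    for _ in range(6):                       # first 6 deconvolution layers
        model.add(layers.Conv2DTranspose(64, kernel_size=5, strides=1,
                                         padding="same", activation="relu"))
    model.add(layers.Conv2DTranspose(1, kernel_size=5, strides=1,
                                     padding="same", activation="sigmoid"))
    return model                             # 7th deconvolution: 1-channel mask
```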
Step 3. The model training is carried out on a high-performance GPU. The specific training parameters are designed as follows: the Adam optimizer can be used, with its parameters set to 0.9/0.999; the learning rate is set to 0.0001; the number of training epochs is set to 100; the batch size for training depends on the training samples of the data.
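The stated configuration in code; the batch size of 16 is an assumption, since the patent only says it depends on the training samples:

```python
import tensorflow as tf

# Adam with beta_1 = 0.9 and beta_2 = 0.999, learning rate 0.0001, 100 epochs.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4,
                                     beta_1=0.9, beta_2=0.999)
EPOCHS = 100
BATCH_SIZE = 16   # assumed; choose according to the data set size
```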
Step 4. Model prediction: extracting the face in an actual image as the input of the model and, by controlling the unified information unit, finally obtaining a more standard front-facing face image output.
The above-mentioned embodiments are preferred embodiments of the present invention, but the embodiments of the present invention are not limited thereto. Any other changes, modifications, substitutions, combinations and simplifications made without departing from the spirit and principle of the present invention should be regarded as equivalent replacements, and are all included in the protection scope of the present invention.

Claims (8)

  1. A standard face generation method based on a generative adversarial mechanism and an attention mechanism, characterized in that the generation method comprises the following steps:
     S1. data construction: collecting face data, constructing a face code with multiple non-limiting factors for each face image, then classifying the face data; wherein the non-limiting factors comprise face expression factors, face posture factors and shooting illumination factors; an encoded face image forms an information unit U = {L_n, E_e, A_m}, comprising an 8-bit illumination code L_n, an 8-bit expression code E_e and a 19-bit posture code A_m;
     S2. establishing a network model based on the generative adversarial mechanism and the attention mechanism; the network model comprises three sub-networks, namely an image generator sub-network for generating a standard face, a model discriminator sub-network for discriminating the generated results, and an image restoration sub-network for restoring the generated results; first, using the image generator sub-network and the attention mechanism to generate a standard face from an input face image; then, using the model discriminator sub-network to discriminate the generated image; finally, constructing the image restoration sub-network, restoring the generated image, and comparing the restoration result with the input image to optimize the constraints of the network model;
     S3. model training: using the information unit U = {L_n, E_e, A_m} as an input to optimize the outputs of the image generator sub-network, the model discriminator sub-network and the image restoration sub-network against the labelled similarities, so as to achieve convergence of the network model based on the generative adversarial mechanism and the attention mechanism;
     S4. model prediction: extracting a face image from an actual image as an input of the network model, finally obtaining a standard front face image output by controlling the information unit U.
  2. The standard face generation method based on a generative adversarial mechanism and an attention mechanism according to claim 1, characterized in that the face expression factors are divided into eight situations, namely happy, angry, sad, contemptuous, disappointed, scared, surprised and natural; a face expression is encoded as E_e = (E_1, E_2, ..., E_8), where E_j represents the j-th expression, j = 1, 2, ..., 8, each component takes a value in [0,1], and E_e = (0,0,...,1) means a natural expression; the face illumination factors are divided into eight situations, namely front illumination, left illumination, right illumination, front-left illumination, front-right illumination, left-right illumination, no illumination, and full illumination; the illumination information of the face is encoded as L_n = (L_1, L_2, ..., L_8), where L_n represents the n-th illumination situation, each component takes a value in [0,1], and L_n = (0,0,...,1) represents full illumination image information; the face posture factors are divided into 19 situations, namely left 90°, left 80°, left 70°, left 60°, left 50°, left 40°, left 30°, left 20°, left 10°, front face, right 10°, right 20°, right 30°, right 40°, right 50°, right 60°, right 70°, right 80°, right 90°; the posture information of the face is encoded as A_m = (A_1, A_2, ..., A_19), where A_m represents the m-th face pose, m = 1, 2, ..., 19, each component takes a value in [0,1], and A_m = (0,0,...,1) represents front posture information.
  3. The standard face generation method based on a generative adversarial mechanism and an attention mechanism according to claim 2, characterized in that the process of classifying face data in step S1 is as follows: classifying the encoded face data into non-limiting factor face images and standard front natural clear face images, wherein face images with the unified code information U_0 = (L_n = (0,0,...,1), E_e = (0,0,...,1), A_m = (0,0,...,1)) are taken as the standard frontal natural clear face images and used as target images of the model, and the remaining face images are taken as the non-limiting factor face images and used as input images of the model.
  4. The standard face generation method based on a generative adversarial mechanism and an attention mechanism according to claim 1, characterized in that the inputs of the image generator sub-network are an image Y and a standard face unified information code U_0; the image generator sub-network comprises two codec networks Gc and Gf, wherein the codec network Gc focuses on face colour information and texture information, and the codec network Gf focuses on the areas of the face that need to be changed, generating a colour information mask C and an attention mask F in combination with the attention mechanism, and then generating the standard face through the following synthesis mechanism:
     C = Gc(Y, U_0), F = Gf(Y, U_0)
     I_o = (1 − F) ⊙ C + F ⊙ Y
     wherein ⊙ represents element-wise multiplication of matrices;
     in the model discriminator sub-network, the input is the image I_o generated by the image generator sub-network; the model discriminator sub-network comprises two deep convolution networks, an image discrimination sub-network D_I and an information code discrimination sub-network D_U, to respectively distinguish the difference between the generated standard face image I_o and the corresponding standard face image I in a database, and the difference between the unified information code corresponding to the generated standard face image I_o and the unified information code U_0 corresponding to the standard face image I in the database;
     the inputs of the image restoration sub-network are the original unified information code U_Y corresponding to the input image Y and the generated standard face image I_o, and the output of the image restoration sub-network is a restoration result Y'; by comparing the restoration result Y' with the input image Y of the overall network, loop optimization of the network result is realized.
  5. The standard face generation method based on a generative adversarial mechanism and an attention mechanism according to claim 4, characterized in that the process of step S2 is as follows: first, inputting the input image Y and the unified information code U_0 corresponding to the standard face image I into the image generator sub-network incorporating the attention mechanism to generate the standard face image I_o;
     then, sending the generated standard face image I_o and the corresponding standard face image I in the database to the deep convolution network D_I in the model discriminator sub-network for discrimination, and at the same time, sending the unified information code corresponding to the generated standard face image I_o and the unified information code U_0 corresponding to the standard face image I in the database to the deep convolution network D_U in the model discriminator sub-network for discrimination, so that the image generator sub-network and the model discriminator sub-network are optimized simultaneously;
     finally, inputting the generated standard face image I_o into the image restoration sub-network, restoring it based on the original unified information code U_Y corresponding to the original input image Y, comparing the restoration result Y' with the input image Y, and continuously optimizing the corresponding loss function to achieve the convergence of the network model based on the generative adversarial mechanism and the attention mechanism.
  6. The standard face generation method based on a generative adversarial mechanism and an attention mechanism according to claim 1, characterized in that, in step S3, the model training achieves the convergence of the model by optimizing a loss function, wherein the design process of the loss function is as follows:
     optimizing the difference between the generated standard face image I_o and the corresponding standard face image I in the database by discrimination: an image loss function is set as L_I = (1/(H×W))·‖D_I(I_o) − D_I(I)‖², where H and W are the height and width of the output face image, and D_I(I_o) and D_I(I) are the evaluation results of the images I_o and I by the image discrimination sub-network; then, considering the effectiveness of a gradient loss, a gradient-based penalty is added to the image loss function, that is, the image loss function is designed as L_I = (1/(H×W))·‖D_I(I_o) − D_I(I)‖² + λ_I·(1/(H×W))·‖∇D_I(I_o) − 1‖², where ∇(·) represents a gradient operation and λ_I is the weight of the penalty;
     optimizing the difference of the conditional unified information code: a conditional expression loss function is set, that is, distinguishing the difference between the generated standard face image I_o and the corresponding standard face image I in the database, each corresponding to the unified information code U_0; the conditional expression loss function is first designed as L_U = (1/N)·‖D_U(I_o) − U_0‖², where N is the length of the output unified information code; then, a mapping relationship between the input image Y and its corresponding original unified information code U_Y is added to the conditional expression loss function; therefore, the conditional expression loss function is designed as L_U = (1/N)·(‖D_U(I_o) − U_0‖² + ‖D_U(Y) − U_Y‖²), where U_Y is the original unified information code corresponding to the input image Y, U_0 is the unified information code corresponding to the standard face image I, and D_U(I_o) and D_U(Y) are the discrimination results of the information code discrimination sub-network on the images I_o and Y respectively;
     optimizing the difference between the result of the image restoration sub-network and the original input image: the image I_o produced by the generator is restored with the original unified information code U_Y and then compared with the original input image Y; therefore, the restoration loss function is designed as L_r = (1/(h×w))·‖G(G(Y, U_0), U_Y) − Y‖₁, where h and w represent the height and width of the image and G represents the image generator sub-network;
     the loss function of the entire network model is: L = L_I + L_U + L_r.
  7. The standard face generation method based on a generative adversarial mechanism and an attention mechanism according to claim 1, characterized in that the process of step S4 is as follows: first, using a face positioning method based on the face HOG image to obtain a face image in the actual image; then, using the generator trained by the network model and a manually set unified information code to realize rapid standard face generation for the face in the actual image.
  8. The standard face generation method based on a generative adversarial mechanism and an attention mechanism according to claim 1, characterized in that, in step S1, the face data are collected from a RaFD face data set and an IAIR face data set.
AU2019430859A 2019-02-19 2019-10-18 Generative adversarial mechanism and attention mechanism-based standard face generation method Active AU2019430859B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201910121233.X 2019-02-19
CN201910121233.XA CN109934116B (en) 2019-02-19 2019-02-19 Standard face generation method based on confrontation generation mechanism and attention generation mechanism
PCT/CN2019/112045 WO2020168731A1 (en) 2019-02-19 2019-10-18 Generative adversarial mechanism and attention mechanism-based standard face generation method

Publications (2)

Publication Number Publication Date
AU2019430859A1 true AU2019430859A1 (en) 2021-04-29
AU2019430859B2 AU2019430859B2 (en) 2022-12-08

Family

ID=66985683

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2019430859A Active AU2019430859B2 (en) 2019-02-19 2019-10-18 Generative adversarial mechanism and attention mechanism-based standard face generation method

Country Status (3)

Country Link
CN (1) CN109934116B (en)
AU (1) AU2019430859B2 (en)
WO (1) WO2020168731A1 (en)

Families Citing this family (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109934116B (en) * 2019-02-19 2020-11-24 华南理工大学 Standard face generation method based on confrontation generation mechanism and attention generation mechanism
CN110633655A (en) * 2019-08-29 2019-12-31 河南中原大数据研究院有限公司 Attention-attack face recognition attack algorithm
CN110619315B (en) * 2019-09-24 2020-10-30 重庆紫光华山智安科技有限公司 Training method and device of face recognition model and electronic equipment
CN110796111B (en) * 2019-11-05 2020-11-10 腾讯科技(深圳)有限公司 Image processing method, device, equipment and storage medium
CN111144314B (en) * 2019-12-27 2020-09-18 北京中科研究院 Method for detecting tampered face video
CN111242078A (en) * 2020-01-20 2020-06-05 重庆邮电大学 Face-righting generation method based on self-attention mechanism
CN111325319B (en) * 2020-02-02 2023-11-28 腾讯云计算(北京)有限责任公司 Neural network model detection method, device, equipment and storage medium
CN111325809B (en) * 2020-02-07 2021-03-12 广东工业大学 Appearance image generation method based on double-impedance network
CN111275613A (en) * 2020-02-27 2020-06-12 辽宁工程技术大学 Editing method for generating confrontation network face attribute by introducing attention mechanism
CN111400531B (en) * 2020-03-13 2024-04-05 广州文远知行科技有限公司 Target labeling method, device, equipment and computer readable storage medium
CN112036281B (en) * 2020-07-29 2023-06-09 重庆工商大学 Facial expression recognition method based on improved capsule network
CN112199637B (en) * 2020-09-21 2024-04-12 浙江大学 Regression modeling method for generating contrast network data enhancement based on regression attention
CN112258402A (en) * 2020-09-30 2021-01-22 北京理工大学 Dense residual generation countermeasure network capable of rapidly removing rain
CN112200055B (en) * 2020-09-30 2024-04-30 深圳市信义科技有限公司 Pedestrian attribute identification method, system and device of combined countermeasure generation network
CN112508800A (en) * 2020-10-20 2021-03-16 杭州电子科技大学 Attention mechanism-based highlight removing method for surface of metal part with single gray image
CN112580011B (en) * 2020-12-25 2022-05-24 华南理工大学 Portrait encryption and decryption system facing biological feature privacy protection
CN112686817B (en) * 2020-12-25 2023-04-07 天津中科智能识别产业技术研究院有限公司 Image completion method based on uncertainty estimation
CN112802160B (en) * 2021-01-12 2023-10-17 西北大学 U-GAT-IT-based improved method for migrating cartoon style of Qin cavity character
CN112766160B (en) * 2021-01-20 2023-07-28 西安电子科技大学 Face replacement method based on multi-stage attribute encoder and attention mechanism
CN112800937B (en) * 2021-01-26 2023-09-05 华南理工大学 Intelligent face recognition method
CN112818850B (en) * 2021-02-01 2023-02-10 华南理工大学 Cross-posture face recognition method and system based on progressive neural network and attention mechanism
CN112950661B (en) * 2021-03-23 2023-07-25 大连民族大学 Attention-based generation method for generating network face cartoon
CN113688857A (en) * 2021-04-26 2021-11-23 贵州电网有限责任公司 Method for detecting foreign matters in power inspection image based on generation countermeasure network
CN113255738A (en) * 2021-05-06 2021-08-13 武汉象点科技有限公司 Abnormal image detection method based on self-attention generation countermeasure network
CN113255788B (en) * 2021-05-31 2023-04-07 西安电子科技大学 Method and system for generating confrontation network face correction based on two-stage mask guidance
CN113239867B (en) * 2021-05-31 2023-08-11 西安电子科技大学 Mask area self-adaptive enhancement-based illumination change face recognition method
CN113239870B (en) * 2021-05-31 2023-08-11 西安电子科技大学 Identity constraint-based face correction method and system for generating countermeasure network
CN113255530B (en) * 2021-05-31 2024-03-29 合肥工业大学 Attention-based multichannel data fusion network architecture and data processing method
CN113837953B (en) * 2021-06-11 2024-04-12 西安工业大学 Image restoration method based on generation countermeasure network
CN113361489B (en) * 2021-07-09 2022-09-16 重庆理工大学 Decoupling representation-based face orthogonalization model construction method and training method
CN113239914B (en) * 2021-07-13 2022-02-25 北京邮电大学 Classroom student expression recognition and classroom state evaluation method and device
CN113658040A (en) * 2021-07-14 2021-11-16 西安理工大学 Face super-resolution method based on prior information and attention fusion mechanism
CN113705400B (en) * 2021-08-18 2023-08-15 中山大学 Single-mode face living body detection method based on multi-mode face training
CN113743284A (en) * 2021-08-30 2021-12-03 杭州海康威视数字技术股份有限公司 Image recognition method, device, equipment, camera and access control equipment
CN114022930B (en) * 2021-10-28 2024-04-16 天津大学 Automatic generation method of portrait credentials
CN114399431A (en) * 2021-12-06 2022-04-26 北京理工大学 Dim light image enhancement method based on attention mechanism
CN114359034B (en) * 2021-12-24 2023-08-08 北京航空航天大学 Face picture generation method and system based on hand drawing
CN114331904B (en) * 2021-12-31 2023-08-08 电子科技大学 Face shielding recognition method
CN114663539B (en) * 2022-03-09 2023-03-14 东南大学 2D face restoration technology under mask based on audio drive
CN114943585B (en) * 2022-05-27 2023-05-05 天翼爱音乐文化科技有限公司 Service recommendation method and system based on generation of countermeasure network
CN115546848B (en) * 2022-10-26 2024-02-02 南京航空航天大学 Challenge generation network training method, cross-equipment palmprint recognition method and system
CN116486464B (en) * 2023-06-20 2023-09-01 齐鲁工业大学(山东省科学院) Attention mechanism-based face counterfeiting detection method for convolution countermeasure network

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5371083B2 (en) * 2008-09-16 2013-12-18 Kddi株式会社 Face identification feature value registration apparatus, face identification feature value registration method, face identification feature value registration program, and recording medium
JP5310506B2 (en) * 2009-03-26 2013-10-09 ヤマハ株式会社 Audio mixer
CN101777116B (en) * 2009-12-23 2012-07-25 中国科学院自动化研究所 Method for analyzing facial expressions on basis of motion tracking
CN102496174A (en) * 2011-12-08 2012-06-13 中国科学院苏州纳米技术与纳米仿生研究所 Method for generating face sketch index for security monitoring
CN102938065B (en) * 2012-11-28 2017-10-20 北京旷视科技有限公司 Face feature extraction method and face identification method based on large-scale image data
CN103186774B (en) * 2013-03-21 2016-03-09 北京工业大学 A kind of multi-pose Face expression recognition method based on semi-supervised learning
CN104361328B (en) * 2014-11-21 2018-11-02 重庆中科云丛科技有限公司 A kind of facial image normalization method based on adaptive multiple row depth model
US10275113B2 (en) * 2014-12-19 2019-04-30 Hewlett-Packard Development Company, L.P. 3D visualization
GB201613138D0 (en) * 2016-07-29 2016-09-14 Unifai Holdings Ltd Computer vision systems
US10475174B2 (en) * 2017-04-06 2019-11-12 General Electric Company Visual anomaly detection system
CN107292813B (en) * 2017-05-17 2019-10-22 浙江大学 A kind of multi-pose Face generation method based on generation confrontation network
CN107506770A (en) * 2017-08-17 2017-12-22 湖州师范学院 Diabetic retinopathy eye-ground photography standard picture generation method
CN107909061B (en) * 2017-12-07 2021-03-30 电子科技大学 Head posture tracking device and method based on incomplete features
CN108510061B (en) * 2018-03-19 2022-03-29 华南理工大学 Method for synthesizing face by multiple monitoring videos based on condition generation countermeasure network
CN108564119B (en) * 2018-04-04 2020-06-05 华中科技大学 Pedestrian image generation method in any posture
CN108520503B (en) * 2018-04-13 2020-12-22 湘潭大学 Face defect image restoration method based on self-encoder and generation countermeasure network
CN109934116B (en) * 2019-02-19 2020-11-24 华南理工大学 Standard face generation method based on confrontation generation mechanism and attention generation mechanism

Also Published As

Publication number Publication date
CN109934116A (en) 2019-06-25
AU2019430859B2 (en) 2022-12-08
WO2020168731A1 (en) 2020-08-27
CN109934116B (en) 2020-11-24

Similar Documents

Publication Publication Date Title
AU2019430859A1 (en) Generative adversarial mechanism and attention mechanism-based standard face generation method
Wan et al. Region-aware reflection removal with unified content and gradient priors
WO2021129466A1 (en) Watermark detection method, device, terminal and storage medium
Yin et al. Yes," Attention Is All You Need", for Exemplar based Colorization
CN114550223B (en) Person interaction detection method and device and electronic equipment
Lee et al. Visual question answering over scene graph
Wu et al. Visual transformers: Where do transformers really belong in vision models?
JP2014164656A (en) Image processing method and program
CN114694185B (en) Cross-modal target re-identification method, device, equipment and medium
Yuan et al. Contextualized spatio-temporal contrastive learning with self-supervision
CN116049397A (en) Sensitive information discovery and automatic classification method based on multi-mode fusion
CN113657272B (en) Micro video classification method and system based on missing data completion
Liu et al. Loctex: Learning data-efficient visual representations from localized textual supervision
JP2022505320A (en) Search method, search device, storage medium
Zheng et al. La-net: Layout-aware dense network for monocular depth estimation
Mengiste et al. Transfer-Learning and Texture Features for Recognition of the Conditions of Construction Materials with Small Data Sets
Tan et al. Blind face restoration for under-display camera via dictionary guided transformer
CN110942463B (en) Video target segmentation method based on generation countermeasure network
Chen et al. Multi-stage degradation homogenization for super-resolution of face images with extreme degradations
Liu et al. BGRDNet: RGB-D salient object detection with a bidirectional gated recurrent decoding network
CN111638926A (en) Method for realizing artificial intelligence in Django framework
CN111144492B (en) Scene map generation method for mobile terminal virtual reality and augmented reality
Qi et al. An efficient deep learning hashing neural network for mobile visual search
Orhei Urban landmark detection using computer vision
Fu et al. Deep fusion feature presentations for nonaligned person re-identification

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)