CN109934116B - Standard face generation method based on generative adversarial mechanism and attention mechanism - Google Patents

Standard face generation method based on generative adversarial mechanism and attention mechanism

Info

Publication number
CN109934116B
CN109934116B (application CN201910121233.XA)
Authority
CN
China
Prior art keywords: image, face, network, standard, model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910121233.XA
Other languages
Chinese (zh)
Other versions
CN109934116A (en)
Inventor
谢巍
余孝源
潘春文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201910121233.XA priority Critical patent/CN109934116B/en
Publication of CN109934116A publication Critical patent/CN109934116A/en
Priority to PCT/CN2019/112045 priority patent/WO2020168731A1/en
Priority to AU2019430859A priority patent/AU2019430859B2/en
Application granted granted Critical
Publication of CN109934116B publication Critical patent/CN109934116B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention discloses a standard face generation method based on a generative adversarial mechanism and an attention mechanism, comprising the following steps: a data set design step, in which a face code covering various non-limiting factors is constructed for each face image according to the annotation data of a database, and the code and the face image are taken together as the input of the model; a model design and training step, in which a corresponding network structure is designed using the generative adversarial mechanism and the attention mechanism, and the constructed data pairs are used for model training to obtain the weights of the network model; and a model prediction step, in which an acquired face image is predicted by the model. The invention applies deep learning network technology to standard face generation, producing color, frontal, normally illuminated standard face images; with this method, accurate standard face photographs can be obtained, the difficulty of matching against a single-sample database is reduced, and a solid foundation is laid for subsequent face feature extraction and single-sample face recognition.

Description

Standard face generation method based on generative adversarial mechanism and attention mechanism
Technical Field
The invention relates to the technical field of deep learning applications, in particular to a standard face generation method based on a generative adversarial mechanism and an attention mechanism.
Background
In recent years, video surveillance has become widespread in large and medium-sized cities in China, is widely used in the construction of social security prevention and control systems, and has become a powerful technical means for public security organs to investigate and solve cases. In group incidents, and especially in major cases such as robbery and snatching, evidence and clues obtained from surveillance video play a key role in rapidly solving cases. At present, domestic public security bureaus mainly use surveillance video to search for crime clues and evidence after the fact, and the identity of a suspect is confirmed by comparing the suspect's facial information with the personnel records in the public security database. However, the facial information of a suspect in surveillance video is subject to many limiting factors, such as interference from facial expression, pose, or shooting illumination. Because most personal records in the public security database contain only a single identification-photo sample, the success rate when recognizing face images disturbed by these restrictive factors is greatly reduced, often leading to missed or erroneous detections.
In recent years, artificial intelligence has been made a field of national priority. Combining artificial intelligence with related industries is a necessary trend in China's development towards intelligent systems, and is of great significance for promoting industrial intelligence and automation. The central task in the artificial intelligence field is to design appropriate deep learning network models for different industry tasks. With the improvement of computing power, the difficulty of network training has been greatly reduced, and network prediction accuracy keeps improving. Deep learning networks offer strong model-fitting capability, large information capacity and high accuracy, and can meet different requirements across industries. For face recognition under various non-limiting factors, the key problem is how to generate a standard frontal face image to support subsequent face feature extraction and recognition. A reasonable deep learning network framework is therefore urgently needed: trained with high-performance computing, it can generate standard frontal face images, improve the accuracy of face matching, and reduce false detections during face recognition.
Disclosure of Invention
The invention aims to overcome the defects in the prior art, and provides a standard face generation method based on a generative adversarial mechanism and an attention mechanism.
The purpose of the invention can be achieved by adopting the following technical scheme:
A standard face generation method based on a generative adversarial mechanism and an attention mechanism comprises the following steps: data set design, model design and training, and model prediction. In the data set design step, using the mainstream RaFD data set and the IAIR face data set, a face code with various non-limiting factors is constructed for each face image according to the annotation data of the database, the factors including facial expression, face pose and shooting illumination; the code and the face image are taken together as the input of the model. In the model design and training step, a corresponding network structure is designed using the principles of the generative adversarial mechanism and the attention mechanism, and the constructed data pairs are used for model training to obtain the network model weights. In the model prediction step, a face image acquired in practice is processed by the model to obtain the predicted result.
Specifically, the operation steps are as follows:
S1, data construction: face data in the RaFD face data set and the IAIR face data set are collected, a face code with various non-limiting factors is constructed for each face image, and the face data are then classified; the non-limiting factors comprise facial expression factors, face pose factors and shooting illumination factors, and the coded face image forms an information unit U = {Lu, Eu, Au}, comprising an 8-bit illumination code Lu, an 8-bit expression code Eu and a 19-bit pose code Au;
S2, establishing a network model based on the generative adversarial mechanism and the attention mechanism, wherein the network model comprises three sub-networks: an image generator sub-network for generating the standard face, a model discriminator sub-network for discriminating the generated result, and an image restoration sub-network for restoring from the generated result; first, standard face generation is performed on the input face image using the image generator sub-network together with the attention mechanism; then the generated image is discriminated using the model discriminator sub-network; finally, the image restoration sub-network is constructed to restore the generated image, the restoration result is compared with the input image, and the network model is optimized under this constraint;
S3, model training: using the image units generated in step S1, the images with various non-limiting factors are taken as input to optimize the similarity between the outputs and labels of the image generator sub-network, the model discriminator sub-network and the image restoration sub-network, achieving convergence of the network model based on the generative adversarial mechanism and the attention mechanism;
S4, model prediction: the face in the actual image is extracted and used as the input of the model, and the standard frontal face image is finally obtained as output by controlling the unified information unit.
Further, in step S1, the face information in the face data set is correspondingly encoded and divided into two classes, non-limiting face images and standard frontal natural face images.
The procedure of step S1 is as follows:
S11, face information encoding: a face code with various non-limiting factors is constructed for each face image according to the different face data in the data set, the non-limiting factors including, but not limited to, facial expression factors, face pose factors and shooting illumination factors.
The rule for coding the face image is as follows:
A) The facial expression factors are divided into eight cases: happy, angry, sad, contemptuous, disappointed, fearful, surprised and natural. The facial expression is coded as Eu = (Eu1, Eu2, ..., Eu8), where Eul represents the l-th expression, l = 1, 2, ..., 8, each component taking a value in [0,1]; Eu = (0, 0, ..., 1) denotes the natural expression;
B) The face illumination factors are divided into eight cases, comprising front illumination, left illumination and right illumination together with their combinations, no illumination and full illumination. The illumination information of the face is coded as Lu = (Lu1, Lu2, ..., Lu8), where Lun represents the n-th illumination case, n = 1, 2, ..., 8, each component taking a value in [0,1]; Lu = (0, 0, ..., 1) denotes front-illumination image information;
C) The face pose factors are divided into 19 cases, comprising 9 poses of the left face at 10° intervals, 9 poses of the right face at 10° intervals, and the frontal face, i.e. left 90°, left 80°, left 70°, left 60°, left 50°, left 40°, left 30°, left 20°, left 10°, frontal, right 10°, right 20°, right 30°, right 40°, right 50°, right 60°, right 70°, right 80° and right 90°. The pose information of the face is coded as Au = (Au1, Au2, ..., Aum, ..., Au19), where Aum represents the m-th face pose, m = 1, 2, ..., 19, each component taking a value in [0,1]; Au = (0, 0, ..., 1) denotes the frontal pose information. Finally, the face information codes are integrated into a unified information code U = {Lu, Eu, Au}, a 35-bit one-dimensional code.
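As an illustration of the coding rules above, the following minimal NumPy sketch builds the 35-bit unified information code; the helper name and index convention are illustrative assumptions, since the patent only fixes the bit widths and the "last position denotes the standard case" convention:

```python
import numpy as np

N_LIGHT, N_EXPR, N_POSE = 8, 8, 19  # 8 + 8 + 19 = 35 bits

def unified_code(light_idx, expr_idx, pose_idx):
    """Build the 35-bit unified information code U = {Lu, Eu, Au}.

    Each factor is one-hot; by the convention above, the last position
    of each sub-code denotes the standard case (front illumination,
    natural expression, frontal pose).
    """
    Lu = np.zeros(N_LIGHT); Lu[light_idx] = 1.0
    Eu = np.zeros(N_EXPR);  Eu[expr_idx] = 1.0
    Au = np.zeros(N_POSE);  Au[pose_idx] = 1.0
    return np.concatenate([Lu, Eu, Au])  # 35-bit one-dimensional code

# Target code U0 for a standard frontal, naturally expressed, front-lit face:
U0 = unified_code(N_LIGHT - 1, N_EXPR - 1, N_POSE - 1)
assert U0.shape == (35,)
```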
S12, face data classification: the encoded face data are classified into non-limiting face images and standard frontal natural clear face images, specifically as follows:
The face images whose unified coding information is U0 = (Lu = (0, 0, ..., 1), Eu = (0, 0, ..., 1), Au = (0, 0, ..., 1)) are taken as standard frontal natural clear face images and used as the target images of the model; the remaining face images are taken as non-limiting face images and used as the input images of the model.
Further, in step S2, suppose the input image is Y, with corresponding original unified information code Uy; the generated standard face image is Io, with corresponding unified information code denoted Ûo; and the corresponding standard face image in the database is I, with corresponding unified information code U0.
In the image generator sub-network, the inputs are the image Y and the unified information code U0. The invention designs two codec networks Gc and Gf which, combined with the attention mechanism, generate a color information mask C and an attention mask F respectively; a standard face is then generated by the following synthesis mechanism:
C=Gc(Y,U0),F=Gf(Y,U0)
Io=(1-F)⊙C+F⊙Y
where ⊙ denotes element-wise multiplication of matrices.
Thus, the codec network Gc is mainly concerned with the color and texture information of the face, while the codec network Gf is mainly concerned with the facial regions that need to be changed.
In the model discriminator sub-network, the input is the image Io generated by the image generator sub-network. Similarly, the invention designs two deep convolutional networks, an image discrimination sub-network DI and an information-coding discrimination sub-network DU, used respectively to discriminate the difference between the generated standard face image Io and the corresponding standard face image I in the database, and the difference between the unified information code Ûo of the generated image and the unified information code U0 of the standard face image I in the database.
In the image restoration sub-network, the inputs are the generated standard face image Io and the original unified information code Uy corresponding to the input image Y. The restoration sub-network is identical to the image generator sub-network, and its output is the restoration result Ŷ = G(Io, Uy). By comparing the restoration result with the input image Y of the whole network, cyclic optimization of the network result is achieved.
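The synthesis mechanism itself is only a few lines of tensor arithmetic. The following TensorFlow sketch illustrates it, with the two codec networks abstracted as callables Gc and Gf; their internals and the exact input packing are assumptions for illustration rather than the patent's definitive implementation:

```python
import tensorflow as tf

def synthesize(Gc, Gf, Y, U0):
    """Standard-face synthesis Io = (1 - F) * C + F * Y, element-wise.

    Gc maps (Y, U0) to the 3-channel color information mask C;
    Gf maps (Y, U0) to the 1-channel attention mask F in [0, 1].
    Where F is near 1 the input pixel is kept; where F is near 0
    the generated color is used instead.
    """
    C = Gc([Y, U0])              # color mask, shape (batch, H, W, 3)
    F = Gf([Y, U0])              # attention mask, shape (batch, H, W, 1)
    Io = (1.0 - F) * C + F * Y   # broadcast over the channel axis
    return Io
```

Because the restoration sub-network is identical to the generator, the same routine also produces the restoration result Ŷ = G(Io, Uy).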
Further, the processing flow of the network model based on the generative adversarial mechanism and the attention mechanism is as follows:
First, the input image Y and the unified information code U0 corresponding to the standard face image I are input into the image generator sub-network, which integrates the attention mechanism, to generate the standard face image Io.
Then, in order to distinguish the real image from the generated image, the generated standard face image Io and the corresponding standard face image I in the database (i.e. the real image) are sent to the image discrimination sub-network DI of the model discriminator sub-network for discrimination; at the same time, the unified information code Ûo of the generated image Io and the code U0 of the standard face image I in the database are judged by the information-coding discrimination sub-network DU of the model discriminator sub-network. Through continuous cyclic optimization, the image generator sub-network and the model discriminator sub-network improve together.
Finally, in order to cyclically optimize the network model, the invention designs an image restoration sub-network: the generated standard face image Io is restored according to the original unified information code Uy of the original input image Y, and the restoration result is compared with the input image Y. The whole network achieves convergence by continuously optimizing the corresponding loss functions, and the non-limiting environmental factors are finally removed from the face image.
Further, the model training in step S3 realizes the convergence of the model by optimizing the loss function, wherein the loss function design process specifically includes:
1) Optimize the discrimination of the difference between the generated standard face image Io and the corresponding standard face image I in the database: the image loss function is set as
LI = (1/(H×W)) Σh,w [DI(Io) − DI(I)]
where H and W are respectively the height and width of the output face image, and DI(Io) and DI(I) are the discrimination results of the image discrimination sub-network for Io and I. Then, considering the effectiveness of the gradient loss, a gradient-based penalty term that improves convergence efficiency and image generation quality is added to the image loss function, i.e. the image loss function is designed as
LI = (1/(H×W)) Σh,w [DI(Io) − DI(I)] + λI ‖∇Io − ∇I‖1
where ∇ represents the gradient operation on the image and λI is the penalty-term weight;
2) Optimize the difference of the conditional unified information codes: a conditional expression loss function is set to discriminate the difference between the unified information code Ûo of the generated standard face image Io and the unified information code U0 of the standard face image I in the database, designed as
LU = (1/N) ‖DU(Io) − U0‖²
where N is the length of the output unified information code. Then the mapping between the input image Y and its corresponding original unified information code Uy is added to the conditional expression loss function, so that the discrimination ability of the discriminator is improved, and the conditional expression loss function is designed as
LU = (1/N) (‖DU(Io) − U0‖² + ‖DU(Y) − Uy‖²)
where Uy is the original unified information code corresponding to the input image Y, U0 is the unified information code corresponding to the standard face image I, and DU(Io) and DU(Y) are the discrimination results of the information-coding discrimination sub-network for Io and Y;
3) Optimize the difference between the result of the image restoration sub-network and the original input image: the image Io generated by the generator and the original unified information code Uy are input for restoration and compared with the original input image Y. The restoration loss function is therefore designed as
Lr = (1/(h×w)) Σ ‖G(Io, Uy) − Y‖1
where h and w represent the height and width of the image and G denotes the image generator sub-network.
Thus, the training loss function for the entire network is as follows:
L=LI+LU+Lr
by optimizing the loss function, the convergence of the network model is realized, and the generator structure and the weight for generating the standard human face are obtained.
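For concreteness, the three losses might be assembled as in the following TensorFlow sketch; since the printed formulas are reconstructed from the surrounding description, the exact norms and the form of the gradient penalty should be read as assumptions rather than the patent's definitive formulation:

```python
import tensorflow as tf

def image_loss(D_I, Io, I, lambda_I=10.0):
    # Adversarial term over the H x W discriminator map, plus an
    # image-gradient penalty term for convergence and sharpness.
    adv = tf.reduce_mean(D_I(Io) - D_I(I))
    dIo_dy, dIo_dx = tf.image.image_gradients(Io)
    dI_dy, dI_dx = tf.image.image_gradients(I)
    grad_pen = tf.reduce_mean(tf.abs(dIo_dy - dI_dy) + tf.abs(dIo_dx - dI_dx))
    return adv + lambda_I * grad_pen

def code_loss(D_U, Io, Y, U0, Uy):
    # Conditional unified-information-code loss: squared error on the
    # generated image's code and on the input image's code, averaged
    # over the code length N.
    return (tf.reduce_mean(tf.square(D_U(Io) - U0)) +
            tf.reduce_mean(tf.square(D_U(Y) - Uy)))

def restore_loss(G, Io, Uy, Y):
    # Cycle-restoration loss: mean absolute error between the
    # restoration G(Io, Uy) and the original input Y.
    return tf.reduce_mean(tf.abs(G([Io, Uy]) - Y))

# Total training loss: L = LI + LU + Lr
```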
Further, for the generation from an actual face image in step S4: first, a face localization method based on the HOG features of the face is used to obtain the face image from the actual image; then, the generator obtained from model training and an artificially set unified information code are used to rapidly generate a standard face for the face in the actual image. Moreover, other configurations of the face can be changed by setting different unified information codes, such as controlling other expressions or further changing the face pose.
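For the face-extraction step, a HOG-based detector such as the one shipped with dlib can serve; the sketch below crops the detected face and feeds it, with the target code U0, to the trained generator. The generator handle, input size and preprocessing are illustrative assumptions, not fixed by the patent:

```python
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()  # HOG + linear-SVM face detector

def generate_standard_face(generator, image_rgb, U0, size=(128, 128)):
    """Crop the first detected face and map it to a standard face.

    `generator` is the trained image-generator sub-network and `U0` is
    the 35-bit unified information code of the standard face, both
    produced by the training stage described above.
    """
    rects = detector(image_rgb, 1)  # upsample once to find smaller faces
    if not rects:
        return None
    r = rects[0]
    face = image_rgb[max(r.top(), 0):r.bottom(), max(r.left(), 0):r.right()]
    face = cv2.resize(face, size)                 # match the generator input size
    face = face.astype(np.float32) / 127.5 - 1.0  # scale pixels to [-1, 1]
    return generator.predict([face[None], U0[None]])
```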
Compared with the prior art, the invention has the following advantages and effects:
the deep learning network technology is applied to a standard face generation task and is used for generating colorful, forward and normal-illumination standard face images; by using the deep learning network method, accurate standard face photos can be obtained, the difficulty in matching with data in a single sample database is reduced, and a solid foundation is laid for the subsequent feature extraction of the face and the single sample face recognition.
Drawings
FIG. 1 is a flow chart of model training and model application according to an embodiment of the present invention;
FIG. 2 is a flow chart of data construction of a database according to an embodiment of the present invention;
FIG. 3 is a diagram of the overall design of a network model in an embodiment of the invention;
FIG. 4 is a detailed block diagram of an image generation network in an embodiment of the present invention;
fig. 5 is a specific structural diagram of an image discrimination network in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
The embodiment discloses a standard face generation method based on a generative adversarial mechanism and an attention mechanism, which mainly involves the following technologies: 1) training data design: a unified information code is designed using existing data sets; 2) network model structure design: a generative adversarial network framework with a cyclic optimization method serves as the basic network structure; 3) standard face generation: an attention mechanism is added to the generator to constrain the accuracy of standard face generation.
This example is based on the TensorFlow framework and the PyCharm development environment. TensorFlow is a Python-based development framework that allows reasonable deep learning networks to be built conveniently and quickly, with good cross-platform interoperability. TensorFlow provides interfaces to many wrapper functions of the deep learning architecture and to various image processing functions, including the related OpenCV image processing functions. The TensorFlow framework can also use the GPU to train and validate models, improving computational efficiency.
The development environment (IDE) is PyCharm under the Windows or Linux platform, one of the first choices for deep learning network design and development. PyCharm provides project templates, design tools, and testing and debugging tools, and also offers an interface for directly calling a remote server.
The embodiment's standard face generation method based on the generative adversarial mechanism and the attention mechanism proceeds in a model training phase and a model application phase.
In the model training phase: first, the existing face data sets are processed, and a data set suitable for model training is generated by designing the unified information coding mechanism; then the network model is trained on a cloud server with high computing power, and the generator structure and weights for standard face generation are obtained by optimizing the loss function and adjusting the network model parameters until the network model converges.
In the model application stage: first, the actual picture is processed with the HOG (histogram of oriented gradients) face image processing method to obtain the actual face image; then the trained network model is called, taking the face image with non-limiting factors and the designed unified information code as input, to generate the standard face; finally, a color, frontal face image is obtained.
Fig. 1 is a flowchart of the standard face generation method based on the generative adversarial mechanism and the attention mechanism disclosed in this embodiment. The specific steps are as follows:
step one, because the current face database mainly takes recognition tasks and does not meet the face image database with unified information coding required by the invention, the existing database needs to be integrated to construct a proper database.
Fig. 2 is a process of constructing a face image and unified information code in a database.
Step two: Fig. 3 is the overall architecture diagram of the network model. The model framework mainly comprises three sub-networks, namely an image generator sub-network for generating the standard face, a model discriminator sub-network for discriminating the generated result, and an image restoration sub-network for restoring the generated result. The image generator sub-network and the image restoration sub-network share parameters, and the image generator sub-network combines an attention mechanism to generate the face image. Fig. 4 shows the specific network structure of the image generator sub-network, and fig. 5 shows the specific network structure of the model discriminator sub-network.
The main parameters are as follows:
1) The image generator sub-network and the image restoration sub-network have the same parameters, each comprising two generators, a color information generator and an attention mask generator, with the following parameters:
the color information generator comprises 8 convolutional layers and 7 deconvolution layers; every convolutional layer has kernel size 5 and stride 1, and a 3-channel color information image is finally generated;
the attention mask generator comprises 8 convolutional layers and 7 deconvolution layers; every convolutional layer has kernel size 5 and stride 1, and a 1-channel attention mask is finally generated.
2) The model discriminator sub-network comprises two parts, an information-coding discrimination sub-network and an image discrimination sub-network, specifically: the information-coding discrimination sub-network comprises 6 convolutional layers and 1 fully connected layer; every convolutional layer has kernel size 5 and stride 1, and a one-dimensional unified information code of length N is finally generated; the image discrimination sub-network comprises 6 convolutional layers with kernel size 5 and stride 1.
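As an illustration of these parameters, a Keras sketch of the color information generator follows; the channel widths, input resolution and the way the 35-bit code is tiled onto the image are assumptions, since the embodiment only fixes the layer counts, kernel size and stride:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_color_generator(h=128, w=128, code_len=35):
    img = layers.Input((h, w, 3))
    code = layers.Input((code_len,))
    # Tile the 35-bit unified information code across the spatial grid
    # and concatenate it with the image channels.
    c = layers.Reshape((1, 1, code_len))(code)
    c = layers.UpSampling2D(size=(h, w))(c)
    x = layers.Concatenate()([img, c])
    # 8 convolutional layers, kernel size 5, stride 1 (widths illustrative).
    for ch in (32, 64, 64, 128, 128, 128, 128, 128):
        x = layers.Conv2D(ch, 5, strides=1, padding="same", activation="relu")(x)
    # 7 deconvolution layers, kernel size 5, stride 1; the last one
    # produces the 3-channel color information image.
    for ch in (128, 128, 64, 64, 32, 16, 3):
        x = layers.Conv2DTranspose(ch, 5, strides=1, padding="same")(x)
    color = layers.Activation("tanh")(x)
    return tf.keras.Model([img, code], color, name="color_info_generator")
```

The attention mask generator would differ only in its final deconvolution layer, which outputs 1 channel, e.g. with a sigmoid activation so the mask lies in [0, 1] (an assumption; the embodiment does not state the activation).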
Step three: train the model on a high-performance GPU, with the specific training parameters designed as follows: an Adam optimizer with parameters set to 0.9/0.999 can be used; the learning rate is set to 0.0001; the number of training epochs is set to 100; the training batch size depends on the available training samples.
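In TensorFlow these settings correspond roughly to the following sketch; only the numeric values come from this embodiment, and the interpretation of 0.9/0.999 as Adam's beta parameters is the standard one and is assumed here:

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(
    learning_rate=1e-4,  # learning rate set to 0.0001
    beta_1=0.9,          # Adam parameters 0.9 / 0.999
    beta_2=0.999,
)
EPOCHS = 100  # training epochs
# The batch size is chosen according to the available training samples.
```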
Step four: model prediction. The face in the actual image is extracted and used as the input of the model, and a comparatively standard frontal face image is finally obtained as output by controlling the unified information unit.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (7)

1. A standard face generation method based on a generative adversarial mechanism and an attention mechanism, characterized by comprising the following steps:
S1, data construction: collecting face data, constructing a face code with various non-limiting factors for each face image, and then classifying the face data, wherein the non-limiting factors comprise facial expression factors, face pose factors and shooting illumination factors, and the coded face image forms an information unit U = {Lu, Eu, Au}, comprising an 8-bit illumination code Lu, an 8-bit expression code Eu and a 19-bit pose code Au;
S2, establishing a network model based on the generative adversarial mechanism and the attention mechanism, wherein the network model comprises three sub-networks: an image generator sub-network for generating the standard face, a model discriminator sub-network for discriminating the generated result, and an image restoration sub-network for restoring from the generated result; first, standard face generation is performed on the input face image using the image generator sub-network together with the attention mechanism; then the generated image is discriminated using the model discriminator sub-network; finally, the image restoration sub-network is constructed to restore the generated image, the restoration result is compared with the input image, and the network model is optimized under this constraint;
S3, model training: taking the images with various non-limiting factors of the information units U = {Lu, Eu, Au} as input, optimizing the similarity between the outputs and labels of the image generator sub-network, the model discriminator sub-network and the image restoration sub-network, and achieving convergence of the network model based on the generative adversarial mechanism and the attention mechanism;
S4, model prediction: extracting the face image from the actual image as the input of the network model, and finally obtaining the standard frontal face image as output by controlling the information unit U;
wherein the facial expression factors are divided into eight cases, namely happy, angry, sad, contemptuous, disappointed, fearful, surprised and natural, and the facial expression is coded as Eu = (Eu1, Eu2, ..., Eu8), where Eul represents the l-th expression, l = 1, 2, ..., 8, each component taking a value in [0,1], and Eu = (0, 0, ..., 1) denotes the natural expression;
the face illumination factors are divided into eight cases, namely front illumination, left illumination and right illumination together with their combinations, no illumination and full illumination, and the illumination information of the face is coded as Lu = (Lu1, Lu2, ..., Lu8), where Lun represents the n-th illumination case, n = 1, 2, ..., 8, each component taking a value in [0,1], and Lu = (0, 0, ..., 1) denotes full-illumination image information;
the face pose factors are divided into 19 cases, namely left 90°, left 80°, left 70°, left 60°, left 50°, left 40°, left 30°, left 20°, left 10°, frontal, right 10°, right 20°, right 30°, right 40°, right 50°, right 60°, right 70°, right 80° and right 90°, and the pose information of the face is coded as Au = (Au1, Au2, ..., Aum, ..., Au19), where Aum represents the m-th face pose, m = 1, 2, ..., 19, each component taking a value in [0,1], and Au = (0, 0, ..., 1) denotes the frontal pose information.
2. The standard face generation method based on a generative adversarial mechanism and an attention mechanism according to claim 1, wherein the classification of the face data in step S1 is as follows: the encoded face data are classified into non-limiting-factor face images and standard frontal natural clear face images, wherein the face images whose unified coding information is U0 = (Lu = (0, 0, ..., 1), Eu = (0, 0, ..., 1), Au = (0, 0, ..., 1)) are taken as the standard frontal natural clear face images and used as the target images of the model, and the remaining face images are taken as the non-limiting face images and used as the input images of the model.
3. The standard face generation method based on a generative adversarial mechanism and an attention mechanism according to claim 1, wherein
the inputs of the image generator sub-network are the image Y and the unified information code U0 of the standard face; the image generator sub-network comprises two codec networks Gc and Gf, wherein the codec network Gc focuses on the color and texture information of the face and the codec network Gf focuses on the facial regions that need to be changed; combined with the attention mechanism, they generate a color information mask C and an attention mask F respectively, and the standard face is then generated by the following synthesis mechanism:
C=Gc(Y,U0), F=Gf(Y,U0)
Io=(1-F)⊙C+F⊙Y
where ⊙ denotes element-wise multiplication of matrices;
the input of the model discriminator sub-network is the image Io generated by the image generator sub-network; the model discriminator sub-network comprises two deep convolutional networks, an image discrimination sub-network DI and an information-coding discrimination sub-network DU, used respectively to discriminate the difference between the generated standard face image Io and the corresponding standard face image I in the database, and the difference between the unified information code Ûo of the generated image and the unified information code U0 of the standard face image I in the database;
the inputs of the image restoration sub-network are the generated standard face image Io and the original unified information code Uy corresponding to the input image Y, and the output is the network restoration result Ŷ; by comparing the restoration result Ŷ with the input image Y of the whole network, cyclic optimization of the network result is achieved.
4. The standard face generation method based on a generative adversarial mechanism and an attention mechanism according to claim 3, wherein the procedure of step S2 is as follows:
first, the input image Y and the unified information code U0 corresponding to the standard face image I are input into the image generator sub-network fused with the attention mechanism to generate the standard face image Io;
then, the generated standard face image Io and the corresponding standard face image I in the database are sent to the deep convolutional network DI of the model discriminator sub-network for discrimination; at the same time, the unified information code Ûo of the generated image Io and the code U0 of the standard face image I in the database are discriminated by the deep convolutional network DU of the model discriminator sub-network, so that the image generator sub-network and the model discriminator sub-network are optimized simultaneously;
finally, the generated standard face image Io is input into the image restoration sub-network and restored according to the original unified information code Uy of the original input image Y; the restoration result Ŷ is compared with the input image Y, and convergence of the network model based on the generative adversarial mechanism and the attention mechanism is achieved by continuously optimizing the corresponding loss functions.
5. The standard face generation method based on a generative adversarial mechanism and an attention mechanism according to claim 1, wherein the model training in step S3 achieves convergence of the model by optimizing the loss function, the loss function being designed as follows:
optimize the discrimination of the difference between the generated standard face image Io and the corresponding standard face image I in the database: the image loss function is set as
LI = (1/(H×W)) Σh,w [DI(Io) − DI(I)]
where H and W are respectively the height and width of the output face image, and DI(Io) and DI(I) are the discrimination results of the image discrimination sub-network for Io and I; then, considering the effectiveness of the gradient loss, a gradient-based penalty term is added to the image loss function, i.e. the image loss function is designed as
LI = (1/(H×W)) Σh,w [DI(Io) − DI(I)] + λI ‖∇Io − ∇I‖1
where ∇ represents the gradient operation on the image and λI is the penalty-term weight;
optimize the difference of the conditional unified information codes: a conditional expression loss function is set to discriminate the difference between the unified information code Ûo of the generated standard face image Io and the unified information code U0 of the standard face image I in the database, designed as
LU = (1/N) ‖DU(Io) − U0‖²
where N is the length of the output unified information code; then the mapping between the input image Y and its corresponding original unified information code Uy is added to the conditional expression loss function, so the conditional expression loss function is designed as
LU = (1/N) (‖DU(Io) − U0‖² + ‖DU(Y) − Uy‖²)
where Uy is the original unified information code of the input image Y, U0 is the unified information code of the standard face image I, and DU(Io) and DU(Y) are the discrimination results of the information-coding discrimination sub-network for Io and Y;
optimize the difference between the result of the image restoration sub-network and the original input image: the image Io generated by the generator and the original unified information code Uy are input for restoration and compared with the original input image Y, so the restoration loss function is designed as
Lr = (1/(h×w)) Σ ‖G(Io, Uy) − Y‖1
where h and w represent the height and width of the image and G denotes the image generator sub-network;
the loss function of the entire network model is:
L=LI+LU+Lr
6. the method for generating a standard human face based on a confrontation mechanism and an attention mechanism as claimed in claim 1, wherein the procedure of step S4 is as follows:
firstly, acquiring a face image in an actual image by using a face positioning method based on a face HOG image;
and then, a generator for network model training and artificially set unified information coding are utilized to realize the rapid standard face generation of the human face in the actual image.
7. The standard face generation method based on the confrontation mechanism and attention mechanism generation of claim 1, wherein in step S1, face data in the RaFD face data set and the IAIR face data set are collected.
CN201910121233.XA 2019-02-19 2019-02-19 Standard face generation method based on generative adversarial mechanism and attention mechanism Active CN109934116B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201910121233.XA CN109934116B (en) 2019-02-19 2019-02-19 Standard face generation method based on generative adversarial mechanism and attention mechanism
PCT/CN2019/112045 WO2020168731A1 (en) 2019-02-19 2019-10-18 Generative adversarial mechanism and attention mechanism-based standard face generation method
AU2019430859A AU2019430859B2 (en) 2019-02-19 2019-10-18 Generative adversarial mechanism and attention mechanism-based standard face generation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910121233.XA CN109934116B (en) 2019-02-19 2019-02-19 Standard face generation method based on generative adversarial mechanism and attention mechanism

Publications (2)

Publication Number Publication Date
CN109934116A CN109934116A (en) 2019-06-25
CN109934116B (en) 2020-11-24

Family

ID=66985683

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910121233.XA Active CN109934116B (en) Standard face generation method based on generative adversarial mechanism and attention mechanism

Country Status (3)

Country Link
CN (1) CN109934116B (en)
AU (1) AU2019430859B2 (en)
WO (1) WO2020168731A1 (en)

Families Citing this family (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109934116B (en) * 2019-02-19 2020-11-24 华南理工大学 Standard face generation method based on generative adversarial mechanism and attention mechanism
CN110633655A (en) * 2019-08-29 2019-12-31 河南中原大数据研究院有限公司 Attention-attack face recognition attack algorithm
CN110619315B (en) * 2019-09-24 2020-10-30 重庆紫光华山智安科技有限公司 Training method and device of face recognition model and electronic equipment
CN110796111B (en) * 2019-11-05 2020-11-10 腾讯科技(深圳)有限公司 Image processing method, device, equipment and storage medium
CN111144314B (en) * 2019-12-27 2020-09-18 北京中科研究院 Method for detecting tampered face video
CN111242078A (en) * 2020-01-20 2020-06-05 重庆邮电大学 Face-righting generation method based on self-attention mechanism
CN111325319B (en) * 2020-02-02 2023-11-28 腾讯云计算(北京)有限责任公司 Neural network model detection method, device, equipment and storage medium
CN111325809B (en) * 2020-02-07 2021-03-12 广东工业大学 Appearance image generation method based on double-impedance network
CN111275613A (en) * 2020-02-27 2020-06-12 辽宁工程技术大学 Editing method for generating confrontation network face attribute by introducing attention mechanism
CN111400531B (en) * 2020-03-13 2024-04-05 广州文远知行科技有限公司 Target labeling method, device, equipment and computer readable storage medium
CN112036281B (en) * 2020-07-29 2023-06-09 重庆工商大学 Facial expression recognition method based on improved capsule network
CN112199637B (en) * 2020-09-21 2024-04-12 浙江大学 Regression modeling method for generating contrast network data enhancement based on regression attention
CN112258402A (en) * 2020-09-30 2021-01-22 北京理工大学 Dense residual generation countermeasure network capable of rapidly removing rain
CN112508800A (en) * 2020-10-20 2021-03-16 杭州电子科技大学 Attention mechanism-based highlight removing method for surface of metal part with single gray image
CN112686817B (en) * 2020-12-25 2023-04-07 天津中科智能识别产业技术研究院有限公司 Image completion method based on uncertainty estimation
CN112580011B (en) * 2020-12-25 2022-05-24 华南理工大学 Portrait encryption and decryption system facing biological feature privacy protection
CN112802160B (en) * 2021-01-12 2023-10-17 西北大学 U-GAT-IT-based improved method for migrating cartoon style of Qin cavity character
CN112766160B (en) * 2021-01-20 2023-07-28 西安电子科技大学 Face replacement method based on multi-stage attribute encoder and attention mechanism
CN112800937B (en) * 2021-01-26 2023-09-05 华南理工大学 Intelligent face recognition method
CN112818850B (en) * 2021-02-01 2023-02-10 华南理工大学 Cross-posture face recognition method and system based on progressive neural network and attention mechanism
CN112950661B (en) * 2021-03-23 2023-07-25 大连民族大学 Attention-based generation method for generating network face cartoon
CN113688857A (en) * 2021-04-26 2021-11-23 贵州电网有限责任公司 Method for detecting foreign matters in power inspection image based on generation countermeasure network
CN113255738A (en) * 2021-05-06 2021-08-13 武汉象点科技有限公司 Abnormal image detection method based on self-attention generation countermeasure network
CN113255788B (en) * 2021-05-31 2023-04-07 西安电子科技大学 Method and system for generating confrontation network face correction based on two-stage mask guidance
CN113239870B (en) * 2021-05-31 2023-08-11 西安电子科技大学 Identity constraint-based face correction method and system for generating countermeasure network
CN113255530B (en) * 2021-05-31 2024-03-29 合肥工业大学 Attention-based multichannel data fusion network architecture and data processing method
CN113239867B (en) * 2021-05-31 2023-08-11 西安电子科技大学 Mask area self-adaptive enhancement-based illumination change face recognition method
CN113837953B (en) * 2021-06-11 2024-04-12 西安工业大学 Image restoration method based on generation countermeasure network
CN113361489B (en) * 2021-07-09 2022-09-16 重庆理工大学 Decoupling representation-based face orthogonalization model construction method and training method
CN113239914B (en) * 2021-07-13 2022-02-25 北京邮电大学 Classroom student expression recognition and classroom state evaluation method and device
CN113658040A (en) * 2021-07-14 2021-11-16 西安理工大学 Face super-resolution method based on prior information and attention fusion mechanism
CN113705400B (en) * 2021-08-18 2023-08-15 中山大学 Single-mode face living body detection method based on multi-mode face training
CN113743284A (en) * 2021-08-30 2021-12-03 杭州海康威视数字技术股份有限公司 Image recognition method, device, equipment, camera and access control equipment
CN114022930B (en) * 2021-10-28 2024-04-16 天津大学 Automatic generation method of portrait credentials
CN114399431A (en) * 2021-12-06 2022-04-26 北京理工大学 Dim light image enhancement method based on attention mechanism
CN114359034B (en) * 2021-12-24 2023-08-08 北京航空航天大学 Face picture generation method and system based on hand drawing
CN114331904B (en) * 2021-12-31 2023-08-08 电子科技大学 Face shielding recognition method
CN114663539B (en) * 2022-03-09 2023-03-14 东南大学 2D face restoration technology under mask based on audio drive
CN114943585B (en) * 2022-05-27 2023-05-05 天翼爱音乐文化科技有限公司 Service recommendation method and system based on generation of countermeasure network
CN115546848B (en) * 2022-10-26 2024-02-02 南京航空航天大学 Challenge generation network training method, cross-equipment palmprint recognition method and system
CN116486464B (en) * 2023-06-20 2023-09-01 齐鲁工业大学(山东省科学院) Attention mechanism-based face counterfeiting detection method for convolution countermeasure network
CN117808854A (en) * 2024-02-29 2024-04-02 腾讯科技(深圳)有限公司 Image generation method, model training method, device and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2234417B1 (en) * 2009-03-26 2014-06-04 Yamaha Corporation Audio mixer
CN104361328A (en) * 2014-11-21 2015-02-18 中国科学院重庆绿色智能技术研究院 Facial image normalization method based on self-adaptive multi-column depth model
CN107909061A (en) * 2017-12-07 2018-04-13 电子科技大学 A kind of head pose tracks of device and method based on incomplete feature
CN108564119A (en) * 2018-04-04 2018-09-21 华中科技大学 A kind of any attitude pedestrian Picture Generation Method
US20180293734A1 (en) * 2017-04-06 2018-10-11 General Electric Company Visual anomaly detection system

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5371083B2 (en) * 2008-09-16 2013-12-18 Kddi株式会社 Face identification feature value registration apparatus, face identification feature value registration method, face identification feature value registration program, and recording medium
CN101777116B (en) * 2009-12-23 2012-07-25 中国科学院自动化研究所 Method for analyzing facial expressions on basis of motion tracking
CN102496174A (en) * 2011-12-08 2012-06-13 中国科学院苏州纳米技术与纳米仿生研究所 Method for generating face sketch index for security monitoring
CN102938065B (en) * 2012-11-28 2017-10-20 北京旷视科技有限公司 Face feature extraction method and face identification method based on large-scale image data
CN103186774B (en) * 2013-03-21 2016-03-09 北京工业大学 A kind of multi-pose Face expression recognition method based on semi-supervised learning
WO2016099556A1 (en) * 2014-12-19 2016-06-23 Hewlett-Packard Development Company, Lp 3d visualization
GB201613138D0 (en) * 2016-07-29 2016-09-14 Unifai Holdings Ltd Computer vision systems
CN107292813B (en) * 2017-05-17 2019-10-22 浙江大学 A kind of multi-pose Face generation method based on generation confrontation network
CN107506770A (en) * 2017-08-17 2017-12-22 湖州师范学院 Diabetic retinopathy eye-ground photography standard picture generation method
CN108510061B (en) * 2018-03-19 2022-03-29 华南理工大学 Method for synthesizing face by multiple monitoring videos based on condition generation countermeasure network
CN108520503B (en) * 2018-04-13 2020-12-22 湘潭大学 Face defect image restoration method based on self-encoder and generation countermeasure network
CN109934116B (en) * 2019-02-19 2020-11-24 华南理工大学 Standard face generation method based on generative adversarial mechanism and attention mechanism

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2234417B1 (en) * 2009-03-26 2014-06-04 Yamaha Corporation Audio mixer
CN104361328A (en) * 2014-11-21 2015-02-18 中国科学院重庆绿色智能技术研究院 Facial image normalization method based on self-adaptive multi-column depth model
US20180293734A1 (en) * 2017-04-06 2018-10-11 General Electric Company Visual anomaly detection system
CN107909061A (en) * 2017-12-07 2018-04-13 电子科技大学 A kind of head pose tracks of device and method based on incomplete feature
CN108564119A (en) * 2018-04-04 2018-09-21 华中科技大学 A kind of any attitude pedestrian Picture Generation Method

Also Published As

Publication number Publication date
CN109934116A (en) 2019-06-25
AU2019430859B2 (en) 2022-12-08
WO2020168731A1 (en) 2020-08-27
AU2019430859A1 (en) 2021-04-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant