CN109684973A - Facial image completion system based on symmetry-consistent convolutional neural networks - Google Patents


Info

Publication number
CN109684973A
Authority
CN
China
Prior art keywords
convolutional layer
output
activation
facial image
deconvolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811549357.XA
Other languages
Chinese (zh)
Other versions
CN109684973B (en)
Inventor
左旺孟 (Wangmeng Zuo)
李晓明 (Xiaoming Li)
刘铭 (Ming Liu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology
Priority to CN201811549357.XA
Publication of CN109684973A
Application granted
Publication of CN109684973B
Active legal status
Anticipated expiration legal status


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A facial image completion system based on symmetry-consistent convolutional neural networks belongs to the technical field of image completion. It solves the problem that existing facial image completion systems based on convolutional neural networks produce poor results because they cannot guarantee the symmetric consistency of the completed face. In this system, an optical-flow network takes a partially occluded facial image and its horizontally flipped image as input, treats the resulting optical-flow vectors as absolute coordinates for deforming the flipped image, and deforms the flipped image by bilinear interpolation to obtain a warped flipped image. An illumination network takes the partially occluded facial image and its horizontally flipped image as input and corrects the illumination distribution of the warped flipped image with the illumination-correction coefficients it produces. A symmetric-missing-pixel filling subsystem takes the illumination-corrected warped flipped image and its corresponding residual occlusion mask as input and outputs the facial image with its missing pixels filled.

Description

Facial image completion system based on symmetry-consistent convolutional neural networks
Technical field
The present invention relates to a facial image completion system and belongs to the technical field of image completion.
Background technique
Facial image completion aims to recover an unoccluded facial image from a partially occluded one. It is mainly used to restore damaged facial images or to remove the occluders from partially occluded facial images. Producing a high-quality unoccluded facial image from a partially occluded one has long been both a research focus and a difficulty in the field of graphics and image processing.
In recent years, to obtain better completion results, researchers have applied convolutional neural networks to facial image completion, filling the missing pixels of a partially occluded facial image with encoders and decoders based on convolutional neural networks, trained under various loss constraints such as perceptual loss, segmentation loss and local discriminator loss. However, existing facial image completion systems based on convolutional neural networks do not consider the symmetry information inherent in facial images, i.e. the left-right symmetry of the face. For example, when some left-face pixels are missing, they could be filled from the right-face pixels that mirror them; likewise, left-right symmetry could be enforced as a constraint while completing a partially occluded facial image. It follows that the completion quality of existing convolutional-neural-network-based facial image completion systems can be further improved.
Summary of the invention
To solve the problem that existing facial image completion systems based on convolutional neural networks produce poor results because they cannot guarantee the symmetric consistency of the completed face, the present invention proposes a facial image completion system based on symmetry-consistent convolutional neural networks.
The facial image completion system based on symmetry-consistent convolutional neural networks according to the present invention fills the missing pixels of a partially occluded facial image to obtain an unoccluded facial image; the missing pixels of the partially occluded facial image comprise symmetric missing pixels and asymmetric missing pixels;
A symmetric missing pixel is a missing pixel whose mirror pixel about the facial midline is also missing;
An asymmetric missing pixel is a missing pixel whose mirror pixel about the facial midline is not missing;
The facial image completion system comprises an asymmetric-missing-pixel filling subsystem and a symmetric-missing-pixel filling subsystem, both implemented with convolutional neural networks;
The asymmetric-missing-pixel filling subsystem comprises an optical-flow network and an illumination network;
The optical-flow network takes the partially occluded facial image and its horizontally flipped image as its input, treats the resulting optical-flow vectors as absolute coordinates for deforming the flipped image, and deforms the flipped image by bilinear interpolation to obtain a warped flipped image;
The illumination network takes the partially occluded facial image and its horizontally flipped image as its input and corrects the illumination distribution of the warped flipped image with the illumination-correction coefficients it produces; the warped flipped image after illumination correction is the partially occluded facial image with its asymmetric missing pixels filled;
The symmetric-missing-pixel filling subsystem takes the partially occluded facial image with its asymmetric missing pixels filled, together with the residual occlusion mask corresponding to that image, as its input, and outputs the facial image with its missing pixels filled.
On the basis of existing convolutional-neural-network-based facial image completion systems, the facial image completion system based on symmetry-consistent convolutional neural networks according to the present invention introduces an asymmetric-missing-pixel filling subsystem composed of an optical-flow network and an illumination network. The optical-flow network takes the partially occluded facial image and its horizontally flipped image as its input, treats the resulting optical-flow vectors as absolute coordinates for deforming the flipped image, and deforms it by bilinear interpolation to obtain a warped flipped image. The face in the warped flipped image has the same pose and expression as the face in the partially occluded facial image. The illumination network likewise takes the partially occluded facial image and its horizontally flipped image as its input, and corrects the illumination distribution of the warped flipped image with the illumination-correction coefficients it produces. After the warped flipped image is multiplied by the illumination-correction coefficients, its illumination distribution matches that of the partially occluded facial image. The symmetric-missing-pixel filling subsystem of the present invention takes the illumination-corrected warped flipped image and its corresponding residual occlusion mask as its input and outputs the facial image with its missing pixels filled.
Through the optical-flow network, the facial image completion system according to the present invention establishes pixel correspondences between the left and right halves of the partially occluded facial image, and fills each asymmetric missing pixel with the non-missing pixel that mirrors it about the facial midline, achieving a preliminary completion of the partially occluded facial image. Through the illumination network, it establishes illumination-correction coefficients between the left and right halves of the face and uses them to correct the illumination distribution of the preliminarily completed facial image, so that its illumination distribution is consistent with that of the partially occluded facial image. This completion strategy, which considers the symmetry information inherent in facial images, provides structural information consistent with the subject's identity. Therefore, compared with existing facial image completion systems based on convolutional neural networks, the facial image completion system based on symmetry-consistent convolutional neural networks according to the present invention guarantees the symmetric consistency of the completed face and achieves better completion results.
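The compositing step described above can be illustrated with a rough NumPy sketch (function and argument names are hypothetical; the optical-flow and illumination networks are replaced by precomputed arrays for the warped flip and the correction coefficient):

```python
import numpy as np

def asymmetric_fill(img, mask, warped_flip, illum_coef):
    """Composite step of the asymmetric-missing-pixel filling subsystem (a sketch).

    img:         H x W x 3 partially occluded face, values in [0, 1]
    mask:        H x W, 1 where pixels are known, 0 where missing
    warped_flip: H x W x 3 horizontally flipped image, assumed already warped
                 by the optical-flow network so pose and expression match img
    illum_coef:  H x W x 1 illumination-correction coefficient R
    """
    corrected = illum_coef * warped_flip      # match illumination of img
    m = mask[..., None]
    # keep the known pixels, fill the missing ones from the corrected flip
    return m * img + (1.0 - m) * corrected
```

Symmetric missing pixels (missing on both sides of the midline) remain unfilled by this step, which is why the reconstruction subsystem follows.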
Brief description of the drawings
Hereinafter, the facial image completion system based on symmetry-consistent convolutional neural networks according to the present invention will be described in more detail on the basis of embodiments and with reference to the accompanying drawings, in which:
Fig. 1 is a flowchart of the filling process of the asymmetric-missing-pixel filling subsystem referred to in the embodiment;
Fig. 2 shows the network structure of the optical-flow network referred to in the embodiment;
Fig. 3 shows the network structure of the illumination network referred to in the embodiment;
Fig. 4 is a flowchart of how the training network referred to in the embodiment constrains the learning of the symmetric-missing-pixel filling subsystem through a reconstruction loss;
Fig. 5 shows the completion results of the facial image completion system based on symmetry-consistent convolutional neural networks described in the embodiment on real occluded facial images.
Specific embodiment
The facial image completion system based on symmetry-consistent convolutional neural networks according to the present invention is further described below with reference to the accompanying drawings.
Embodiment: the present embodiment is described in detail with reference to Figs. 1 to 5.
The facial image completion system based on symmetry-consistent convolutional neural networks described in the present embodiment fills the missing pixels of a partially occluded facial image to obtain an unoccluded facial image; the missing pixels of the partially occluded facial image comprise symmetric missing pixels and asymmetric missing pixels;
A symmetric missing pixel is a missing pixel whose mirror pixel about the facial midline is also missing;
An asymmetric missing pixel is a missing pixel whose mirror pixel about the facial midline is not missing;
The facial image completion system comprises an asymmetric-missing-pixel filling subsystem and a symmetric-missing-pixel filling subsystem, both implemented with convolutional neural networks;
The asymmetric-missing-pixel filling subsystem comprises an optical-flow network and an illumination network;
The optical-flow network takes the partially occluded facial image and its horizontally flipped image as its input, treats the resulting optical-flow vectors as absolute coordinates for deforming the flipped image, and deforms the flipped image by bilinear interpolation to obtain a warped flipped image;
The illumination network takes the partially occluded facial image and its horizontally flipped image as its input and corrects the illumination distribution of the warped flipped image with the illumination-correction coefficients it produces; the warped flipped image after illumination correction is the partially occluded facial image with its asymmetric missing pixels filled;
The symmetric-missing-pixel filling subsystem takes the partially occluded facial image with its asymmetric missing pixels filled, together with the residual occlusion mask corresponding to that image, as its input, and outputs the facial image with its missing pixels filled.
In the present embodiment, the optical-flow network comprises an optical-flow encoder and an optical-flow decoder; the optical-flow encoder comprises N1 convolutional layers and the optical-flow decoder comprises N1 deconvolution layers;
The illumination network comprises an illumination encoder and an illumination decoder; the illumination encoder comprises N2 convolutional layers and the illumination decoder comprises N2 deconvolution layers;
The symmetric-missing-pixel filling subsystem comprises a reconstruction encoder and a reconstruction decoder; the reconstruction encoder comprises N3 convolutional layers and the reconstruction decoder comprises N3 deconvolution layers;
N1, N2 and N3 are each greater than or equal to 2.
In the present embodiment, N1 = 8;
The optical-flow encoder comprises convolutional layers C1 to C8;
Convolutional layer C1 successively applies the first convolution operation and the first activation operation to the concatenated features of the partially occluded facial image and its horizontally flipped image;
Convolutional layer C2 successively applies the second convolution operation, a batch normalization operation and the second activation operation to the output of convolutional layer C1;
Convolutional layer C3 successively applies the third convolution operation, a batch normalization operation and the third activation operation to the output of convolutional layer C2;
Convolutional layer C4 successively applies the fourth convolution operation, a batch normalization operation and the fourth activation operation to the output of convolutional layer C3;
Convolutional layer C5 successively applies the fifth convolution operation, a batch normalization operation and the fifth activation operation to the output of convolutional layer C4;
Convolutional layer C6 successively applies the sixth convolution operation, a batch normalization operation and the sixth activation operation to the output of convolutional layer C5;
Convolutional layer C7 successively applies the seventh convolution operation, a batch normalization operation and the seventh activation operation to the output of convolutional layer C6;
Convolutional layer C8 successively applies the eighth convolution operation and the eighth activation operation to the output of convolutional layer C7;
The optical-flow decoder comprises deconvolution layers D1 to D8;
Deconvolution layer D1 successively applies the first deconvolution operation, a batch normalization operation and the ninth activation operation to the output of convolutional layer C8;
Deconvolution layer D2 successively applies the second deconvolution operation, a batch normalization operation and the tenth activation operation to the output of deconvolution layer D1;
Deconvolution layer D3 successively applies the third deconvolution operation, a batch normalization operation and the eleventh activation operation to the output of deconvolution layer D2;
Deconvolution layer D4 successively applies the fourth deconvolution operation, a batch normalization operation and the twelfth activation operation to the output of deconvolution layer D3;
Deconvolution layer D5 successively applies the fifth deconvolution operation, a batch normalization operation and the thirteenth activation operation to the output of deconvolution layer D4;
Deconvolution layer D6 successively applies the sixth deconvolution operation, a batch normalization operation and the fourteenth activation operation to the output of deconvolution layer D5;
Deconvolution layer D7 successively applies the seventh deconvolution operation, a batch normalization operation and the fifteenth activation operation to the output of deconvolution layer D6;
Deconvolution layer D8 successively applies the eighth deconvolution operation, the sixteenth activation operation and a bilinear interpolation operation to the output of deconvolution layer D7;
The output of deconvolution layer D8 is the warped flipped image;
The first convolution operation is a 4×4 convolution with 64 filters and stride 2;
The second convolution operation is a 4×4 convolution with 128 filters and stride 2;
The third convolution operation is a 4×4 convolution with 256 filters and stride 2;
The fourth convolution operation is a 4×4 convolution with 512 filters and stride 2;
The fifth to eighth convolution operations are 4×4 convolutions with 1024 filters and stride 2;
The first to third deconvolution operations are 4×4 deconvolutions with 1024 filters and stride 2;
The fourth deconvolution operation is a 4×4 deconvolution with 512 filters and stride 2;
The fifth deconvolution operation is a 4×4 deconvolution with 256 filters and stride 2;
The sixth deconvolution operation is a 4×4 deconvolution with 128 filters and stride 2;
The seventh deconvolution operation is a 4×4 deconvolution with 64 filters and stride 2;
The eighth deconvolution operation is a 4×4 deconvolution with 2 filters and stride 2;
The first to seventh activation operations all use the LReLU function, the eighth to fifteenth activation operations all use the ReLU function, and the sixteenth activation operation uses the Tanh function.
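The patent does not state the padding or the input resolution, but with the common choices of padding 1 and a 256×256 input (both assumptions), each 4×4 stride-2 convolution exactly halves the feature map and each matching deconvolution doubles it, so the eight encoder layers reduce 256×256 to 1×1 and the eight decoder layers restore it. A quick check of that arithmetic:

```python
def conv_out(size, kernel=4, stride=2, pad=1):
    """Output size of a 4x4 stride-2 convolution (padding 1 is an assumption)."""
    return (size + 2 * pad - kernel) // stride + 1

def deconv_out(size, kernel=4, stride=2, pad=1):
    """Output size of the matching 4x4 stride-2 transposed convolution."""
    return (size - 1) * stride - 2 * pad + kernel

size = 256                       # assumed input resolution
enc = [size]
for _ in range(8):               # convolutional layers C1..C8 halve the map
    enc.append(conv_out(enc[-1]))
dec = [enc[-1]]
for _ in range(8):               # deconvolution layers D1..D8 double it back
    dec.append(deconv_out(dec[-1]))
```

The last deconvolution has 2 filters because the flow field carries an x- and a y-coordinate per pixel.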
In the present embodiment, N2 = 8;
The illumination encoder comprises convolutional layers C9 to C16;
Convolutional layer C9 successively applies the ninth convolution operation and the seventeenth activation operation to the concatenated features of the partially occluded facial image and its horizontally flipped image;
Convolutional layer C10 successively applies the tenth convolution operation, a batch normalization operation and the eighteenth activation operation to the output of convolutional layer C9;
Convolutional layer C11 successively applies the eleventh convolution operation, a batch normalization operation and the nineteenth activation operation to the output of convolutional layer C10;
Convolutional layer C12 successively applies the twelfth convolution operation, a batch normalization operation and the twentieth activation operation to the output of convolutional layer C11;
Convolutional layer C13 successively applies the thirteenth convolution operation, a batch normalization operation and the twenty-first activation operation to the output of convolutional layer C12;
Convolutional layer C14 successively applies the fourteenth convolution operation, a batch normalization operation and the twenty-second activation operation to the output of convolutional layer C13;
Convolutional layer C15 successively applies the fifteenth convolution operation, a batch normalization operation and the twenty-third activation operation to the output of convolutional layer C14;
Convolutional layer C16 successively applies the sixteenth convolution operation and the twenty-fourth activation operation to the output of convolutional layer C15;
The illumination decoder comprises deconvolution layers D9 to D16;
Deconvolution layer D9 successively applies the ninth deconvolution operation, a batch normalization operation and the twenty-fifth activation operation to the output of convolutional layer C16;
Deconvolution layer D10 successively applies the tenth deconvolution operation, a batch normalization operation and the twenty-sixth activation operation to the output of deconvolution layer D9;
Deconvolution layer D11 successively applies the eleventh deconvolution operation, a batch normalization operation and the twenty-seventh activation operation to the output of deconvolution layer D10;
Deconvolution layer D12 successively applies the twelfth deconvolution operation, a batch normalization operation and the twenty-eighth activation operation to the output of deconvolution layer D11;
Deconvolution layer D13 successively applies the thirteenth deconvolution operation, a batch normalization operation and the twenty-ninth activation operation to the output of deconvolution layer D12;
Deconvolution layer D14 successively applies the fourteenth deconvolution operation, a batch normalization operation and the thirtieth activation operation to the output of deconvolution layer D13;
Deconvolution layer D15 successively applies the fifteenth deconvolution operation, a batch normalization operation and the thirty-first activation operation to the output of deconvolution layer D14;
Deconvolution layer D16 applies the sixteenth deconvolution operation to the output of deconvolution layer D15;
The output of deconvolution layer D16 is the illumination-correction coefficient;
The ninth convolution operation is a 4×4 convolution with 64 filters and stride 2;
The tenth convolution operation is a 4×4 convolution with 128 filters and stride 2;
The eleventh convolution operation is a 4×4 convolution with 256 filters and stride 2;
The twelfth convolution operation is a 4×4 convolution with 512 filters and stride 2;
The thirteenth to sixteenth convolution operations are 4×4 convolutions with 1024 filters and stride 2;
The ninth to eleventh deconvolution operations are 4×4 deconvolutions with 1024 filters and stride 2;
The twelfth deconvolution operation is a 4×4 deconvolution with 512 filters and stride 2;
The thirteenth deconvolution operation is a 4×4 deconvolution with 256 filters and stride 2;
The fourteenth deconvolution operation is a 4×4 deconvolution with 128 filters and stride 2;
The fifteenth deconvolution operation is a 4×4 deconvolution with 64 filters and stride 2;
The sixteenth deconvolution operation is a 4×4 deconvolution with 2 filters and stride 2;
The seventeenth to twenty-third activation operations all use the LReLU function, and the twenty-fourth to thirty-first activation operations all use the ReLU function.
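The embodiment's networks use only four activation functions (LReLU, ReLU, Tanh, Sigmoid), all of which are a line of NumPy each; the LReLU slope of 0.2 is an assumption, since the patent does not specify it:

```python
import numpy as np

def lrelu(x, slope=0.2):
    """Leaky ReLU: negative inputs are scaled by `slope` instead of zeroed."""
    return np.where(x > 0, x, slope * x)

def relu(x):
    """ReLU: element-wise max(0, x)."""
    return np.maximum(x, 0.0)

def tanh(x):
    """Tanh squashes values into (-1, 1)."""
    return np.tanh(x)

def sigmoid(x):
    """Sigmoid squashes values into (0, 1), suitable for image outputs."""
    return 1.0 / (1.0 + np.exp(-x))
```

The choice of Tanh for the flow decoder's final activation is consistent with flow coordinates normalized to [-1, 1], and Sigmoid for the reconstruction decoder with pixel values in [0, 1].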
In the present embodiment, N3 = 8;
The reconstruction encoder comprises convolutional layers C17 to C24;
Convolutional layer C17 successively applies the seventeenth convolution operation and the thirty-second activation operation to the concatenated features of the partially occluded facial image with its asymmetric missing pixels filled and the residual occlusion mask corresponding to that image;
Convolutional layer C18 successively applies the eighteenth convolution operation, a batch normalization operation and the thirty-third activation operation to the output of convolutional layer C17;
Convolutional layer C19 successively applies the nineteenth convolution operation, a batch normalization operation and the thirty-fourth activation operation to the output of convolutional layer C18;
Convolutional layer C20 successively applies the twentieth convolution operation, a batch normalization operation and the thirty-fifth activation operation to the output of convolutional layer C19;
Convolutional layer C21 successively applies the twenty-first convolution operation, a batch normalization operation and the thirty-sixth activation operation to the output of convolutional layer C20;
Convolutional layer C22 successively applies the twenty-second convolution operation, a batch normalization operation and the thirty-seventh activation operation to the output of convolutional layer C21;
Convolutional layer C23 successively applies the twenty-third convolution operation, a batch normalization operation and the thirty-eighth activation operation to the output of convolutional layer C22;
Convolutional layer C24 successively applies the twenty-fourth convolution operation and the thirty-ninth activation operation to the output of convolutional layer C23;
The reconstruction decoder comprises deconvolution layers D17 to D24;
Deconvolution layer D17 successively applies the seventeenth deconvolution operation, a batch normalization operation, the first dropout operation, the first feature-concatenation operation and the fortieth activation operation to the output of convolutional layer C24;
Deconvolution layer D18 successively applies the eighteenth deconvolution operation, a batch normalization operation, the second dropout operation, the second feature-concatenation operation and the forty-first activation operation to the output of deconvolution layer D17;
Deconvolution layer D19 successively applies the nineteenth deconvolution operation, a batch normalization operation, the third dropout operation, the third feature-concatenation operation and the forty-second activation operation to the output of deconvolution layer D18;
Deconvolution layer D20 successively applies the twentieth deconvolution operation, a batch normalization operation, the fourth feature-concatenation operation and the forty-third activation operation to the output of deconvolution layer D19;
Deconvolution layer D21 successively applies the twenty-first deconvolution operation, a batch normalization operation, the fifth feature-concatenation operation and the forty-fourth activation operation to the output of deconvolution layer D20;
Deconvolution layer D22 successively applies the twenty-second deconvolution operation, a batch normalization operation, the sixth feature-concatenation operation and the forty-fifth activation operation to the output of deconvolution layer D21;
Deconvolution layer D23 successively applies the twenty-third deconvolution operation, a batch normalization operation, the seventh feature-concatenation operation and the forty-sixth activation operation to the output of deconvolution layer D22;
Deconvolution layer D24 successively applies the twenty-fourth deconvolution operation and the forty-seventh activation operation to the output of deconvolution layer D23;
The output of deconvolution layer D24 is the facial image with its missing pixels filled;
The seventeenth convolution operation is a 4×4 convolution with 64 filters and stride 2;
The eighteenth convolution operation is a 4×4 convolution with 128 filters and stride 2;
The nineteenth convolution operation is a 4×4 convolution with 256 filters and stride 2;
The twentieth convolution operation is a 4×4 convolution with 512 filters and stride 2;
The twenty-first to twenty-fourth convolution operations are 4×4 convolutions with 1024 filters and stride 2;
The seventeenth to nineteenth deconvolution operations are 4×4 deconvolutions with 1024 filters and stride 2;
The twentieth deconvolution operation is a 4×4 deconvolution with 512 filters and stride 2;
The twenty-first deconvolution operation is a 4×4 deconvolution with 256 filters and stride 2;
The twenty-second deconvolution operation is a 4×4 deconvolution with 128 filters and stride 2;
The twenty-third deconvolution operation is a 4×4 deconvolution with 64 filters and stride 2;
The twenty-fourth deconvolution operation is a 4×4 deconvolution with 3 filters and stride 2;
The thirty-second to thirty-eighth activation operations and the fortieth to forty-sixth activation operations all use the LReLU function, the thirty-ninth activation operation uses the ReLU function, and the forty-seventh activation operation uses the Sigmoid function;
The first to third dropout operations all use the DropOut function;
The first feature-concatenation operation concatenates the output of convolutional layer C23 with the output of the first dropout operation;
The second feature-concatenation operation concatenates the output of convolutional layer C22 with the output of the second dropout operation;
The third feature-concatenation operation concatenates the output of convolutional layer C21 with the output of the third dropout operation;
The fourth feature-concatenation operation concatenates the output of convolutional layer C20 with the batch-normalization output of deconvolution layer D20;
The fifth feature-concatenation operation concatenates the output of convolutional layer C19 with the batch-normalization output of deconvolution layer D21;
The sixth feature-concatenation operation concatenates the output of convolutional layer C18 with the batch-normalization output of deconvolution layer D22;
The seventh feature-concatenation operation concatenates the output of convolutional layer C17 with the batch-normalization output of deconvolution layer D23.
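The feature-concatenation operations form U-Net-style skip connections: each decoder layer's features are joined with the mirror encoder layer's output before the next deconvolution. Using the filter counts listed above, the channel count that each following deconvolution sees can be tallied (bookkeeping only, not part of the patent text):

```python
# Encoder and decoder output channels as specified for C17..C24 and D17..D24.
enc_out = {"C17": 64, "C18": 128, "C19": 256, "C20": 512,
           "C21": 1024, "C22": 1024, "C23": 1024, "C24": 1024}
dec_out = {"D17": 1024, "D18": 1024, "D19": 1024, "D20": 512,
           "D21": 256, "D22": 128, "D23": 64, "D24": 3}

# Each decoder layer is concatenated with its mirror encoder layer.
skip_pairs = [("D17", "C23"), ("D18", "C22"), ("D19", "C21"),
              ("D20", "C20"), ("D21", "C19"), ("D22", "C18"), ("D23", "C17")]

# Channels entering the next deconvolution = decoder channels + skipped channels.
concat_channels = {d: dec_out[d] + enc_out[c] for d, c in skip_pairs}
```

So, for instance, the eighteenth deconvolution operates on 2048 channels (D17's 1024 plus C23's 1024), while the final deconvolution sees 128.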
The face image filling system of the symmetry-consistency-based convolutional neural network described in this embodiment further includes a training network;
The training network constrains the learning of the optical flow network through a face landmark loss and a total variation regularization term, specifically as follows:
A face landmark detection algorithm is used to detect the L face landmarks (x_1, y_1), …, (x_L, y_L) in the unoccluded face image g corresponding to the partially occluded face image, and, by horizontally flipping the L face landmarks of g, the L face landmarks (x'_1, y'_1), …, (x'_L, y'_L) in the horizontally flipped image g' of the unoccluded face image g;
x_i is the x-axis coordinate of the i-th face landmark in the unoccluded face image g, and y_i is its y-axis coordinate; x'_j is the x-axis coordinate of the j-th face landmark in the horizontally flipped image g', and y'_j is its y-axis coordinate;
To align the horizontally flipped image g' with the unoccluded face image g, the optical flow is expected to map each landmark of g' onto the corresponding landmark of g;
The x-axis and y-axis coordinates are normalized to [-1, 1];
The face landmark loss l_lm is defined as:
l_lm = Σ_{i=1..L} [(Φx(i) − x_i)² + (Φy(i) − y_i)²],
where Φx(i) is the x-axis coordinate value of the optical flow vector at the i-th face landmark in the horizontally flipped image g', and Φy(i) is the y-axis coordinate value of the optical flow vector at the i-th face landmark in the horizontally flipped image g';
Based on the absolute coordinates of the optical flow vectors Φ obtained by the optical flow network, the total variation regularization term l_TV is defined as:
l_TV = ‖∇x Φx‖² + ‖∇y Φx‖² + ‖∇x Φy‖² + ‖∇y Φy‖²,
where ∇x is the gradient along the x-axis direction, ∇y is the gradient along the y-axis direction, Φx is the x-axis coordinate value of the optical flow vectors, and Φy is the y-axis coordinate value of the optical flow vectors;
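The total variation term can be computed directly with forward differences. A small PyTorch sketch, assuming the flow field is stored as a (B, 2, H, W) tensor whose two channels are (Φx, Φy); the function name is illustrative:

```python
import torch

def tv_regularizer(flow):
    """l_TV for a flow field of shape (B, 2, H, W): squared forward differences
    of both coordinate channels along x and y, i.e.
    ||grad_x Phi_x||^2 + ||grad_y Phi_x||^2 + ||grad_x Phi_y||^2 + ||grad_y Phi_y||^2."""
    dx = flow[:, :, :, 1:] - flow[:, :, :, :-1]   # horizontal differences of Phi_x, Phi_y
    dy = flow[:, :, 1:, :] - flow[:, :, :-1, :]   # vertical differences of Phi_x, Phi_y
    return (dx ** 2).sum() + (dy ** 2).sum()
```

A spatially constant flow costs nothing under this term; the penalty grows with how non-smooth the predicted flow field is.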
The training network thus jointly constrains the generation of the optical flow vectors through the face landmark loss and the total variation regularization term;
The training network constrains the learning of the illumination network through an illumination consistency loss, specifically as follows:
From the optical flow vectors Φ output by the optical flow network, the warped flip image I_w corresponding to the partially occluded face image is obtained by bilinear sampling:
I_w(p) = Σ_{q∈N} I_o'(q) · max(0, 1 − |Φx(p) − q_x|) · max(0, 1 − |Φy(p) − q_y|),
where N is the set of the four pixel positions adjacent to the sampling location Φ(p), and I_o' is the horizontally flipped image of the partially occluded face image;
Similarly, the warped flip image I_w' corresponding to the unoccluded face image g can be obtained;
The illumination consistency loss L_l is defined as:
L_l = ‖I_w' ⊙ R − g‖²,
where R is the illumination correction coefficient.
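The warping and correction steps can be sketched with PyTorch's built-in bilinear sampler, assuming (as stated above) that the flow gives absolute coordinates normalized to [-1, 1]; `warp_by_flow` and `illumination_loss` are illustrative names, not the patent's:

```python
import torch
import torch.nn.functional as F

def warp_by_flow(image, flow):
    """Bilinearly sample `image` (B, C, H, W) at the absolute coordinates given
    by `flow` (B, 2, H, W), coordinates normalized to [-1, 1] as in the text."""
    grid = flow.permute(0, 2, 3, 1)  # (B, H, W, 2), last dim = (x, y)
    return F.grid_sample(image, grid, mode='bilinear', align_corners=True)

def illumination_loss(warped_unoccluded, correction, target):
    """L_l = ||I_w' (.) R - g||^2: the corrected warped flip should match g."""
    return ((warped_unoccluded * correction - target) ** 2).sum()
```

`grid_sample` implements exactly the four-neighbour weighted sum written above, so the flow network's output can be plugged in after a channel-to-last permute.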
The training network of this embodiment constrains the learning of the symmetric-missing-pixel filling subsystem through a reconstruction loss, specifically as follows:
The reconstruction loss includes a Euclidean distance loss between the missing-pixel-filled face image and the corresponding unoccluded face image g, and a face recognition network feature loss;
The Euclidean distance loss between the missing-pixel-filled face image ĝ and the corresponding unoccluded face image g is defined as:
l_r = ‖ĝ − g‖²,
where ĝ is the missing-pixel-filled face image;
The face recognition network feature loss is defined as:
l_f = Σ_l (1 / (C_l · H_l · W_l)) · ‖ψ_l(ĝ) − ψ_l(g)‖²,
where ψ_l(ĝ) is the l-th layer convolution feature obtained by a pre-trained VGG-Face network when the missing-pixel-filled face image ĝ is input, ψ_l(g) is the l-th layer convolution feature obtained by the pre-trained VGG-Face network when the unoccluded face image g is input, and C_l, H_l and W_l are respectively the channel number, height and width of the l-th layer convolution feature.
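Both reconstruction terms are straightforward once the features are extracted. A sketch, in which `feats_filled`/`feats_target` stand in for the VGG-Face features ψ_l (any pretrained extractor could supply them; the function names are assumptions):

```python
import torch

def reconstruction_loss(filled, target):
    """Euclidean term ||g_hat - g||^2 between filled and unoccluded faces."""
    return ((filled - target) ** 2).sum()

def feature_loss(feats_filled, feats_target):
    """Per-layer squared feature distance, each layer normalized by
    C_l * H_l * W_l. `feats_*` are lists of (C_l, H_l, W_l) tensors."""
    total = torch.zeros(())
    for a, b in zip(feats_filled, feats_target):
        c, h, w = a.shape[-3:]
        total = total + ((a - b) ** 2).sum() / (c * h * w)
    return total
```

The per-layer normalization keeps deep, wide feature maps from dominating the sum.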
The training network of this embodiment also constrains the learning of the symmetric-missing-pixel filling subsystem through a perceptual symmetry loss, specifically as follows:
The perceptual symmetry loss is a symmetry loss on feature layers. Using a shared sub-network, the asymmetric-missing-pixel-filled partially occluded face image and its horizontally flipped image are input separately to obtain the l-th layer features Ω_l and Ω'_l of the reconstruction decoder, on which the perceptual symmetry loss is defined;
here C_l denotes the channel number of the features Ω_l and Ω'_l, Φ↓ denotes the output Φ of the optical flow network downsampled to the size of Ω_l and Ω'_l, and M̄↓ denotes the residual occlusion mask corresponding to the asymmetric-missing-pixel-filled partially occluded face image downsampled to the size of Ω_l and Ω'_l.
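The exact formula of this term is not reproduced in the text, so the following is only one plausible reading: a masked, channel-normalized squared distance between the two decoder features, with the mask downsampled to feature resolution. The name, the use of nearest-neighbour downsampling, and the weighting scheme are all assumptions:

```python
import torch
import torch.nn.functional as F

def perceptual_symmetry_loss(feat, feat_flip, mask):
    """Hypothetical rendering of the feature-level symmetry term: penalize the
    difference between decoder features Omega_l (image) and Omega'_l (its
    horizontal flip) inside the residual occlusion mask, downsampled to the
    feature size, with the sum normalized by the channel count C_l."""
    c = feat.shape[1]
    m = F.interpolate(mask, size=feat.shape[-2:], mode='nearest')  # M-bar to (H_l, W_l)
    return ((m * (feat - feat_flip)) ** 2).sum() / c
```

Restricting the penalty to the masked (still-occluded) region is what lets the loss enforce symmetry only where pixels had to be invented.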
The training network of this embodiment also constrains the learning of the symmetric-missing-pixel filling subsystem through a discrimination loss, specifically as follows:
The training network obtains the discrimination loss through discriminator networks;
The discriminator networks include a global discriminator network and a part discriminator network, the two having the same network structure;
The global discriminator network takes the partially occluded face image as its input and outputs the global discrimination loss;
The part discriminator network successively takes the left eye, right eye, nose and mouth regions of the partially occluded face image as its input, upsamples each to a uniform size, and successively outputs the discrimination losses of the left eye, right eye, nose and mouth regions;
The global discriminator network includes convolutional layers E1 to E5;
Convolutional layer E1 successively performs the 25th convolution operation and the 48th activation operation on the partially occluded face image;
Convolutional layer E2 successively performs the 26th convolution operation, a batch normalization operation and the 49th activation operation on the output of convolutional layer E1;
Convolutional layer E3 successively performs the 27th convolution operation, a batch normalization operation and the 50th activation operation on the output of convolutional layer E2;
Convolutional layer E4 successively performs the 28th convolution operation, a batch normalization operation and the 51st activation operation on the output of convolutional layer E3;
Convolutional layer E5 successively performs the 29th convolution operation and the 52nd activation operation on the output of convolutional layer E4;
The 25th convolution operation is a convolution with 64 4×4 kernels and stride 2;
The 26th convolution operation is a convolution with 128 4×4 kernels and stride 2;
The 27th convolution operation is a convolution with 256 4×4 kernels and stride 2;
The 28th convolution operation is a convolution with 512 4×4 kernels and stride 1;
The 29th convolution operation is a convolution with 1 4×4 kernel and stride 1;
The 48th to 51st activation operations all use the LReLU function, and the 52nd activation operation uses the Sigmoid function;
The cross-entropy loss between the T×T feature map output by convolutional layer E5 and a T×T map of all 0s or all 1s is the global discrimination loss;
The cross-entropy loss between the T×T feature map output by the part discriminator network and a T×T map of all 0s or all 1s is the part discrimination loss.
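The E1–E5 stack and its cross-entropy objective can be sketched as a PatchGAN-style discriminator in PyTorch; `d_loss` is an illustrative name for the 0/1-target cross-entropy described above, and the helper `conv_block` is my own shorthand:

```python
import torch
import torch.nn as nn

def conv_block(cin, cout, stride):
    """conv -> batch norm -> LReLU, the E2-E4 pattern."""
    return [nn.Conv2d(cin, cout, 4, stride, 1),
            nn.BatchNorm2d(cout), nn.LeakyReLU(0.2)]

class GlobalDiscriminator(nn.Module):
    """E1-E5 as described: 64/128/256 stride-2 convs, a 512 stride-1 conv,
    then a 1-channel stride-1 conv with Sigmoid, yielding a TxT real/fake map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2),  # E1 (no batch norm)
            *conv_block(64, 128, 2),                        # E2
            *conv_block(128, 256, 2),                       # E3
            *conv_block(256, 512, 1),                       # E4
            nn.Conv2d(512, 1, 4, 1, 1), nn.Sigmoid(),       # E5
        )

    def forward(self, x):
        return self.net(x)

def d_loss(pred_real, pred_fake):
    """Cross-entropy of the TxT maps against all-ones (real) / all-zeros (fake)."""
    bce = nn.BCELoss()
    return (bce(pred_real, torch.ones_like(pred_real)) +
            bce(pred_fake, torch.zeros_like(pred_fake)))
```

For a 128×128 input this sketch yields a 14×14 output map, which plays the role of the T×T feature compared against the all-0/all-1 targets.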
The training network of this embodiment trains the asymmetric-missing-pixel filling subsystem and the symmetric-missing-pixel filling subsystem end-to-end using the Adam optimization algorithm.
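End-to-end training with a single Adam optimizer over both subsystems might look as follows. The stand-in modules, learning rate and betas are illustrative, and the combined objective (landmark, TV, illumination, reconstruction, symmetry and discrimination terms) is reduced here to the reconstruction term for brevity:

```python
import torch
import torch.nn as nn

def make_optimizer(*nets, lr=2e-4):
    """One Adam optimizer over the parameters of all given subsystems,
    so a single step is end-to-end. lr/betas are illustrative choices."""
    params = [p for n in nets for p in n.parameters()]
    return torch.optim.Adam(params, lr=lr, betas=(0.5, 0.999))

def train_step(flow_illum_net, fill_net, optimizer, occluded, target):
    """Hypothetical joint step: the asymmetric filling result feeds the
    symmetric filling subsystem, and one loss backpropagates through both."""
    optimizer.zero_grad()
    coarse = flow_illum_net(occluded)   # stand-in for flow warp + illumination
    filled = fill_net(coarse)           # stand-in for the symmetric filler
    loss = ((filled - target) ** 2).mean()
    loss.backward()
    optimizer.step()
    return float(loss)
```

Because both subsystems share one optimizer and one backward pass, gradients from the final reconstruction reach the flow and illumination networks as well.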
In the face image filling system of the symmetry-consistency-based convolutional neural network described in this embodiment, the training network constrains the learning of the symmetric-missing-pixel filling subsystem through the perceptual symmetry loss, which constrains the perceptual feature consistency between the left and right halves of the face; as a result, the missing-pixel-filled face image output by the symmetric-missing-pixel filling subsystem is symmetry-consistent.
For the filling of both real occluded face images and synthesized occluded face images, compared with existing face image filling systems based on convolutional neural networks, the face image filling system described in this embodiment achieves a better filling effect in generating face image details, preserving identity features and maintaining facial symmetry, owing to the introduction of the optical flow network, the illumination network and the perceptual symmetry loss.
Fig. 5 shows the filling effect of the face image filling system described in the embodiment on real occluded face images; the first row shows the real occluded face images and the second row shows the corresponding filling results.
Although the present invention is described herein with reference to specific embodiments, it should be understood that these embodiments are merely examples of the principles and applications of the present invention. It should therefore be understood that many modifications can be made to the exemplary embodiments, and that other arrangements can be devised, without departing from the spirit and scope of the present invention as defined by the appended claims. It should also be understood that different dependent claims and the features described herein can be combined in ways other than those described in the original claims, and that features described in connection with separate embodiments can be used in other embodiments.

Claims (10)

1. A face image filling system of a convolutional neural network based on symmetry consistency, for filling a partially occluded face image to obtain an unoccluded face image, characterized in that the missing pixels of the partially occluded face image include symmetric missing pixels and asymmetric missing pixels;
A symmetric missing pixel is a missing pixel whose axially symmetric counterpart with respect to the facial midline is also missing;
An asymmetric missing pixel is a missing pixel whose axially symmetric counterpart with respect to the facial midline is not missing;
The face image filling system includes an asymmetric-missing-pixel filling subsystem and a symmetric-missing-pixel filling subsystem, both implemented with convolutional neural networks;
The asymmetric-missing-pixel filling subsystem includes an optical flow network and an illumination network;
The optical flow network takes the partially occluded face image and the horizontally flipped image of the partially occluded face image as its input, takes the obtained optical flow vectors as the absolute coordinates for deforming the horizontally flipped image, and deforms the horizontally flipped image by bilinear interpolation to obtain a warped flip image;
The illumination network takes the partially occluded face image and the horizontally flipped image of the partially occluded face image as its input, and corrects the illumination of the warped flip image through an illumination correction coefficient; the warped flip image after illumination correction is the asymmetric-missing-pixel-filled partially occluded face image;
The symmetric-missing-pixel filling subsystem takes the asymmetric-missing-pixel-filled partially occluded face image and the residual occlusion mask corresponding to that image as its input, and outputs the missing-pixel-filled face image.
2. The face image filling system of the convolutional neural network based on symmetry consistency as claimed in claim 1, characterized in that the optical flow network includes an optical flow encoder and an optical flow decoder, the optical flow encoder including N1 convolutional layers and the optical flow decoder including N1 deconvolution layers; the illumination network includes an illumination encoder and an illumination decoder, the illumination encoder including N2 convolutional layers and the illumination decoder including N2 deconvolution layers; and the symmetric-missing-pixel filling subsystem includes a reconstruction encoder and a reconstruction decoder, the reconstruction encoder including N3 convolutional layers and the reconstruction decoder including N3 deconvolution layers;
N1, N2 and N3 are each greater than or equal to 2.
3. The face image filling system of the convolutional neural network based on symmetry consistency as claimed in claim 2, characterized in that N1 = 8;
The optical flow encoder includes convolutional layers C1 to C8;
Convolutional layer C1 successively performs the first convolution operation and the first activation operation on the concatenated features of the partially occluded face image and its horizontally flipped image;
Convolutional layer C2 successively performs the second convolution operation, a batch normalization operation and the second activation operation on the output of convolutional layer C1;
Convolutional layer C3 successively performs the third convolution operation, a batch normalization operation and the third activation operation on the output of convolutional layer C2;
Convolutional layer C4 successively performs the fourth convolution operation, a batch normalization operation and the fourth activation operation on the output of convolutional layer C3;
Convolutional layer C5 successively performs the fifth convolution operation, a batch normalization operation and the fifth activation operation on the output of convolutional layer C4;
Convolutional layer C6 successively performs the sixth convolution operation, a batch normalization operation and the sixth activation operation on the output of convolutional layer C5;
Convolutional layer C7 successively performs the seventh convolution operation, a batch normalization operation and the seventh activation operation on the output of convolutional layer C6;
Convolutional layer C8 successively performs the eighth convolution operation and the eighth activation operation on the output of convolutional layer C7;
The optical flow decoder includes deconvolution layers D1 to D8;
Deconvolution layer D1 successively performs the first deconvolution operation, a batch normalization operation and the ninth activation operation on the output of convolutional layer C8;
Deconvolution layer D2 successively performs the second deconvolution operation, a batch normalization operation and the tenth activation operation on the output of deconvolution layer D1;
Deconvolution layer D3 successively performs the third deconvolution operation, a batch normalization operation and the 11th activation operation on the output of deconvolution layer D2;
Deconvolution layer D4 successively performs the fourth deconvolution operation, a batch normalization operation and the 12th activation operation on the output of deconvolution layer D3;
Deconvolution layer D5 successively performs the fifth deconvolution operation, a batch normalization operation and the 13th activation operation on the output of deconvolution layer D4;
Deconvolution layer D6 successively performs the sixth deconvolution operation, a batch normalization operation and the 14th activation operation on the output of deconvolution layer D5;
Deconvolution layer D7 successively performs the seventh deconvolution operation, a batch normalization operation and the 15th activation operation on the output of deconvolution layer D6;
Deconvolution layer D8 successively performs the eighth deconvolution operation, the 16th activation operation and a bilinear interpolation operation on the output of deconvolution layer D7;
The output of deconvolution layer D8 is the warped flip image;
The first convolution operation is a convolution with 64 4×4 kernels and stride 2;
The second convolution operation is a convolution with 128 4×4 kernels and stride 2;
The third convolution operation is a convolution with 256 4×4 kernels and stride 2;
The fourth convolution operation is a convolution with 512 4×4 kernels and stride 2;
The fifth to eighth convolution operations are convolutions with 1024 4×4 kernels and stride 2;
The first to third deconvolution operations are deconvolutions with 1024 4×4 kernels and stride 2;
The fourth deconvolution operation is a deconvolution with 512 4×4 kernels and stride 2;
The fifth deconvolution operation is a deconvolution with 256 4×4 kernels and stride 2;
The sixth deconvolution operation is a deconvolution with 128 4×4 kernels and stride 2;
The seventh deconvolution operation is a deconvolution with 64 4×4 kernels and stride 2;
The eighth deconvolution operation is a deconvolution with 2 4×4 kernels and stride 2;
The first to seventh activation operations all use the LReLU function, the eighth to 15th activation operations all use the ReLU function, and the 16th activation operation uses the Tanh function.
4. The face image filling system of the convolutional neural network based on symmetry consistency as claimed in claim 3, characterized in that N2 = 8;
The illumination encoder includes convolutional layers C9 to C16;
Convolutional layer C9 successively performs the ninth convolution operation and the 17th activation operation on the concatenated features of the partially occluded face image and its horizontally flipped image;
Convolutional layer C10 successively performs the tenth convolution operation, a batch normalization operation and the 18th activation operation on the output of convolutional layer C9;
Convolutional layer C11 successively performs the 11th convolution operation, a batch normalization operation and the 19th activation operation on the output of convolutional layer C10;
Convolutional layer C12 successively performs the 12th convolution operation, a batch normalization operation and the 20th activation operation on the output of convolutional layer C11;
Convolutional layer C13 successively performs the 13th convolution operation, a batch normalization operation and the 21st activation operation on the output of convolutional layer C12;
Convolutional layer C14 successively performs the 14th convolution operation, a batch normalization operation and the 22nd activation operation on the output of convolutional layer C13;
Convolutional layer C15 successively performs the 15th convolution operation, a batch normalization operation and the 23rd activation operation on the output of convolutional layer C14;
Convolutional layer C16 successively performs the 16th convolution operation and the 24th activation operation on the output of convolutional layer C15;
The illumination decoder includes deconvolution layers D9 to D16;
Deconvolution layer D9 successively performs the ninth deconvolution operation, a batch normalization operation and the 25th activation operation on the output of convolutional layer C16;
Deconvolution layer D10 successively performs the tenth deconvolution operation, a batch normalization operation and the 26th activation operation on the output of deconvolution layer D9;
Deconvolution layer D11 successively performs the 11th deconvolution operation, a batch normalization operation and the 27th activation operation on the output of deconvolution layer D10;
Deconvolution layer D12 successively performs the 12th deconvolution operation, a batch normalization operation and the 28th activation operation on the output of deconvolution layer D11;
Deconvolution layer D13 successively performs the 13th deconvolution operation, a batch normalization operation and the 29th activation operation on the output of deconvolution layer D12;
Deconvolution layer D14 successively performs the 14th deconvolution operation, a batch normalization operation and the 30th activation operation on the output of deconvolution layer D13;
Deconvolution layer D15 successively performs the 15th deconvolution operation, a batch normalization operation and the 31st activation operation on the output of deconvolution layer D14;
Deconvolution layer D16 performs the 16th deconvolution operation on the output of deconvolution layer D15;
The output of deconvolution layer D16 is the illumination correction coefficient;
The ninth convolution operation is a convolution with 64 4×4 kernels and stride 2;
The tenth convolution operation is a convolution with 128 4×4 kernels and stride 2;
The 11th convolution operation is a convolution with 256 4×4 kernels and stride 2;
The 12th convolution operation is a convolution with 512 4×4 kernels and stride 2;
The 13th to 16th convolution operations are convolutions with 1024 4×4 kernels and stride 2;
The ninth to 11th deconvolution operations are deconvolutions with 1024 4×4 kernels and stride 2;
The 12th deconvolution operation is a deconvolution with 512 4×4 kernels and stride 2;
The 13th deconvolution operation is a deconvolution with 256 4×4 kernels and stride 2;
The 14th deconvolution operation is a deconvolution with 128 4×4 kernels and stride 2;
The 15th deconvolution operation is a deconvolution with 64 4×4 kernels and stride 2;
The 16th deconvolution operation is a deconvolution with 2 4×4 kernels and stride 2;
The 17th to 23rd activation operations all use the LReLU function, and the 24th to 31st activation operations all use the ReLU function.
5. The face image filling system of the convolutional neural network based on symmetry consistency as claimed in claim 4, characterized in that N3 = 8;
The reconstruction encoder includes convolutional layers C17 to C24;
Convolutional layer C17 successively performs the 17th convolution operation and the 32nd activation operation on the concatenated features of the asymmetric-missing-pixel-filled partially occluded face image and the residual occlusion mask corresponding to that image;
Convolutional layer C18 successively performs the 18th convolution operation, a batch normalization operation and the 33rd activation operation on the output of convolutional layer C17;
Convolutional layer C19 successively performs the 19th convolution operation, a batch normalization operation and the 34th activation operation on the output of convolutional layer C18;
Convolutional layer C20 successively performs the 20th convolution operation, a batch normalization operation and the 35th activation operation on the output of convolutional layer C19;
Convolutional layer C21 successively performs the 21st convolution operation, a batch normalization operation and the 36th activation operation on the output of convolutional layer C20;
Convolutional layer C22 successively performs the 22nd convolution operation, a batch normalization operation and the 37th activation operation on the output of convolutional layer C21;
Convolutional layer C23 successively performs the 23rd convolution operation, a batch normalization operation and the 38th activation operation on the output of convolutional layer C22;
Convolutional layer C24 successively performs the 24th convolution operation and the 39th activation operation on the output of convolutional layer C23;
The reconstruction decoder includes deconvolution layers D17 to D24;
Deconvolution layer D17 successively performs the 17th deconvolution operation, a batch normalization operation, the first dropout operation, the first feature concatenation operation and the 40th activation operation on the output of convolutional layer C24;
Deconvolution layer D18 successively performs the 18th deconvolution operation, a batch normalization operation, the second dropout operation, the second feature concatenation operation and the 41st activation operation on the output of deconvolution layer D17;
Deconvolution layer D19 successively performs the 19th deconvolution operation, a batch normalization operation, the third dropout operation, the third feature concatenation operation and the 42nd activation operation on the output of deconvolution layer D18;
Deconvolution layer D20 successively performs the 20th deconvolution operation, a batch normalization operation, the fourth feature concatenation operation and the 43rd activation operation on the output of deconvolution layer D19;
Deconvolution layer D21 successively performs the 21st deconvolution operation, a batch normalization operation, the fifth feature concatenation operation and the 44th activation operation on the output of deconvolution layer D20;
Deconvolution layer D22 successively performs the 22nd deconvolution operation, a batch normalization operation, the sixth feature concatenation operation and the 45th activation operation on the output of deconvolution layer D21;
Deconvolution layer D23 successively performs the 23rd deconvolution operation, a batch normalization operation, the seventh feature concatenation operation and the 46th activation operation on the output of deconvolution layer D22;
Deconvolution layer D24 successively performs the 24th deconvolution operation and the 47th activation operation on the output of deconvolution layer D23;
The output of deconvolution layer D24 is the missing-pixel-filled face image;
The 17th convolution operation is a convolution with 64 4×4 kernels and stride 2;
The 18th convolution operation is a convolution with 128 4×4 kernels and stride 2;
The 19th convolution operation is a convolution with 256 4×4 kernels and stride 2;
The 20th convolution operation is a convolution with 512 4×4 kernels and stride 2;
The 21st to 24th convolution operations are convolutions with 1024 4×4 kernels and stride 2;
The 17th to 19th deconvolution operations are deconvolutions with 1024 4×4 kernels and stride 2;
The 20th deconvolution operation is a deconvolution with 512 4×4 kernels and stride 2;
The 21st deconvolution operation is a deconvolution with 256 4×4 kernels and stride 2;
The 22nd deconvolution operation is a deconvolution with 128 4×4 kernels and stride 2;
The 23rd deconvolution operation is a deconvolution with 64 4×4 kernels and stride 2;
The 24th deconvolution operation is a deconvolution with 3 4×4 kernels and stride 2;
The 32nd to 38th activation operations and the 40th to 46th activation operations all use the LReLU function, the 39th activation operation uses the ReLU function, and the 47th activation operation uses the Sigmoid function;
The first to third dropout operations all use the DropOut function;
The first feature concatenation operation concatenates the output of convolutional layer C23 with the output of the first dropout operation;
The second feature concatenation operation concatenates the output of convolutional layer C22 with the output of the second dropout operation;
The third feature concatenation operation concatenates the output of convolutional layer C21 with the output of the third dropout operation;
The fourth feature concatenation operation concatenates the output of convolutional layer C20 with the output of the batch normalization operation of deconvolution layer D20;
The fifth feature concatenation operation concatenates the output of convolutional layer C19 with the output of the batch normalization operation of deconvolution layer D21;
The sixth feature concatenation operation concatenates the output of convolutional layer C18 with the output of the batch normalization operation of deconvolution layer D22;
The seventh feature concatenation operation concatenates the output of convolutional layer C17 with the output of the batch normalization operation of deconvolution layer D23.
6. The face image filling system of the convolutional neural network based on symmetry consistency as claimed in claim 5, characterized in that the face image filling system further includes a training network;
The training network constrains the learning of the optical flow network through a face landmark loss and a total variation regularization term, specifically as follows:
A face landmark detection algorithm is used to detect the L face landmarks (x_1, y_1), …, (x_L, y_L) in the unoccluded face image g corresponding to the partially occluded face image, and, by horizontally flipping the L face landmarks of g, the L face landmarks (x'_1, y'_1), …, (x'_L, y'_L) in the horizontally flipped image g' of the unoccluded face image g;
x_i is the x-axis coordinate of the i-th face landmark in the unoccluded face image g, and y_i is its y-axis coordinate; x'_j is the x-axis coordinate of the j-th face landmark in the horizontally flipped image g', and y'_j is its y-axis coordinate;
To align the horizontally flipped image g' with the unoccluded face image g, the optical flow is expected to map each landmark of g' onto the corresponding landmark of g;
The x-axis and y-axis coordinates are normalized to [-1, 1];
The face landmark loss l_lm is defined as:
l_lm = Σ_{i=1..L} [(Φx(i) − x_i)² + (Φy(i) − y_i)²],
where Φx(i) is the x-axis coordinate value of the optical flow vector at the i-th face landmark in the horizontally flipped image g', and Φy(i) is the y-axis coordinate value of the optical flow vector at the i-th face landmark in the horizontally flipped image g';
Based on the absolute coordinates of the optical flow vectors Φ obtained by the optical flow network, the total variation regularization term l_TV is defined as:
l_TV = ‖∇x Φx‖² + ‖∇y Φx‖² + ‖∇x Φy‖² + ‖∇y Φy‖²,
where ∇x is the gradient along the x-axis direction, ∇y is the gradient along the y-axis direction, Φx is the x-axis coordinate value of the optical flow vectors, and Φy is the y-axis coordinate value of the optical flow vectors;
The training network jointly constrains the generation of the optical flow vectors through the face landmark loss and the total variation regularization term;
The training network constrains the learning of the illumination network through an illumination consistency loss, specifically as follows:
From the optical flow vectors Φ output by the optical flow network, the warped flip image I_w corresponding to the partially occluded face image is obtained by bilinear sampling, where N is the set of the four pixel positions adjacent to the sampling location and I_o' is the horizontally flipped image of the partially occluded face image;
Similarly, the warped flip image I_w' corresponding to the unoccluded face image g can be obtained;
The illumination consistency loss L_l is defined as:
L_l = ‖I_w' ⊙ R − g‖²,
where R is the illumination correction coefficient.
7. The face image filling system of the convolutional neural network based on symmetry consistency as claimed in claim 6, characterized in that the training network constrains the learning of the symmetric-missing-pixel filling subsystem through a reconstruction loss, specifically as follows:
The reconstruction loss includes a Euclidean distance loss between the missing-pixel-filled face image and the corresponding unoccluded face image g, and a face recognition network feature loss;
The Euclidean distance loss between the missing-pixel-filled face image ĝ and the corresponding unoccluded face image g is defined as:
l_r = ‖ĝ − g‖²,
where ĝ is the missing-pixel-filled face image;
The face recognition network feature loss is defined as:
l_f = Σ_l (1 / (C_l · H_l · W_l)) · ‖ψ_l(ĝ) − ψ_l(g)‖²,
where ψ_l(ĝ) is the l-th layer convolution feature obtained by a pre-trained VGG-Face network when the missing-pixel-filled face image ĝ is input, ψ_l(g) is the l-th layer convolution feature obtained by the pre-trained VGG-Face network when the unoccluded face image g is input, and C_l, H_l and W_l are respectively the channel number, height and width of the l-th layer convolution feature.
8. The facial image filling system based on a symmetric-consistency convolutional neural network according to claim 7, wherein the training network further constrains the learning of the symmetric missing-pixel filling subsystem through a perceptual symmetry loss, specifically:
The perceptual symmetry loss is a symmetry loss imposed at the feature level. Using shared sub-networks, the horizontally flipped image of the asymmetrically occluded partially occluded facial image and the image after missing-pixel filling are fed in separately, yielding the l-th layer reconstruction features Ωl and Ω'l in the decoder; the perceptual symmetry loss is defined as:
where Cl denotes the number of channels of the feature Ωl or Ω'l, Φ↓ denotes the output Φ of the optical-flow network downsampled to the size of Ωl or Ω'l, and the downsampled mask denotes the residual occlusion mask corresponding to the partially occluded facial image after asymmetric missing-pixel filling, likewise downsampled to the size of Ωl or Ω'l.
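A rough sketch of this term: a squared difference between the decoder features Ωl and Ω'l, restricted to positions still covered by the downsampled residual occlusion mask. The normalisation used here (1 / (Cl · |mask|)) and the omission of the downsampled flow field Φ↓ are simplifying assumptions; the claim only names the ingredients.

```python
import numpy as np

def perceptual_symmetry_loss(omega, omega_p, mask_ds):
    # omega, omega_p: decoder features of shape (C, H, W);
    # mask_ds: residual occlusion mask of shape (H, W), broadcast
    # over the channel dimension.
    C_l = omega.shape[0]
    diff = (omega - omega_p) ** 2
    masked = diff * mask_ds
    return float(masked.sum() / (C_l * max(mask_ds.sum(), 1.0)))
```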
9. The facial image filling system based on a symmetric-consistency convolutional neural network according to claim 8, wherein the training network further constrains the learning of the symmetric missing-pixel filling subsystem through discrimination losses, specifically:
The training network obtains the discrimination losses through discrimination networks;
The discrimination networks comprise a global discrimination network and a part discrimination network, the two having the same network structure;
The global discrimination network takes the partially occluded facial image as its input and outputs the global discrimination loss;
The part discrimination network takes in turn the left-eye, right-eye, nose and mouth regions of the partially occluded facial image as its input, upsamples each to a uniform size, and sequentially outputs the left-eye, right-eye, nose and mouth discrimination losses;
The global discrimination network comprises convolutional layers E1 to E5;
Convolutional layer E1 successively applies the 25th convolution operation and the 48th activation operation to the partially occluded facial image;
Convolutional layer E2 successively applies the 26th convolution operation, a batch normalization operation and the 49th activation operation to the output of convolutional layer E1;
Convolutional layer E3 successively applies the 27th convolution operation, a batch normalization operation and the 50th activation operation to the output of convolutional layer E2;
Convolutional layer E4 successively applies the 28th convolution operation, a batch normalization operation and the 51st activation operation to the output of convolutional layer E3;
Convolutional layer E5 successively applies the 29th convolution operation and the 52nd activation operation to the output of convolutional layer E4;
The 25th convolution operation is a convolution with 64 kernels of size 4*4 and stride 2;
The 26th convolution operation is a convolution with 128 kernels of size 4*4 and stride 2;
The 27th convolution operation is a convolution with 256 kernels of size 4*4 and stride 2;
The 28th convolution operation is a convolution with 512 kernels of size 4*4 and stride 1;
The 29th convolution operation is a convolution with 1 kernel of size 4*4 and stride 1;
The 48th to 51st activation operations all use the LReLU function, and the 52nd activation operation uses the Sigmoid function;
The cross-entropy loss between the T×T feature map output by convolutional layer E5 and a T×T map of 0s or 1s is the global discrimination loss;
The cross-entropy loss between the T×T feature map output by the part discrimination network and a T×T map of 0s or 1s is the part discrimination loss.
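The T of the final T×T map follows from the standard convolution output-size formula applied through E1 to E5. Neither the input resolution nor the padding is stated in the claim; the 128×128 input and padding of 1 below are assumptions used only to illustrate the arithmetic.

```python
def conv_out(n, k=4, s=2, p=1):
    # Output spatial size of a k×k convolution with stride s, padding p.
    return (n + 2 * p - k) // s + 1

# Global discriminator E1-E5: 4*4 kernels with strides 2, 2, 2, 1, 1.
n = 128  # assumed input resolution
for stride in (2, 2, 2, 1, 1):
    n = conv_out(n, s=stride)
# n is now the T of the final T×T feature map under these assumptions
```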
10. The facial image filling system based on a symmetric-consistency convolutional neural network according to claim 9, wherein the training network performs end-to-end training of the asymmetric missing-pixel filling subsystem and the symmetric missing-pixel filling subsystem using the Adam optimization algorithm.
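For reference, one Adam update step sketched in NumPy. Claim 10 only states that Adam trains both filling subsystems end to end; the hyperparameters below are the common published defaults, not values from the patent.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and squared gradient.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)   # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

In end-to-end training, the reconstruction, perceptual symmetry, illumination-consistency and discrimination losses described in the preceding claims would be combined and back-propagated through both subsystems before each such parameter update.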
CN201811549357.XA 2018-12-18 2018-12-18 Face image filling system based on symmetric consistency convolutional neural network Active CN109684973B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811549357.XA CN109684973B (en) 2018-12-18 2018-12-18 Face image filling system based on symmetric consistency convolutional neural network

Publications (2)

Publication Number Publication Date
CN109684973A true CN109684973A (en) 2019-04-26
CN109684973B CN109684973B (en) 2023-04-07

Family

ID=66186790

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811549357.XA Active CN109684973B (en) 2018-12-18 2018-12-18 Face image filling system based on symmetric consistency convolutional neural network

Country Status (1)

Country Link
CN (1) CN109684973B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106910247A (en) * 2017-03-20 2017-06-30 厦门幻世网络科技有限公司 Method and apparatus for generating three-dimensional head portrait model
CA2987846A1 (en) * 2016-12-07 2018-06-07 Idemia Identity & Security France Image processing system
CN108334816A (en) * 2018-01-15 2018-07-27 桂林电子科技大学 The Pose-varied face recognition method of network is fought based on profile symmetry constraint production
CN108520503A (en) * 2018-04-13 2018-09-11 湘潭大学 A method of based on self-encoding encoder and generating confrontation network restoration face Incomplete image
CN108932693A (en) * 2018-06-15 2018-12-04 中国科学院自动化研究所 Face editor complementing method and device based on face geological information

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JIANKANG DENG等: "UV-GAN: Adversarial Facial UV Map Completion for Pose-Invariant Face Recognition", 《2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 *
SATOSHI IIZUKA等: "Globally and locally consistent image completion", 《ACM TRANSACTIONS ON GRAPHICS》 *
XIE PENGCHENG: "Research and Implementation of Multi-pose Face Recognition", 《China Master's Theses Full-text Database (Information Science and Technology)》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111488811A (en) * 2020-03-31 2020-08-04 长沙千视通智能科技有限公司 Face recognition method and device, terminal equipment and computer readable medium
CN111488811B (en) * 2020-03-31 2023-08-22 长沙千视通智能科技有限公司 Face recognition method, device, terminal equipment and computer readable medium
CN113569598A (en) * 2020-04-29 2021-10-29 华为技术有限公司 Image processing method and image processing apparatus
WO2021218238A1 (en) * 2020-04-29 2021-11-04 华为技术有限公司 Image processing method and image processing apparatus
CN113989846A (en) * 2021-10-29 2022-01-28 北京百度网讯科技有限公司 Method for detecting key points in image and method for training key point detection model
CN114792295A (en) * 2022-06-23 2022-07-26 深圳憨厚科技有限公司 Method, device, equipment and medium for correcting blocked object based on intelligent photo frame
CN114792295B (en) * 2022-06-23 2022-11-04 深圳憨厚科技有限公司 Method, device, equipment and medium for correcting blocked object based on intelligent photo frame

Also Published As

Publication number Publication date
CN109684973B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN109684973A (en) The facial image fill system of convolutional neural networks based on symmetrical consistency
Song et al. Constructing stronger and faster baselines for skeleton-based action recognition
Gao et al. Unified information fusion network for multi-modal RGB-D and RGB-T salient object detection
CN108537754 Face image restoration system based on deformation guidance maps
CN104933755B Static object reconstruction method and system
CN112465955B (en) Dynamic human body three-dimensional reconstruction and visual angle synthesis method
CN108734194B (en) Virtual reality-oriented single-depth-map-based human body joint point identification method
CN112819947A (en) Three-dimensional face reconstruction method and device, electronic equipment and storage medium
CN108932536A Face pose reconstruction method based on deep neural network
CN110427799A Human hand depth image data augmentation method based on generative adversarial network
CN110633628B (en) RGB image scene three-dimensional model reconstruction method based on artificial neural network
CN113033570A (en) Image semantic segmentation method for improving fusion of void volume and multilevel characteristic information
CN112598775B (en) Multi-view generation method based on contrast learning
CN109191366B (en) Multi-view human body image synthesis method and device based on human body posture
CN111914618B (en) Three-dimensional human body posture estimation method based on countermeasure type relative depth constraint network
CN107767357A (en) A kind of depth image super-resolution method based on multi-direction dictionary
CN107194380A Deep convolutional network and learning method for face recognition in complex scenes
CN112861659A (en) Image model training method and device, electronic equipment and storage medium
CN114550308B (en) Human skeleton action recognition method based on space-time diagram
CN116645328A (en) Intelligent detection method for surface defects of high-precision bearing ring
CN113192186B (en) 3D human body posture estimation model establishing method based on single-frame image and application thereof
CN108629781A Hair rendering method
CN114758205A (en) Multi-view feature fusion method and system for 3D human body posture estimation
CN114998520A (en) Three-dimensional interactive hand reconstruction method and system based on implicit expression
CN109658326A (en) A kind of image display method and apparatus, computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant